kube-HPC / hkube

🐟 High Performance Computing over Kubernetes - Core Repo 🎣
http://hkube.io
MIT License

Allow saving/loading job data to hkube storage through direct api #1296

Closed mattanf2 closed 1 year ago

mattanf2 commented 3 years ago

I need an API, as part of the instance that an algorithm receives, that allows saving and loading information to hkube storage. The information should be:

  1. Accessible only within the job that created it
  2. Deleted once the job information is deleted
  3. Saved and accessed through a small string key
  4. Able to support large binary data
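
To make the four requirements concrete, here is a minimal sketch of what such an interface could look like. It is an in-memory stand-in, not the hkube API: the class name `JobScopedStore` and the methods `save`, `load`, and `delete_job` are hypothetical names chosen for illustration.

```python
from typing import Dict, Tuple


class JobScopedStore:
    """Hypothetical in-memory stand-in for job-scoped storage (not the hkube API)."""

    def __init__(self) -> None:
        # Blobs are keyed by (job_id, key), so one job cannot read another job's data.
        self._blobs: Dict[Tuple[str, str], bytes] = {}

    def save(self, job_id: str, key: str, data: bytes) -> None:
        # Store arbitrary binary data under a small string key.
        self._blobs[(job_id, key)] = bytes(data)

    def load(self, job_id: str, key: str) -> bytes:
        # Raises KeyError if the key does not exist within this job.
        return self._blobs[(job_id, key)]

    def delete_job(self, job_id: str) -> None:
        # Assumed to be called when the job's own data is removed;
        # drops every blob that belongs to that job.
        for blob_key in [k for k in self._blobs if k[0] == job_id]:
            del self._blobs[blob_key]
```

In a real deployment the dictionary would be replaced by the storage backend, and `delete_job` would be triggered by whatever removes the rest of the job data.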

The Story:

I have a case where I have 3 algorithms: A, B, and C. A generates information from images, called features. B gathers all the information from A. C runs on pairs of image features, based on B. Algorithm A works on lots of images and generates a lot of data per image (several megabytes per image). If I were to send all the information C requires from A through B, B would crash for lack of memory. I need a way to save the information that A generates to storage. Saving the data should generate some URI that can be sent forward to B. Then B sends only the URI to C, and C loads the information through the given URI.
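
The flow described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `jobdata://` URI scheme, the module-level `STORE` dictionary standing in for job storage, and the placeholder logic inside each algorithm. The point is only that B handles short URI strings while the large feature blobs go straight from A to storage and from storage to C.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical stand-in for job storage: maps a URI to the stored bytes.
STORE: Dict[str, bytes] = {}


def algorithm_a(job_id: str, image_name: str, image: bytes) -> str:
    """Extract features from one image, save them, and return only a URI."""
    features = image[::-1]  # placeholder for real feature extraction
    uri = f"jobdata://{job_id}/features/{image_name}"  # assumed URI scheme
    STORE[uri] = features
    return uri


def algorithm_b(uris: List[str]) -> List[Tuple[str, str]]:
    """Gather the URIs from A and pair them up; no feature data is loaded here."""
    return list(combinations(uris, 2))


def algorithm_c(pair: Tuple[str, str]) -> int:
    """Load both feature blobs by URI and process the pair."""
    left, right = STORE[pair[0]], STORE[pair[1]]
    return len(left) + len(right)  # placeholder for the real pairwise computation
```

With three images, A would produce three URIs, B would produce three URI pairs, and each C instance would load just the two blobs it needs.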

Up till now, this can all be done by saving the data to an external disk/S3/Redis. However, for both debugging and ease of deployment, I need that information to disappear when, and only when, the rest of the job data is deleted. This means it would be better if the information were saved to hkube storage along with all other job data.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.