kube-HPC / hkube

🐟 High Performance Computing over Kubernetes - Core Repo 🎣
http://hkube.io
MIT License

Usability and major features #973

brainoom closed 2 years ago

brainoom commented 4 years ago
  1. Python does not have a single standard install method. The build should accept a user-defined list of steps. The default can be pip install -r requirements.txt or pip install . but it must be configurable.

  2. Keep run artifacts. Data created during a run and saved to a specified folder (ideally with the folder name set by the user) should be persisted.

    The artifacts should be accessible while the run is in progress: at least downloadable, and ideally with the file names visible. If you plan a "data" feature, they should be saved as "data" for use in other algorithms.

  3. Data feature - the basic version should let the user upload a folder and mount it at a user-requested location. The folder can then be "attached" as input to an algorithm. The run UI should clearly indicate the input folders. In addition to input folders, it should be possible to give input URLs: a list of URLs that the algorithm treats as input and that is passed in the input parameters. This can be done currently, but a UI that is aware of it and keeps these inputs the same way as the folders would be nice.

    An uploaded folder should have some searchable metadata. The metadata should include at least: free tags, predefined categories (org, group, etc.), acquisition time, upload time, and a free-text description.

  4. Comment on run - the user should be able to add free text to each algorithm run, such as "this run was garbage, because I set alpha to 0.1 instead of 0.001".

    Better - also allow user-defined tags on the run, such as "deployed" or "only cars". Best - make runs searchable by these comments and tags.

  5. There should be an option to run on my machine, or on any other machine for that matter, and still record the run in hkube.

    The run should be logged in hkube as if it had run on a machine managed by hkube. For example, there could be an option like hkubectl exec --local . to run on the local machine.

    While running, the output folder and tensorboard folder would be watched and uploaded, along with the logs.

    This is very important for developing a single algorithm on a local machine first and then integrating it into a pipeline. This way I can use any compute I have (e.g. a large compute cluster that is not part of hkube) for training, record all experiments, and then run the same algorithm on hkube for inference.

  6. Azure DevOps integration as a git provider.
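
As an illustration of item 1, here is a minimal sketch of how configurable build steps with a default could be resolved. The `buildSteps` key and the `resolve_build_steps` helper are hypothetical names used only for this example; they are not existing hkube fields or functions.

```python
import shlex

# Default taken from the suggestion in item 1; everything else here is assumed.
DEFAULT_BUILD_STEPS = ["pip install -r requirements.txt"]


def resolve_build_steps(algorithm_spec):
    """Return the build commands for an algorithm as argv lists.

    `algorithm_spec` is a plain dict. The `buildSteps` key is a hypothetical
    name for the user-supplied list of shell-style commands.
    """
    # Fall back to the default when the key is missing or the list is empty.
    steps = algorithm_spec.get("buildSteps") or DEFAULT_BUILD_STEPS
    # Split each command into arguments so it can be run without a shell.
    return [shlex.split(step) for step in steps]


if __name__ == "__main__":
    # No steps configured: the default is used.
    print(resolve_build_steps({}))
    # Steps configured by the user replace the default.
    print(resolve_build_steps({"buildSteps": ["pip install ."]}))
```

With this shape, a build runner would only need to execute each returned command in order, so a project that installs with pip install . or with several custom steps needs no special handling.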

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.