determined-ai / determined

Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
https://determined.ai
Apache License 2.0

💡[feat] local cluster to use offline docker images #9554

Open KyanChen opened 2 months ago

KyanChen commented 2 months ago

Describe the problem

How can I use Docker images offline in a local cluster? I know that a Slurm-based cluster can use cached Docker images.

Describe the solution you'd like

Offer a way to use locally cached Docker images.

ioga commented 2 months ago

Hello, if an image is already present on the agent, Determined should use it automatically without re-downloading it, just as a normal docker run invocation would.
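
To confirm that an image really is present on an agent before going offline, one option is to ask the Docker CLI directly. The sketch below is only an illustration and not part of Determined: it assumes the docker CLI is installed on the agent host and relies on docker image inspect exiting with a non-zero status when the image is not cached. The helper names (run_quiet, image_is_cached, missing_images) are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes a command (list of arguments) and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run_quiet(cmd: Sequence[str]) -> int:
    """Run a command, discard its output, and return the exit code."""
    try:
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        return result.returncode
    except FileNotFoundError:
        # The docker CLI is not installed or not on PATH.
        return 127


def image_is_cached(image: str, runner: Runner = run_quiet) -> bool:
    """Return True if `docker image inspect` finds the image locally."""
    return runner(["docker", "image", "inspect", image]) == 0


def missing_images(images: Sequence[str], runner: Runner = run_quiet) -> List[str]:
    """Return the images from `images` that are not in the local cache."""
    return [img for img in images if not image_is_cached(img, runner)]
```

Calling missing_images with the image names your experiments use returns the ones that still need to be copied to the agent. Those can be exported on a connected machine with docker save -o and imported on the offline agent with docker load -i.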