Delaunay closed this 1 year ago
I suggest we close this @breuleux
Indeed, this was part of the original vision, but it is now out of scope. Milatools is the library to install on the user's machine; libraries that should run on the cluster will live in separate projects.
Checkpointing is critical for modern HPO (Hyperband and ASHA resume training of experiments) and for multi-GPU/multi-node resiliency when a worker drops.