interTwin-eu / itwinai

Advanced AI workflows for digital twins applications in science.
https://itwinai.readthedocs.io
MIT License

Resume distributed training from checkpoint #174

Open: matbun opened this issue 1 week ago

matbun commented 1 week ago

Add the ability for the TorchTrainer to resume distributed training from a checkpoint after a crash.
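To make the request concrete, here is a minimal, framework-agnostic sketch of the save/resume pattern this would need. It is not itwinai's API: `save_checkpoint`, `load_checkpoint`, `train`, the `rank` and `crash_at` parameters, and the contents of the `state` dict are all hypothetical names chosen for illustration. It uses `pickle` so it runs with the standard library only. In a real PyTorch trainer the state would presumably hold the model and optimizer `state_dict()`s and be written with `torch.save`.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path, rank=0):
    """Write `state` to `path` atomically; only rank 0 writes (assumption)."""
    if rank != 0:
        return
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # Write to a temporary file first so that a crash mid-write cannot
    # corrupt the previous checkpoint, then rename it into place.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def train(num_epochs, ckpt_path, crash_at=None):
    """Toy training loop that resumes from `ckpt_path` if it exists.

    `crash_at` is a hypothetical knob that simulates a failure at the
    start of the given epoch.
    """
    state = load_checkpoint(ckpt_path) or {"epoch": 0, "weights": 0.0}
    for epoch in range(state["epoch"], num_epochs):
        if crash_at is not None and epoch == crash_at:
            raise RuntimeError(f"simulated crash at epoch {epoch}")
        state["weights"] += 1.0     # stand-in for one epoch of training
        state["epoch"] = epoch + 1  # next epoch to run after a resume
        save_checkpoint(state, ckpt_path)
    return state
```

Calling `train` again with the same `ckpt_path` after a failure picks up at the first epoch that was not checkpointed, instead of restarting from epoch 0. Writing from a single rank and renaming atomically are design choices of this sketch, not requirements stated in this issue.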