Restarting is not the same as a single training run

lab-cosmo / metatrain

Training and evaluating machine learning models for atomistic systems.

https://lab-cosmo.github.io/metatrain/

BSD 3-Clause "New" or "Revised" License


Restarting is not the same as a single training run #283

Open frostedoyster opened 3 days ago

frostedoyster commented 3 days ago

Using SOAP-BPNN, restarting from a checkpoint does not yield exactly the same numbers as a single, longer training run. (Training still behaves well, however, and the numbers make sense.) The epoch saved inside the checkpoint is also wrong for the final checkpoint (but correct for the others).
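For illustration only, here is a minimal sketch of the property being asked for. It is a toy pure-Python loop, not metatrain or SOAP-BPNN code, and every name in it (`train`, `velocity`, the checkpoint dict layout) is hypothetical. It assumes that a restart can only reproduce a single run bit-for-bit if the checkpoint carries the weights, the optimizer state, the RNG state, and the number of completed epochs. Whether any of these is the actual cause here has not been established.

```python
import random


def train(n_epochs, state=None, seed=0):
    """Toy one-parameter SGD-with-momentum loop (hypothetical, not metatrain).

    Returns a checkpoint dict holding everything needed to resume exactly.
    """
    if state is None:
        state = {
            "w": 0.0,                              # model weight
            "velocity": 0.0,                       # optimizer (momentum) state
            "epoch": 0,                            # number of completed epochs
            "rng": random.Random(seed).getstate(), # shuffling RNG state
        }
    rng = random.Random()
    rng.setstate(state["rng"])
    w, v, epoch = state["w"], state["velocity"], state["epoch"]

    data = [(x, 3.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]  # target: w = 3
    for _ in range(n_epochs):
        order = data[:]
        rng.shuffle(order)                 # data order depends on RNG state
        for x, y in order:
            grad = 2.0 * (w * x - y) * x   # d/dw of (w*x - y)^2
            v = 0.9 * v - 0.01 * grad      # momentum update
            w += v
        epoch += 1                         # count the epoch only once it is done
    return {"w": w, "velocity": v, "epoch": epoch, "rng": rng.getstate()}


single = train(10)                 # one uninterrupted run of 10 epochs
half = train(5)                    # first 5 epochs, then "checkpoint"
resumed = train(5, state=half)     # restart with the full saved state

# Same restart, but with optimizer and RNG state dropped from the checkpoint.
partial = dict(half, velocity=0.0, rng=random.Random(123).getstate())
lossy = train(5, state=partial)

print(resumed["w"] == single["w"], resumed["epoch"] == single["epoch"])
print(lossy["w"] == single["w"])
```

In this toy setting the fully restored run matches the single run exactly, while the run that loses the momentum and RNG state still trains sensibly but ends on different numbers, which is the same symptom described above. The `epoch` field is incremented only after an epoch completes, so the final checkpoint records the same convention as the intermediate ones.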