torchmd / torchmd-net

Training neural network potentials
MIT License

Add check_errors option #253

Closed by RaulPPelaez 10 months ago

RaulPPelaez commented 10 months ago

Adds two new options:

Before this PR, the default values for these options were tailored for inference (CUDA graphs), but we have seen that they can hurt performance and increase memory usage during training. I am therefore changing the defaults to maximize training performance. Advanced users can tweak them to squeeze out extra performance during inference. cc @stefdoerr @guillemsimeon

guillemsimeon commented 10 months ago

Thanks for this, Raul!