ACEsuit / mace

MACE - Fast and accurate machine learning interatomic potentials with higher order equivariant message passing.

better default for device used to _save_ model in run_train #635

Open bernstei opened 2 weeks ago

bernstei commented 2 weeks ago

Right now mace_run_train saves gpu-format models by default, unless --save_cpu is explicitly passed. However, those cannot be converted to cpu format without a gpu machine, whereas converting from cpu to gpu format is always possible. I think it would be more helpful to save in cpu format by default.

One option would be to default to saving a cpu model and replace the flag with --save_gpu. Another, more versatile option might be to do something like

add_argument("--save_device", choices=["cpu", "gpu"], default="cpu")

wcwitt commented 2 weeks ago

Related: https://github.com/ACEsuit/mace/pull/130.

ilyes319 commented 2 weeks ago

Somehow I never merged it because I was scared it would crash everyone's training. I think Noam's way is better. I will leave --save_cpu in place for now, just deprecate it, and add Noam's way as the default.
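
For illustration only, here is a rough sketch of how keeping a deprecated --save_cpu next to a new --save_device option might look with plain argparse. It is not the actual mace_run_train code: `build_parser` and `resolve_save_device` are hypothetical names, the assumption that --save_cpu is a simple store_true flag is mine, and letting the old flag win when both are given is just one possible choice.

```python
import argparse
import warnings


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real parser; only the two flags
    # discussed in this issue are included.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--save_device",
        choices=["cpu", "gpu"],
        default="cpu",
        help="device format used when saving the model",
    )
    # Assumption: the existing flag is a plain boolean switch.
    parser.add_argument(
        "--save_cpu",
        action="store_true",
        help="deprecated, use --save_device cpu instead",
    )
    return parser


def resolve_save_device(args: argparse.Namespace) -> str:
    # Hypothetical helper: the deprecated flag still works, but warns.
    # In this sketch --save_cpu takes precedence over --save_device.
    if args.save_cpu:
        warnings.warn(
            "--save_cpu is deprecated; use --save_device cpu",
            FutureWarning,
            stacklevel=2,
        )
        return "cpu"
    return args.save_device
```

With this sketch, `resolve_save_device(build_parser().parse_args([]))` gives "cpu", passing `--save_device gpu` gives "gpu", and passing `--save_cpu` still gives "cpu" but emits a FutureWarning. The "gpu" choice would still have to be mapped to whatever device string the training code actually uses when saving.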