FAIR-Chem / fairchem

FAIR Chemistry's library of machine learning methods for chemistry
https://opencatalystproject.org/

Finetuning OC20 model with raw DFT energies or adsorption energies? #786

Closed ShaunHan closed 1 month ago

ShaunHan commented 3 months ago

Hi. I know that all OC20 models predict the adsorption energy of a catalyst system. I now have some raw DFT energies and forces, and I want to fine-tune the EquiformerV2 model. Should I use the raw DFT energies directly for training, or should I convert them to adsorption energies first? Thanks in advance.

zulissimeta commented 2 months ago

Hi @ShaunHan - you have a couple choices, both of which I would expect to work ok:

  1. You could write a dataset with total (raw) DFT energies, and fine-tune a checkpoint trained on total energies
    • This gives the most general and most useful model afterwards, but note that you should not just check the MAE on total energies; also check how well your model works for what you actually care about.
  2. You could reference your energies as adsorption energies, and fine-tune a checkpoint on adsorption energies
    • Note that if you have small numerical errors that might cancel (e.g. k-point convergence errors), this scheme usually works better, as the adsorption energies are better defined than raw energies.
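
To make the two options concrete, here is a minimal sketch of referencing total energies to adsorption energies and of checking a total-energy model on the quantity of interest. It assumes the usual definition E_ads = E(slab + adsorbate) - E(slab) - E(gas-phase adsorbate reference). The function names and all numbers are hypothetical and are not part of fairchem; the gas-phase reference you use should match the scheme of the checkpoint you fine-tune.

```python
def adsorption_energy(e_system, e_slab, e_gas_ref):
    """Reference a total DFT energy to an adsorption energy (all in eV).

    e_system:  total energy of the relaxed slab + adsorbate
    e_slab:    total energy of the clean slab
    e_gas_ref: gas-phase reference energy of the adsorbate (assumed scheme)
    """
    return e_system - e_slab - e_gas_ref


def mean_absolute_error(predicted, reference):
    """Plain MAE over two equal-length sequences of floats."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


# Hypothetical DFT total energies (eV) for one adsorbate/slab pair.
dft_system, dft_slab, gas_ref = -310.5, -300.0, -10.0

# Hypothetical predictions from a total-energy model that is off by the
# same 0.2 eV on both the adsorbed system and the clean slab.
pred_system, pred_slab = -310.3, -299.8

# MAE on total energies looks poor ...
total_mae = mean_absolute_error([pred_system, pred_slab], [dft_system, dft_slab])

# ... but the shared offset cancels in the adsorption energy.
dft_ads = adsorption_energy(dft_system, dft_slab, gas_ref)
pred_ads = adsorption_energy(pred_system, pred_slab, gas_ref)
ads_error = abs(pred_ads - dft_ads)

print(f"total-energy MAE:        {total_mae:.3f} eV")
print(f"adsorption-energy error: {ads_error:.3f} eV")
```

In this toy case the total-energy MAE and the adsorption-energy error differ substantially, which is why the metric should be computed on whichever quantity you plan to use downstream.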

ShaunHan commented 2 months ago

Thanks @zulissimeta. My question is then: how can I tell the code whether my data are raw energies or adsorption energies? Is there a keyword for this when fine-tuning OC20 models?

github-actions[bot] commented 1 month ago

This issue has been marked as stale because it has been open for 30 days with no activity.