openmm / spice-dataset

A collection of QM data for training potential functions
MIT License

Steps to reproduce D3 contribution for each structure #113

Closed YutackPark closed 3 weeks ago

YutackPark commented 3 weeks ago

I want to subtract the D3 contribution from each structure in order to train machine learning potentials.

For this purpose, could you share the D3-related parameters used in the SPICE dataset?

This would be very helpful for users interested in training interatomic potentials with the SPICE dataset. Although potentials can be trained directly on the D3-corrected results, it is more reliable to train on data without the D3 contribution and add D3 explicitly at model inference.

peastman commented 3 weeks ago

The original datasets on QCArchive contain a lot more fields than what we included in the HDF5 file. If you want others, you can run the downloader script yourself, first editing the config file to tell it what fields to store. See the Psi4 documentation for descriptions of the fields. The ones you're probably interested in are 'dft functional total energy' (the energy without the dispersion correction) and 'dispersion correction energy'.

Be aware that downloading from QCArchive is very slow! Expect it to take about a day to download the whole dataset.

peastman commented 3 weeks ago

Also 'dispersion correction gradient' if you want the forces.
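
For readers following along, here is a minimal sketch of the subtraction itself. It assumes the total energy, total gradient, 'dispersion correction energy', and 'dispersion correction gradient' have already been downloaded and loaded as NumPy arrays. The function name, argument names, and array shapes are hypothetical and are not part of the SPICE scripts. It also assumes the total energy is simply the functional energy plus the dispersion correction, so if 'dft functional total energy' was stored, that value can be used directly for the energy.

```python
import numpy as np

def remove_dispersion(total_energy, total_gradient, disp_energy, disp_gradient):
    """Subtract the dispersion correction from energies and gradients.

    Hypothetical helper, not part of the SPICE downloader.
    Assumed shapes: energies (n_conf,), gradients (n_conf, n_atoms, 3).
    All inputs must share the same units.
    """
    total_energy = np.asarray(total_energy, dtype=float)
    total_gradient = np.asarray(total_gradient, dtype=float)
    disp_energy = np.asarray(disp_energy, dtype=float)
    disp_gradient = np.asarray(disp_gradient, dtype=float)

    if total_energy.shape != disp_energy.shape:
        raise ValueError("energy arrays must have the same shape")
    if total_gradient.shape != disp_gradient.shape:
        raise ValueError("gradient arrays must have the same shape")

    # Energy without the dispersion correction.
    energy = total_energy - disp_energy
    # Forces are the negative gradient, here without the dispersion part.
    forces = -(total_gradient - disp_gradient)
    return energy, forces

# Toy numbers for illustration only (not taken from SPICE).
energy, forces = remove_dispersion(
    total_energy=[-1.0, -2.0],
    total_gradient=np.ones((2, 3, 3)),
    disp_energy=[-0.25, -0.5],
    disp_gradient=0.25 * np.ones((2, 3, 3)),
)
```

The same dispersion term can then be added back at inference time by whatever D3 implementation the model is paired with.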

YutackPark commented 3 weeks ago

Thank you for the detailed instructions. It works perfectly!