Open zulissimeta opened 1 month ago
Hi Zach,
Thanks for your quick response. Long time no see! This is Bruno from Noa's group. I would be happy to contribute to the tutorial. I believe that during my internship last year I wrote the data preprocessing script and documentation to convert both QM9 and OE62 data to LMDBs, so I am writing to ask whether those scripts and docs are still available. If so, it would be a lot easier for me to generate the LMDBs, write tutorials, and use the OCP models in further applications.
Thanks, Bruno
Hi, I am just following up on my previous message. Is there any file I can refer to when trying to train a model on a molecular property?
Bruno
Sorry I missed this!
To write an ASE LMDB:
from fairchem.core.datasets.lmdb_database import LMDBDatabase

# atoms_list is an iterable of ase.Atoms objects
with LMDBDatabase('my_dataset.aselmdb') as db:
    for atoms in atoms_list:
        db.write(atoms)
        # optionally db.write(atoms, data=atoms.info) if you want to store info as data
Then refer to this https://fair-chem.github.io/core/ase_dataset_creation.html for training. We should definitely iterate on this!
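For a molecular property such as a HOMO-LUMO gap, the same pattern could be extended roughly as below. This is only a sketch under assumptions: the helper names (build_info, write_molecule_lmdb), the key names in the info dict, and the input tuple layout are all hypothetical, and the import path may differ between fairchem versions. The only fairchem call used is the db.write(atoms, data=atoms.info) form mentioned above.

```python
def build_info(homo_ev, lumo_ev):
    """Build the per-molecule info dict (key names are hypothetical)."""
    return {
        "homo": homo_ev,
        "lumo": lumo_ev,
        "homo_lumo_gap": lumo_ev - homo_ev,
    }


def write_molecule_lmdb(path, molecules):
    """Write molecules to an ASE LMDB, storing the properties as data.

    `molecules` is assumed to be an iterable of
    (symbols, positions, homo_ev, lumo_ev) tuples.
    """
    # Imported here so the helper above can be used without fairchem installed.
    # The module path is an assumption; check it against your fairchem version.
    from ase import Atoms
    from fairchem.core.datasets.lmdb_database import LMDBDatabase

    with LMDBDatabase(path) as db:
        for symbols, positions, homo_ev, lumo_ev in molecules:
            atoms = Atoms(symbols, positions=positions)
            atoms.info.update(build_info(homo_ev, lumo_ev))
            # Store the info dict as data so it is kept with the structure.
            db.write(atoms, data=atoms.info)


info = build_info(-7.0, 1.0)
print(info["homo_lumo_gap"])
```

How the stored key is then mapped to a training target depends on the dataset and output configuration described in the linked docs, so that part is left out here.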
https://github.com/FAIR-Chem/fairchem/issues/787 highlights that our docs have a gap for users who want to train on molecular properties with custom outputs such as HOMO-LUMO gaps.
We should add a simple example to the tutorials, perhaps: