openforcefield / openff-benchmark

Comparison benchmarks between public force fields and Open Force Field Initiative force fields
MIT License

Revision pathway #15

Open dotsdl opened 3 years ago

dotsdl commented 3 years ago

We need a revision pathway for datasets, i.e. a way to add or remove molecules after a dataset has been fully processed by the pipeline. We need to come up with scenarios and possible ways revisions could work in this pipeline.

We are looking for ideas here, so please share your thoughts on the ways in which revisions may be desired, and how we might accommodate them.

dotsdl commented 3 years ago

I'll shepherd this forward, but I'm looking for ideas from others.

davidlmobley commented 3 years ago

What are the use cases you have in mind? Issues of the "we ran the wrong molecule/wrong stereochemistry" type?

Sometimes we might also need to revise a dataset to get more data, e.g. to re-run the calculations with a different basis set/level of theory, to store wavefunction info, or to incorporate new features/properties we can compute for QCArchive.

Could we just have datasets be versioned? Is there a way to add metadata describing what's in this version/what's different in this version?
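
To make the versioning idea concrete, here is a minimal sketch of what a versioned dataset with per-version metadata might look like. Everything in it is an assumption for illustration: `DatasetVersion`, `VersionedDataset`, `revise`, the field names, and the "theory-A"/"theory-B" labels are hypothetical and are not part of openff-benchmark or QCArchive. Each revision is stored as an immutable record that carries a changelog message plus the molecules added and removed relative to the previous version.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class DatasetVersion:
    """One immutable revision of a dataset (hypothetical structure)."""
    version: int
    molecules: FrozenSet[str]      # molecule identifiers, e.g. SMILES strings
    level_of_theory: str           # free-form label for this illustration
    changelog: str                 # what is different in this version
    added: FrozenSet[str] = frozenset()
    removed: FrozenSet[str] = frozenset()


@dataclass
class VersionedDataset:
    """Append-only history of dataset revisions (hypothetical structure)."""
    name: str
    versions: List[DatasetVersion] = field(default_factory=list)

    @property
    def latest(self) -> Optional[DatasetVersion]:
        return self.versions[-1] if self.versions else None

    def revise(
        self,
        changelog: str,
        add: Iterable[str] = (),
        remove: Iterable[str] = (),
        level_of_theory: Optional[str] = None,
    ) -> DatasetVersion:
        prev = self.latest
        prev_mols = prev.molecules if prev else frozenset()
        # Record only real changes: skip molecules that are already present
        # and removals of molecules that were never in the dataset.
        added = frozenset(add) - prev_mols
        removed = frozenset(remove) & prev_mols
        if level_of_theory is None:
            if prev is None:
                raise ValueError("the first version must specify a level of theory")
            level_of_theory = prev.level_of_theory
        new = DatasetVersion(
            version=prev.version + 1 if prev else 1,
            molecules=(prev_mols | added) - removed,
            level_of_theory=level_of_theory,
            changelog=changelog,
            added=added,
            removed=removed,
        )
        self.versions.append(new)
        return new


# Usage: the two revision scenarios mentioned in this thread.
ds = VersionedDataset("example-set")
ds.revise("initial submission", add=["CCO", "C[C@H](N)O"], level_of_theory="theory-A")
ds.revise("fix wrong stereochemistry", add=["C[C@@H](N)O"], remove=["C[C@H](N)O"])
ds.revise("re-run at a different level of theory", level_of_theory="theory-B")

for v in ds.versions:
    print(v.version, v.changelog, sorted(v.added), sorted(v.removed))
```

Because older versions are never mutated, results already computed against version 1 stay attributable to exactly that molecule set, and the `added`/`removed` fields tell a pipeline which molecules would need new calculations after a revision.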