openforcefield / yammbs

Internal tool for benchmarking force fields
MIT License
3 stars 1 forks source link

QM and MM conformers are returned as different quantities #67

Open amcisaac opened 4 days ago

amcisaac commented 4 days ago

Problem store.get_qm_conformer_by_qcarchive_id() returns a Quantity object, whereas store.get_mm_conformer_by_qcarchive_id() returns a np.array. This makes it hard to compare them as the user must manually convert them to the same format.

Preferred behavior The MM conformer is returned as a Quantity. The QM conformer could also be returned as an array, but the openff.Molecule object expects a Quantity, so I think that's the easiest.

mattwthompson commented 4 days ago

Whoops - that shouldn't happen!

mattwthompson commented 3 days ago

This is legit, but I spent a good chunk of time looking for the source and I'm having a hard time hunting it down. Please work around this for the time being, sorry.

ntBre commented 3 days ago

@amcisaac Just to make sure, did you load the dataset initially from a cached input file? I think I fixed this in #69 but wanted to double-check that it didn't arise somewhere else. Unfortunately this fix won't help with existing databases, only new ones loaded from a cache anyway, so a workaround will still be needed like Matt said.

I could probably cook up some SQL code to fix this in an existing database if you really wanted, but I'm guessing something like isinstance(coords, Quantity) might be good enough for now?

amcisaac commented 3 days ago

Yes I did, I calculated the store from a cached dataset.

It looks like #69 converts the QM conformer to an array instead of the MM one to a Quantity. I think ideally, both the QM and MM conformers would be Quantity, since to add the conformer to a openff.Molecule, it needs to be a Quantity. But it's definitely better to have them both as an array then have them be different

ntBre commented 3 days ago

Yeah that's right. This is kind of an immediate fix to get them consistent inside the database, and possibly a future change could convert them both to quantities on the way out of the database. I could even see something like MoleculeStore.get_mm_molecule_by_qcarchive_id to return an openff.toolkit.Molecule if this is a common operation.

amcisaac commented 3 days ago

I often want a molecule to come out, that could be a nice additional feature. And yeah you're right, either way the conversion shouldn't be happening in the caching process, but on the way out.