MolSSI / QCElemental

Periodic table, physical constants, and molecule parsing for quantum chemistry.
https://molssi.github.io/QCElemental/
BSD 3-Clause "New" or "Revised" License
143 stars 70 forks source link

Setup qcel.models.Molecule with non-contiguous fragments #298

Open awvwgk opened 1 year ago

awvwgk commented 1 year ago

Trying the following example molecule from S22, I notice that there is something odd happening for molecules with fragment information:

import qcelemental as qcel

data = {
    "symbols": [
        "N", "N", "H", "H", "H", "H", "H", "H"
    ],
    "geometry": [
        -2.9833455085754399E+0, -8.8082052767279997E-2,  0.0000000000000000E+0,
         2.9833455085754399E+0,  8.8082052767279997E-2,  0.0000000000000000E+0,
        -4.0792036056518599E+0,  2.5775116682053001E-1,  1.5298565626144400E+0,
        -1.6052680015564000E+0,  1.2438048124313399E+0,  0.0000000000000000E+0,
        -4.0792036056518599E+0,  2.5775116682053001E-1, -1.5298565626144400E+0,
         4.0792036056518599E+0, -2.5775116682053001E-1, -1.5298565626144400E+0,
         1.6052680015564000E+0, -1.2438048124313399E+0,  0.0000000000000000E+0,
         4.0792036056518599E+0, -2.5775116682053001E-1,  1.5298565626144400E+0
    ],
    "fragments": [
        [0, 2, 3, 4],
        [1, 5, 6, 7],
    ],
}

for validate in (True, False):
    try:
        mol = qcel.models.Molecule(**data, validate=validate)
    except qcel.exceptions.ValidationError as e:
        print(f"Fails to setup with validate={validate}\n{str(e)}")

Output from the run is an unexpected validation error when validating the input

Warning: QCElemental is reordering atoms to accommodate non-contiguous fragments
Fails to setup with validate=True
Error: QCElemental would need to reorder atoms to accommodate non-contiguous fragments

Why are non-contiguous fragments a validation error in qcel? The issue is probably that throw_reorder is always true at:

https://github.com/MolSSI/QCElemental/blob/292350f4cc06ca274a15b5e7101e56f25b0093a9/qcelemental/molparse/from_schema.py#L57

while the default is false at

https://github.com/MolSSI/QCElemental/blob/292350f4cc06ca274a15b5e7101e56f25b0093a9/qcelemental/molparse/from_schema.py#L103

but there seems to be no way to toggle throw_reorder other than skipping the validation step.

loriab commented 1 year ago

Ah yes, in the beginning it was requested that fragments be allowed to be noncontiguous. So the good news is that I wrote the hard part (molparse validator) to accommodate. Then we ran into there being zero downstream programs that could actually work with noncontig. I can probably expose throw_reorder to the pydantic model, but results should be checked carefully at first. So you've got a downstream that can use noncontig? I can see how dispersion on fragments would be a good case, esp. as geometry is fixed.

awvwgk commented 1 year ago

Thanks for the context, yes I tried testing non-contiguous fragments for a workflow which can work with those robustly, but directly hit a hard-stop. Now I work around by handling the fragment information outside of the qcel models because qcel is still more convenient than rolling my own.

loriab commented 1 year ago

Ok, I looked a bit more, and it's the molparse internal representation (aka psi's) that works in fragment separators (requires contiguous) rather than fragment lists. Thus, molparse always contiguizes, and throw_reorder merely toggles between let-it-reorder and stop-in-tracks.

So, an allow_noncontiguous_fragments_but_reorder=True is not too hard (just lifts the throw), but it's not keeping with good qcschema practice (changing up the input for convenience), and it's probably not what your workflow wants, is it?. A proper handling of noncontig would need more extensive renovation.

Is this xtb that can work with non-contig or a workflow w/o a program in particular? For now, your workaround is probably best.

awvwgk commented 1 year ago

I could live with qcel reordering the atoms, just leaving them in their order would be preferred, of course.

Generally, the reordering seems like something psi4 should apply as needed, similar as we handle ghost atoms in DFT-D4, qcel / qcng could provide a convenience function for permuting atoms (together with all associated properties).

loriab commented 1 year ago

There's a PR up @awvwgk , just in case you didn't get pinged.