openforcefield / openff-bespokefit

Automated tools for the generation of bespoke SMIRNOFF format parameters for individual molecules.
https://docs.openforcefield.org/bespokefit
MIT License

BaseRecord.extras is optional #369

Open ntBre opened 5 days ago

ntBre commented 5 days ago

I'm running into an issue where the code linked below crashes with TypeError: argument of type 'NoneType' is not iterable. Looking at the qcportal.record_models.BaseRecord definition, the extras field is optional, so it can be None.

https://github.com/openforcefield/openff-bespokefit/blob/c9f9054b72ff967f845e40167f5904780e674f5f/openff/bespokefit/optimizers/forcebalance/factories.py#L121

The record causing the issue is a torsion drive record (id 119466792) from my OpenFF Torsion Coverage Supplement v1.0 dataset:

from qcportal import PortalClient
client = PortalClient("https://api.qcarchive.molssi.org:443/")
td = client.get_torsiondrives(119466792)
td.id # => 119466792
td.extras is None # => True
client.query_dataset_records(119466792)[0]['dataset_name'] # => 'OpenFF Torsion Coverage Supplement v1.0'

I think an additional condition needs to be added here like if qc_record.extras and "id" in qc_record.extras. Again, according to the definition I linked above, the id field should also always be present on these records, so this could possibly be simplified to access qc_record.id directly, but that seems like a more ambitious change.
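A minimal sketch of the guard suggested above, assuming only that extras may be None. FakeRecord and extras_has_id are hypothetical names used for illustration; they are not part of qcportal or bespokefit, and the real fix would be applied to the condition in factories.py instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeRecord:
    """Hypothetical stand-in for a qcportal record (not the real class)."""

    id: int
    extras: Optional[dict] = None


def extras_has_id(qc_record) -> bool:
    """Return True only when extras is a non-empty mapping containing "id"."""
    # extras may be None, so test truthiness before the membership check.
    return bool(qc_record.extras) and "id" in qc_record.extras


record = FakeRecord(id=119466792)  # extras defaults to None, as in the report

# The unguarded membership test fails the same way as the reported crash.
try:
    "id" in record.extras
    unguarded_failed = False
except TypeError:
    unguarded_failed = True

# The guarded check simply reports that no "id" key is available.
guarded_result = extras_has_id(record)
```

The guard also treats an empty extras dict as having no id, which matches the behaviour of the plain "id" in qc_record.extras check for that case.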

I'm also not entirely sure why this has only popped up now. I've been running fits in this environment:

$ mamba list 'qcsubmit|bespokefit|qcportal'
# Name                    Version                   Build  Channel
openff-bespokefit         0.2.3+62.g3dde8b5          pypi_0    pypi
openff-qcsubmit           0.52.0             pyhd8ed1ab_0    conda-forge
qcportal                  0.56               pyhd8ed1ab_0    conda-forge

and this error happened in this fresh environment:

$ mamba list 'qcsubmit|bespokefit|qcportal'
# Name                    Version                   Build  Channel
openff-bespokefit         0.4.0              pyhd8ed1ab_0    conda-forge
openff-qcsubmit           0.53.0             pyhd8ed1ab_1    conda-forge
qcportal                  0.54.1             pyhd8ed1ab_0    conda-forge

Obviously I've been using quite an old bespokefit version, but this code hasn't been changed for 3 years, since 0.1.0. I didn't check the blame for the whole call stack, though, so some kind of filter or try/except could have been changed higher up. This record is from a fairly new dataset (~February 2024), but I've still used it in previous fits with my old environment.

ntBre commented 5 days ago

After patching the first instance of the error locally, it also pops up on line 279 in the same file.