dotsdl closed 3 years ago
@ldamore could I get your review on this one? This is meant to address the issue raised by Xavier:
$ openff-benchmark report compare-forcefields --input-path 4-compute-qm --input-path 4-compute-mm --ref-method b3lyp-d3bj --output-directory 5-compare_forcefields
Reading files: 100%|| 2/2 [37:33<00:00, 1126.78s/it]
Checking input: 100%|| 8/8 [00:00<00:00, 528.73it/s]
Checking input: 0%| | 0/8 [00:00<?, ?it/s]/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/openff/benchmark/analysis/analysis.py:194: UserWarning: Not all conformers of method b3lyp-d3bj considered, because these are not available in other methods.
warnings.warn(f"Not all conformers of method {m} considered, because these are not available in other methods.")
Checking input: 12%|████████████████████ | 1/8 [00:00<00:00, 7.21it/s]/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/openff/benchmark/analysis/analysis.py:194: UserWarning: Not all conformers of method openff-1.1.1 considered, because these are not available in other methods.
warnings.warn(f"Not all conformers of method {m} considered, because these are not available in other methods.")
/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/openff/benchmark/analysis/analysis.py:194: UserWarning: Not all conformers of method opls3e_default considered, because these are not available in other methods.
warnings.warn(f"Not all conformers of method {m} considered, because these are not available in other methods.")
/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/openff/benchmark/analysis/analysis.py:194: UserWarning: Not all conformers of method openff-1.0.0 considered, because these are not available in other methods.
warnings.warn(f"Not all conformers of method {m} considered, because these are not available in other methods.")
/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/openff/benchmark/analysis/analysis.py:194: UserWarning: Not all conformers of method smirnoff99Frosst-1.1.0 considered, because these are not available in other methods.
warnings.warn(f"Not all conformers of method {m} considered, because these are not available in other methods.")
Checking input: 100%|| 8/8 [00:00<00:00, 45.88it/s]
Finding reference molecules: 100%|| 817/817 [00:01<00:00, 794.54it/s]
Referencing energies: 100%|| 817/817 [00:01<00:00, 589.27it/s]
Referencing energies: 100%|| 817/817 [00:01<00:00, 637.55it/s]
Calculating RMSD: 4315it [03:08, 22.86it/s]| 745/817 [00:01<00:00, 737.70it/s]
Calculating TFD: 4315it [00:26, 163.48it/s]
Referencing energies: 100%|| 817/817 [00:01<00:00, 643.71it/s]
Calculating RMSD: 4315it [03:09, 22.83it/s]| 746/817 [00:01<00:00, 749.46it/s]
Calculating TFD: 4315it [00:26, 163.21it/s]
Referencing energies: 100%|| 817/817 [00:01<00:00, 640.70it/s]
Calculating RMSD: 4315it [03:12, 22.46it/s]| 746/817 [00:01<00:00, 747.16it/s]
Calculating TFD: 4315it [00:27, 158.00it/s]
Referencing energies: 100%|| 817/817 [00:01<00:00, 626.74it/s]
Calculating RMSD: 4315it [03:10, 22.60it/s]| 723/817 [00:01<00:00, 714.82it/s]
Calculating TFD: 4315it [00:26, 163.63it/s]
Referencing energies: 100%|| 817/817 [00:01<00:00, 636.85it/s]
Calculating RMSD: 751it [00:14, 50.60it/s]| 734/817 [00:01<00:00, 735.95it/s]
Processing data: 50%|| 4/8 [14:49<14:49, 222.34s/it]
Traceback (most recent call last):
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/bin/openff-benchmark", line 8, in <module>
sys.exit(cli())
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/openff/benchmark/cli.py", line 759, in compare_forcefields
analysis.main(input_path, ref_method, output_directory)
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/openff/benchmark/analysis/analysis.py", line 203, in main
calc_rmsd(dataframes[ref_method], dataframes[m])
File "/pstore/apps/.testing/OpenForceField/0.8.4rc1-benchmark/lib/python3.7/site-packages/openff/benchmark/analysis/analysis.py", line 54, in calc_rmsd
result.loc[i, 'rmsd'] = rdMolAlign.GetBestRMS(row['mol'].to_rdkit(), result.loc[i, 'mol'].to_rdkit())
RuntimeError: No sub-structure match found between the reference and probe mol
We don't have a test module in place for the analysis modules yet. I performed an ad-hoc test on this feature by swapping conformers from two different molecules in one of my MM results (openff-1.1.1) from the burn-in set.
I get, as expected:
$ openff-benchmark report compare-forcefields --input-path 4-compute-qm-filtered --input-path 4-compute-mm-filtered --ref-method b3lyp-d3bj --output-directory 5-compare_forcefields
...
Unable to calculate best RMSD between b3lyp-d3bj and openff-1.1.1; conformer `TST-00110-00`
Unable to calculate best RMSD between b3lyp-d3bj and openff-1.1.1; conformer `TST-00095-07`
WARNING: The reference mol TST-00110-00 and query mol TST-00110-00 do NOT have the same SMILES strings as determined by RDKit MolToSmiles.
[H]c1c(OC([H])([H])[H])nc(N([H])C(=O)N([H])S(=O)(=O)C([H])([H])c2c([H])c([H])c([H])c([H])c2C(=O)OC([H])([H])[H])nc1OC([H])([H])[H]
[H]N(c1nc(SC([H])([H])[H])nc(N([H])C([H])(C([H])([H])[H])C([H])([H])[H])n1)C([H])([H])C([H])([H])C([H])([H])[H]
WARNING: The reference mol TST-00095-07 and query mol TST-00095-07 do NOT have the same SMILES strings as determined by RDKit MolToSmiles.
[H]N(c1nc(SC([H])([H])[H])nc(N([H])C([H])(C([H])([H])[H])C([H])([H])[H])n1)C([H])([H])C([H])([H])C([H])([H])[H]
[H]c1c(OC([H])([H])[H])nc(N([H])C(=O)N([H])S(=O)(=O)C([H])([H])c2c([H])c([H])c([H])c([H])c2C(=O)OC([H])([H])[H])nc1OC([H])([H])[H]
And the contents of the output file have NaNs in the expected places:
$ cat 5-compare_forcefields/openff-1.1.1.csv
name,group_name,molecule_index,conformer_index,rmsd,tfd,dde[kcal/mol]
TST-00010-00,TST,00010,00, 3.21458367e-02, 1.03306624e-03, 0.00000000e+00
TST-00110-00,TST,00110,00,,, 0.00000000e+00
TST-00116-00,TST,00116,00, 7.56105972e-02, 2.12679387e-03, 0.00000000e+00
TST-00095-07,TST,00095,07,,,-2.08604037e+01
TST-00222-00,TST,00222,00, 2.45894555e-02, 0.00000000e+00, 0.00000000e+00
TST-00082-00,TST,00082,00, 1.14619196e-01, 8.50385549e-03, 0.00000000e+00
TST-00113-00,TST,00113,00, 2.94241533e-01, 1.09174439e-01, 0.00000000e+00
TST-00035-00,TST,00035,00, 3.25742181e-02, 3.97723503e-05, 0.00000000e+00
TST-00038-00,TST,00038,00, 1.48897098e-02, 0.00000000e+00, 0.00000000e+00
TST-00005-00,TST,00005,00, 8.85880694e-02, 3.45953855e-02, 0.00000000e+00
TST-00095-00,TST,00095,00, 2.79291921e-01, 9.11232357e-02, 0.00000000e+00
TST-00176-00,TST,00176,00, 1.37752953e-02,, 0.00000000e+00
TST-00152-00,TST,00152,00, 3.79610075e-02, 0.00000000e+00, 0.00000000e+00
TST-00093-00,TST,00093,00, 1.85977374e-01, 2.62453482e-02, 0.00000000e+00
TST-00243-00,TST,00243,00, 3.82898359e-02, 1.08678752e-02, 0.00000000e+00
TST-00168-00,TST,00168,00, 9.85315256e-02, 5.60296282e-02, 0.00000000e+00
TST-00267-00,TST,00267,00, 2.46011056e-02, 1.28926188e-05, 0.00000000e+00
TST-00003-00,TST,00003,00, 4.39707837e-01, 3.40765089e-01, 0.00000000e+00
TST-00124-00,TST,00124,00, 3.19099507e-01, 6.22147500e-02, 0.00000000e+00
TST-00031-00,TST,00031,00, 4.74017788e-02, 2.20726841e-03, 0.00000000e+00
TST-00004-00,TST,00004,00, 5.83838077e-01, 4.96838304e-03, 0.00000000e+00
TST-00198-00,TST,00198,00, 1.07191934e-01, 2.27226710e-02, 0.00000000e+00
TST-00260-00,TST,00260,00, 3.75503722e-02, 0.00000000e+00, 0.00000000e+00
TST-00036-00,TST,00036,00, 2.59408557e-02, 1.22605281e-05, 0.00000000e+00
TST-00021-00,TST,00021,00, 2.70723674e-02, 2.68451133e-03, 0.00000000e+00
TST-00242-00,TST,00242,00, 2.42163551e-02, 0.00000000e+00, 0.00000000e+00
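To spot the affected conformers quickly, the empty `rmsd` fields in the CSV above can be listed with a few lines of standard-library Python. This is only a sketch: the helper name `conformers_missing_rmsd` is hypothetical and not part of openff-benchmark, and the data is a four-row excerpt of the file shown above.

```python
import csv
import io

# Excerpt of 5-compare_forcefields/openff-1.1.1.csv, copied from the output above.
excerpt = """\
name,group_name,molecule_index,conformer_index,rmsd,tfd,dde[kcal/mol]
TST-00010-00,TST,00010,00, 3.21458367e-02, 1.03306624e-03, 0.00000000e+00
TST-00110-00,TST,00110,00,,, 0.00000000e+00
TST-00116-00,TST,00116,00, 7.56105972e-02, 2.12679387e-03, 0.00000000e+00
TST-00095-07,TST,00095,07,,,-2.08604037e+01
"""


def conformers_missing_rmsd(handle):
    """Return the names of rows whose rmsd field is empty.

    Hypothetical helper for illustration; an empty field is what most
    CSV readers interpret as NaN.
    """
    reader = csv.DictReader(handle)
    return [row["name"] for row in reader if not row["rmsd"].strip()]


missing = conformers_missing_rmsd(io.StringIO(excerpt))
print(missing)  # ['TST-00110-00', 'TST-00095-07']
```

For the real file, the same function should work with `open("5-compare_forcefields/openff-1.1.1.csv", newline="")` in place of the `io.StringIO` object.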
Description
Added a try-except around the rdMolAlign call; on failure, NaN is injected for that conformer and a message is emitted.
This is in response to an issue Xavier Lucas ran into in which a failure to calculate the RMSD by RDKit causes the whole analysis to fail.
This change makes the analysis tolerant of failures at this point, while giving an informative message indicating on which conformer(s) the failure occurred.
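The pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual change to analysis.py: `best_rmsd_or_nan` and its `compute_rmsd` argument are hypothetical names, with `compute_rmsd` standing in for the `rdMolAlign.GetBestRMS` call seen in the traceback, which raises `RuntimeError` when no sub-structure match is found.

```python
import math


def best_rmsd_or_nan(compute_rmsd, ref_mol, probe_mol, ref_method, method, name):
    """Return the best RMSD, or NaN if the calculation fails.

    `compute_rmsd` is a hypothetical stand-in for a call such as
    rdMolAlign.GetBestRMS(ref_mol, probe_mol).
    """
    try:
        return compute_rmsd(ref_mol, probe_mol)
    except RuntimeError:
        # Report which conformer failed, then let the analysis carry on.
        print(
            f"Unable to calculate best RMSD between {ref_method} "
            f"and {method}; conformer `{name}`"
        )
        return math.nan


def failing_rmsd(ref_mol, probe_mol):
    # Mimics the error shown in the traceback (assumption: RuntimeError).
    raise RuntimeError(
        "No sub-structure match found between the reference and probe mol"
    )


value = best_rmsd_or_nan(
    failing_rmsd, "ref", "probe", "b3lyp-d3bj", "openff-1.1.1", "TST-00110-00"
)
```

Catching only `RuntimeError` keeps unrelated bugs visible, and returning NaN lets the remaining conformers be processed while the failed one stays identifiable in the output.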
Status