openforcefield / openff-qcsubmit

Automated tools for submitting molecules to QCFractal
https://openff-qcsubmit.readthedocs.io/en/latest/index.html
MIT License

Tests choking on public QCA `Dataset` "OpenFF BCC Refit Study COH v1.0" #188

Closed. dotsdl closed this issue 2 years ago

dotsdl commented 2 years ago

It looks like OpenFF BCC Refit Study COH v1.0 originally had 430 calculations, with 429 complete. However, it now appears to have far more, many of which don't include any computed results at all:

from qcportal import FractalClient

# Connect to the public QCArchive server (the default when no address is given)
fc = FractalClient()
ds = fc.get_collection('Dataset', 'OpenFF BCC Refit Study COH v1.0')

# Fetch the records for this one compute specification
recs = ds.get_records(
    method='pw6b95',
    basis='aug-cc-pv(d+d)z',
    program='psi4',
    keywords='resp-2-vacuum',
)

recs

gives

                                                             record
index                                                              
C1CCCC1-0            ResultRecord(id='32651764', status='COMPLETE')
CC(=O)Oc1ccccc1-0    ResultRecord(id='32651791', status='COMPLETE')
CC(=O)c1ccccc1-0     ResultRecord(id='32651733', status='COMPLETE')
Cc1ccccc1C(=O)C-0    ResultRecord(id='32651759', status='COMPLETE')
CC(=O)c1ccccc1O-0    ResultRecord(id='32651818', status='COMPLETE')
...                                                             ...
C(C#N)SC1=NN=C(N1)N                                             NaN
CCCCSC1=NN=C(S1)N-0                                             NaN
CCCCSC1=NN=C(S1)N-1                                             NaN
CCCCSC1=NN=C(S1)N-2                                             NaN
CCCC1CCCNC1                                                     NaN

[43740 rows x 1 columns]

This is causing this test to fail.
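To put a number on how many entries have no record, something like the following could help. It is only a minimal sketch: `is_missing` and `split_entries` are hypothetical helper names, not part of qcportal or openff-qcsubmit, and the demo data is a hand-made stand-in with placeholder strings instead of real `ResultRecord` objects.

```python
import math


def is_missing(record) -> bool:
    """Return True for None or a float NaN (absent records show up as NaN in the table)."""
    return record is None or (isinstance(record, float) and math.isnan(record))


def split_entries(pairs):
    """Partition (entry_name, record) pairs into names with and without a record."""
    present, missing = [], []
    for name, record in pairs:
        (missing if is_missing(record) else present).append(name)
    return present, missing


# Hand-made stand-in data; the record values are placeholders, not real objects.
demo = [
    ("C1CCCC1-0", "<record placeholder>"),
    ("CCCCSC1=NN=C(S1)N-0", float("nan")),
    ("CCCC1CCCNC1", None),
]

present, missing = split_entries(demo)
print(f"{len(present)} with a record, {len(missing)} without")
```

With the DataFrame from the snippet above, the analogous call would presumably be `split_entries(recs["record"].items())`, assuming the column really is named `record` as the printed output suggests.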

Two questions:

  1. Do we know what happened to this dataset? Where did it get all its new entries?
  2. What do we want to do about it with regard to this test?

dotsdl commented 2 years ago

cc: @SimonBoothroyd, @JoshHorton

SimonBoothroyd commented 2 years ago

A submission with the wrong name was accidentally merged: https://github.com/openforcefield/qca-dataset-submission/pull/265. Unfortunately, it seems this resulted in its entries being added to this existing, older dataset.

dotsdl commented 2 years ago

Ah gotcha, this makes sense. Considering next steps; I'll make a PR with proposed fixes and we can go from there. Thanks @SimonBoothroyd!