openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License

Update aa_residues_substructures_explicit_bond_orders_with_caps_expli… #1728

Closed pbuslaev closed 10 months ago

pbuslaev commented 10 months ago

This patch should solve #1727

codecov[bot] commented 10 months ago

Codecov Report

Merging #1728 (7c37650) into main (88ce0b3) will decrease coverage by 11.74%. The diff coverage is n/a.

pbuslaev commented 10 months ago

I think the changes I added should generate the correct JSON file. I have also updated the JSON file and fixed a typo in the rdkit_wrapper docstring.

j-wags commented 10 months ago

Wow, thanks for diving into our crazy substructure code! The resulting patterns look good visually, but when I tried loading the PDB file from your original post, it failed with:

E           openff.toolkit.utils.exceptions.UnassignedChemistryInPDBError: Some bonds or atoms in the input could not be identified.
E           
E           Hint: The following residues were assigned names that do not match the residue name in the input, or could not be assigned residue names at all. This may indicate that atoms are missing from the input or some other error. The OpenFF Toolkit requires all atoms, including hydrogens, to be explicit in the input to avoid ambiguities in protonation state or bond order:
E               Input residue  :CYS#0002 contains atoms matching substructures {'No match', 'PEPTIDE_BOND'}
E           
E           Error: The following 5 atoms exist in the input but could not be assigned chemical information from the substructure library:
E               Atom     9 (HA) in residue  :CYS#0002
E               Atom    10 (CB) in residue  :CYS#0002
E               Atom    11 (HB2) in residue  :CYS#0002
E               Atom    12 (HB3) in residue  :CYS#0002
E               Atom    13 (SG) in residue  :CYS#0002

openff/toolkit/utils/rdkit_wrapper.py:818: UnassignedChemistryInPDBError

I tried shaking a few permutations of things like residue names, but to no avail. Are you able to load the original PDB successfully?

In either case, I have an example file and test we can hammer on, but I don't have write access to your fork to send it over. Would you be OK if I merged this into a branch so OpenEye CI stops complaining, and then gave you write access?
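
When triaging a report like the one above, it can help to group the unassigned atoms by residue. The following is a minimal sketch, not part of the toolkit: `unassigned_atoms_by_residue` is a hypothetical helper, and its regular expression is inferred only from the message layout quoted in this thread, so it may need adjusting if the toolkit formats the message differently.

```python
import re
from collections import defaultdict

# Matches lines such as "Atom     9 (HA) in residue  :CYS#0002".
# Assumption: the "chain:RESNAME#RESNUM" layout is taken from the message
# quoted in this thread, not from the toolkit source.
ATOM_LINE = re.compile(
    r"Atom\s+(?P<index>\d+)\s+\((?P<name>[^)]+)\)\s+in residue\s+"
    r"(?P<chain>[^\s:]*):(?P<resname>\w+)#(?P<resnum>\d+)"
)


def unassigned_atoms_by_residue(message):
    """Group unassigned atom names by (chain, residue name, residue number)."""
    grouped = defaultdict(list)
    for match in ATOM_LINE.finditer(message):
        key = (match["chain"], match["resname"], int(match["resnum"]))
        grouped[key].append(match["name"])
    return dict(grouped)


# Excerpt of the error text reported above.
SAMPLE = """
Error: The following 5 atoms exist in the input but could not be assigned chemical information from the substructure library:
    Atom     9 (HA) in residue  :CYS#0002
    Atom    10 (CB) in residue  :CYS#0002
    Atom    11 (HB2) in residue  :CYS#0002
    Atom    12 (HB3) in residue  :CYS#0002
    Atom    13 (SG) in residue  :CYS#0002
"""

print(unassigned_atoms_by_residue(SAMPLE))
# {('', 'CYS', 2): ['HA', 'CB', 'HB2', 'HB3', 'SG']}
```

Here every unassigned atom belongs to the side chain and alpha hydrogen of a single CYS residue, which is consistent with the backbone matching only `PEPTIDE_BOND` and no residue substructure matching the rest.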

pbuslaev commented 10 months ago

> I tried shaking a few permutations of things like residue names, but to no avail. Are you able to load the original PDB successfully?

This is strange. With the updated JSON substructure file, everything works perfectly for me. The generated substructure file contains exactly the same SMARTS pattern that I added manually, so I am not sure why you are getting the error.
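
One way to check whether two environments are really using the same substructure library is to diff the two JSON files residue by residue. The sketch below assumes that the file maps each residue name to a dictionary keyed by SMARTS pattern; that layout is an assumption here and should be confirmed against the actual file. The function names and the placeholder pattern strings are hypothetical and are not real SMARTS.

```python
import json


def load_substructures(path):
    """Read a substructure library from a JSON file."""
    with open(path) as handle:
        return json.load(handle)


def diff_substructures(old, new):
    """Return, per residue, the patterns present in only one of the two libraries.

    Assumed layout (not confirmed by this thread):
    {residue_name: {smarts_pattern: [atom names, ...]}}
    """
    report = {}
    for residue in sorted(set(old) | set(new)):
        old_patterns = set(old.get(residue, {}))
        new_patterns = set(new.get(residue, {}))
        if old_patterns != new_patterns:
            report[residue] = {
                "only_in_old": sorted(old_patterns - new_patterns),
                "only_in_new": sorted(new_patterns - old_patterns),
            }
    return report


# Toy data with placeholder keys standing in for SMARTS patterns.
old_library = {"CYS": {"<pattern-1>": ["N", "CA"]}, "ALA": {"<pattern-3>": ["N"]}}
new_library = {
    "CYS": {"<pattern-1>": ["N", "CA"], "<pattern-2>": ["N", "CA", "SG"]},
    "ALA": {"<pattern-3>": ["N"]},
}

print(diff_substructures(old_library, new_library))
```

With real files, `load_substructures` would be called once on the JSON file in each environment. An empty report would suggest that the discrepancy lies elsewhere, for example in the installed package version or the input PDB.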

> In either case, I have an example file and test we can hammer on, but I don't have write access to your fork to send it over. Would you be OK if I merged this into a branch so OpenEye CI stops complaining, and then gave you write access?

Yes, this is fine with me.