Closed: jthorton closed this issue 3 months ago
@jthorton Sorry for the short reply, this is from an airport... The current from_pdbfile loading is vendored from OpenMM; it's a bit limited and restricted to standard amino acids. We're looking to do something smarter eventually (see #135). I've got a pet project here: https://github.com/OpenFreeEnergy/pdbinf
Basically your PDB file is wrong: residues 1007 and 1008 should have resname PTR, not TYR. (As a second gripe, CONECT records shouldn't make a difference for residues labelled as ATOM rather than HETATM etc.) If I download the CIF template from here: https://www.rcsb.org/ligand/PTR I can then load the file via pdbinf into an RDKit molecule, and gufe can then ingest that RDKit mol.
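For the relabelling step, here is a minimal sketch in plain Python of renaming the two residues from TYR to PTR. It is not part of gufe or pdbinf; the function name rename_residues is made up for illustration. It relies only on the fixed-column PDB layout (resName in columns 18-20, resSeq in columns 23-26) and only touches ATOM/HETATM lines, so TER or ANISOU records would need the same treatment if your file has them.

```python
def rename_residues(pdb_text, resseqs, old="TYR", new="PTR"):
    """Return pdb_text with resName `old` replaced by `new` for the given residue numbers.

    Hypothetical helper for illustration; only ATOM/HETATM records are edited.
    """
    out = []
    for line in pdb_text.splitlines():
        # Coordinate records carry resName in columns 18-20 and resSeq in 23-26
        if line.startswith(("ATOM  ", "HETATM")) and len(line) >= 26:
            resname = line[17:20]
            resseq = int(line[22:26])
            if resname == old and resseq in resseqs:
                line = line[:17] + new + line[20:]
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "ATOM      1  N   TYR A1007\n"
        "ATOM      2  N   TYR A 500\n"
    )
    # Only residue 1007 is relabelled; residue 500 stays TYR
    print(rename_residues(example, {1007, 1008}), end="")
```

The relabelled file can then go through the pdbinf route described above, with the PTR template supplied alongside the standard amino acids.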
Longer term, there's also code I'm playing with that does this residue guessing automatically, see: https://github.com/OpenFreeEnergy/pdbinf/blob/main/notebooks/tpo_guessing_demo.ipynb This should make it just magically work even if things are labelled incorrectly etc.
Also, this paper/software is somewhat relevant: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-023-00786-w
Thanks @richardjgowers, this is testing my system preparation skills and they still need some work! I tried the suggested pathway and it worked great. The demo notebook looks fantastic as well, excited to see this be part of gufe!
Ok cool, glad it worked. Yeah, one thing holding up switching the backend is that we/I have a lot of confidence in the vendored OpenMM code as it's had a lot of usage, and pdbinf has seen much less usage so far, so it's good that it's worked for you on this corner case!
When loading a protein with a modified amino acid (in this case, two phosphorylated tyrosines), the connectivity is incorrectly interpreted as 0, which causes the atoms to be identified as ions, and an error is raised. It seems like the bond order here is being set to BondType.UNSPECIFIED despite the CONECT records in the PDB file.
Attached file: jak2_FEP_receptor_RELAXED_capped.txt
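To check what connectivity the PDB file itself declares, a small standard-library sketch like the one below can list the bonded partners recorded for each atom serial. The function name conect_partners is made up for illustration, and this only reads the CONECT records (serial numbers in five-character fields starting at column 7); it says nothing about how gufe or the vendored OpenMM reader interprets them.

```python
from collections import defaultdict


def conect_partners(pdb_text):
    """Map each atom serial to the set of serials it is bonded to via CONECT records.

    Hypothetical helper for illustration; bond orders are not inferred.
    """
    partners = defaultdict(set)
    for line in pdb_text.splitlines():
        if not line.startswith("CONECT"):
            continue
        # Columns 7-11 hold the atom serial, columns 12-31 up to four bonded serials
        fields = [line[i:i + 5].strip() for i in range(6, 31, 5)]
        serials = [int(f) for f in fields if f]
        if not serials:
            continue
        atom, bonded = serials[0], serials[1:]
        for other in bonded:
            # Record the bond in both directions
            partners[atom].add(other)
            partners[other].add(atom)
    return dict(partners)


if __name__ == "__main__":
    example = "CONECT    1    2    3\nCONECT    4\n"
    for serial, bonded in sorted(conect_partners(example).items()):
        print(serial, sorted(bonded))
```

Atoms of the modified residues that are missing from this mapping have no bonds declared in the file at all.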