openmm / pdbfixer

PDBFixer fixes problems in PDB files

Allow initialization from topology, positions, and sequence. #263

Open jonathanking opened 1 year ago

jonathanking commented 1 year ago

This allows the user to fix a protein structure that has already been loaded by OpenMM. I have used this in my research to fix an already-parsed structure directly, instead of saving it to a PDB file and reloading it with PDBFixer.

This PR and the one before it, #262, are my first contributions. Please let me know if there is anything I can change, or if there is anything I have misunderstood. These modifications have been useful in my work, and I thought I might share them here as well.

peastman commented 1 year ago

I've been thinking about this, trying to decide whether it's the right approach. It would help if you could describe your use case a bit more. For example, you provide two different ways of specifying the sequence: by one-letter or three-letter codes. Are both really needed? On the other hand, both only support a single chain. You really need a separate sequence for each chain.
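To make the question concrete, here is a minimal sketch of how a single input format could cover both cases while handling multiple chains. The helper `normalize_chain_sequences` and its argument layout are hypothetical and are not part of this PR or of PDBFixer. It only assumes the standard 20 amino acids.

```python
# Standard one-letter to three-letter amino acid codes (20 canonical residues).
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def normalize_chain_sequences(sequences):
    """Hypothetical helper: return one list of three-letter codes per chain.

    Each entry in `sequences` describes one chain and may be either a
    one-letter string ("MKV") or a list of three-letter codes
    (["MET", "LYS", "VAL"]).
    """
    normalized = []
    for seq in sequences:
        if isinstance(seq, str):
            # One-letter string: translate each character.
            normalized.append([ONE_TO_THREE[c] for c in seq.upper()])
        else:
            # Already three-letter codes: just normalize the case.
            normalized.append([name.upper() for name in seq])
    return normalized
```

With something like this, the caller passes one entry per chain, and the one-letter form becomes a convenience rather than a second code path.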

And then, is specifying the sequence really the best approach? It requires logic to align the sequence with the residues, and that isn't always completely reliable. When reading a file we need to do it, because that's what the file contains. But if you're calling it programmatically and you already know what residues you want to add, would it be better to skip the sequence matching step and just have the user set missingResidues directly?
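As a rough illustration of the alternative, the sketch below builds a dictionary in the shape PDBFixer uses for `missingResidues`: keys are `(chain index, residue index)` tuples and values are lists of residue names to insert at that position. The helper `build_missing_residues` is hypothetical, and the commented usage assumes a fixer created from a file named `input.pdb`, which is only a placeholder.

```python
def build_missing_residues(insertions):
    """Hypothetical helper: collect insertions into a missingResidues-style dict.

    `insertions` is an iterable of (chain_index, residue_index, names), where
    `names` is a list of three-letter residue names to insert before the
    residue at `residue_index` in that chain.
    """
    missing = {}
    for chain_index, residue_index, names in insertions:
        # Several entries for the same position are concatenated in order.
        missing.setdefault((chain_index, residue_index), []).extend(names)
    return missing


missing = build_missing_residues([
    (0, 0, ["MET", "ALA"]),   # two residues at the start of chain 0
    (1, 5, ["GLY"]),          # one residue before index 5 of chain 1
])

# Assumed usage with PDBFixer (not executed here):
#   from pdbfixer import PDBFixer
#   fixer = PDBFixer(filename="input.pdb")   # placeholder file name
#   fixer.missingResidues = missing          # skip findMissingResidues()
#   fixer.findMissingAtoms()
#   fixer.addMissingAtoms()
```

This skips the sequence alignment entirely, at the cost of requiring the caller to know the exact insertion positions.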