volkamerlab / opencadd

A Python library for structural cheminformatics
https://opencadd.readthedocs.io
MIT License

mmligner can't read pdb files written by atomium #9

Closed by Koesed96 4 years ago

Koesed96 commented 4 years ago

Basically the title.

Full error message:

Parsing Brookhaven PDB      ...4u3y.pdb           ::PDB_PARSE_ERROR:: The parser stumbled into an unrecognized record type at Line 4860 in File: 4u3y.pdb

::PDB_PARSE_ERROR:: The file: 4u3y.pdb is either NOT a PDB format file, or  does NOT conform to its basic requirements.

The problem is that atomium shortens lines beginning with TER. For example, the original line:

TER 2442 LYS A 311

After parsing and saving with atomium:

TER

Parsing was done with atomium.fetch("4u3y.pdb").model and saving with .save("./4u3y.pdb"). After replacing the changed line with the original, the file can be parsed until it reaches the next TER line.

jaimergp commented 4 years ago

What if you fetch from the id directly?

atomium.fetch("4u3y").model

Koesed96 commented 4 years ago

That doesn't fix the problem.

The problem is that mmligner seems to expect an atom number on every line. I have fixed the problem with a method that copies the atom number from the atom above the TER line onto the TER line. I'm not sure if that's the solution we should use, but I've got a working example for now.
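
For illustration, the workaround described here could look roughly like the following. This is a minimal sketch, not the method actually used in opencadd: the function name `patch_bare_ter_records` is hypothetical, and it assumes fixed-column PDB lines in which columns 7-11 of an ATOM/HETATM record hold the atom serial number.

```python
def patch_bare_ter_records(lines):
    """Give bare ``TER`` lines the serial number of the preceding atom record.

    ``lines`` is an iterable of PDB lines without trailing newlines.
    Hypothetical helper; assumes standard fixed-column PDB formatting.
    """
    patched = []
    last_atom = None  # most recent ATOM/HETATM line seen so far
    for line in lines:
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            last_atom = line
        elif record == "TER" and not line[6:].strip() and last_atom is not None:
            # Columns 7-11 (0-based slice 6:11) hold the atom serial number.
            serial = last_atom[6:11]
            line = "TER   " + serial
        patched.append(line)
    return patched
```

The lines can be read with `pathlib.Path(...).read_text().splitlines()`, patched, and joined back with newlines before the file is handed to mmligner. TER lines that already carry fields are left untouched.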

jaimergp commented 4 years ago

Awesome, glad you fixed it! We should report the bug to the atomium devs so they fix it upstream. Do you want to do that and get your open source contributor points? I can also do that, but it'd be a valuable experience if you want to proceed!

Koesed96 commented 4 years ago

If you could give me a small heads-up tomorrow, I would like to do that.

jaimergp commented 4 years ago

It would mainly consist of:

  1. Fork https://github.com/samirelanduk/atomium
  2. Create a branch
  3. Edit this line to include the previous atom serial number.
  4. Add, commit, push, create PR, wait for Sam :)

jaimergp commented 4 years ago

Oh, actually, you need to add more fields:

https://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#TER

You can claim it as "Comply with PDB TER records spec"
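
As a rough sketch of what a TER line with all fields from the linked spec looks like: columns 1-6 hold the record name, 7-11 the serial number, 18-20 the residue name, 22 the chain ID, 23-26 the residue sequence number, and 27 the insertion code. The function name `format_ter_record` and its parameters are hypothetical and are not part of atomium's API; which serial number to pass in is left to the caller.

```python
def format_ter_record(serial, res_name, chain_id, res_seq, i_code=" "):
    """Format a fixed-column PDB TER record (hypothetical helper).

    Column layout follows the wwPDB format 3.3 TER definition.
    """
    line = (
        "TER   "             # columns 1-6: record name
        + f"{serial:>5d}"    # columns 7-11: serial number
        + " " * 6            # columns 12-17: unused
        + f"{res_name:>3s}"  # columns 18-20: residue name
        + " "                # column 21: unused
        + f"{chain_id:1s}"   # column 22: chain identifier
        + f"{res_seq:>4d}"   # columns 23-26: residue sequence number
        + f"{i_code:1s}"     # column 27: insertion code
    )
    return line.ljust(80)    # PDB lines are conventionally 80 columns wide


record = format_ter_record(2442, "LYS", "A", 311)
```

Here `record` carries the same fields as the original TER line quoted at the top of this issue, placed in their fixed columns.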

jaimergp commented 4 years ago

Issue raised here: https://github.com/samirelanduk/atomium/issues/25

jaimergp commented 4 years ago

We have changed to MDAnalysis for the Structure object now, which makes mmligner choke too :/

jaimergp commented 4 years ago

Well, not anymore!