kliment-olechnovic / voronota

Voronota is a software tool for analyzing three-dimensional structures of biological macromolecules using the Voronoi diagram of atomic balls.
https://kliment-olechnovic.github.io/voronota/
MIT License

compatibility with 2 letter chain names #5

Open kcollins24 opened 7 months ago

kcollins24 commented 7 months ago

Hello, it's fairly common for large PDB structures to have two-letter chain names (I think up to three letters is possible, though I've never seen one). As far as I can tell, voronota reads only one letter of the chain name: if there are two, only the first character is read and the atom numbering can become strange. I'm aware there is some capability to rename chains, but some structures have more chains than there are single-character chain names, so two-character names are sometimes necessary (especially when consolidating multiple models into one structure, which voronota does).

Are there any plans to make voronota compatible with 2 letter chain names?

kliment-olechnovic commented 7 months ago

Hello,

Thanks for the question; it is an important issue.

Longer chain names should be supported when the input is in mmCIF format, but multi-character chain names have not been tested extensively. Could you provide some PDB IDs for which the mmCIF input failed for you?
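To illustrate why mmCIF has no such width limit, here is a minimal Python sketch, not voronota's actual reader. It assumes a simple `_atom_site` loop with whitespace-separated values and no quoted tokens, and it reads the `_atom_site.auth_asym_id` column; the function name `read_atom_site_chains` and the sample data are hypothetical.

```python
def read_atom_site_chains(cif_text: str) -> list[str]:
    """Collect the chain name of every row in a simple _atom_site loop.

    Assumption: values are whitespace-separated and unquoted, which holds
    for this toy input but not for every real mmCIF file.
    """
    columns: list[str] = []
    chains: list[str] = []
    in_atom_site = False
    for raw in cif_text.splitlines():
        line = raw.strip()
        if line.startswith("_atom_site."):
            # Header part of the loop: remember the column order.
            columns.append(line.split()[0])
            in_atom_site = True
        elif in_atom_site:
            # A blank line, comment, or new block ends the loop.
            if not line or line.startswith(("#", "loop_", "_", "data_")):
                break
            row = dict(zip(columns, line.split()))
            chains.append(row["_atom_site.auth_asym_id"])
    return chains


# Made-up sample data with two-letter chain names.
SAMPLE_CIF = """
data_demo
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.auth_asym_id
_atom_site.auth_seq_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM 1 N MET AA 1 27.340 24.430 2.614
ATOM 2 CA MET AA 1 26.266 25.413 2.842
ATOM 3 N GLY AB 1 10.000 11.000 12.000
#
"""

chains = read_atom_site_chains(SAMPLE_CIF)
print(sorted(set(chains)))
```

Because each value is a token rather than a fixed column, a chain name of any length is read whole.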

In the PDB input format, only one character is dedicated to the chain name (according to https://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#ATOM), so I am not sure how reading longer chain names should be implemented in such cases. Some example inputs would be very helpful.
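For reference, the fixed-column layout can be sketched in Python as follows. The column positions follow the ATOM record in the linked format 3.3 description (chain identifier in column 22, residue sequence number in columns 23-26); the helper names `parse_atom_record` and `format_atom_record` are hypothetical, and this is not how voronota itself parses input.

```python
def parse_atom_record(line: str) -> dict:
    """Read a few fixed-width fields from a PDB ATOM line (format 3.3)."""
    return {
        "serial": int(line[6:11]),      # columns 7-11
        "name": line[12:16].strip(),    # columns 13-16
        "res_name": line[17:20].strip(),  # columns 18-20
        "chain_id": line[21],           # column 22: exactly one character
        "res_seq": int(line[22:26]),    # columns 23-26, right after the chain
        "x": float(line[30:38]),        # columns 31-38
        "y": float(line[38:46]),        # columns 39-46
        "z": float(line[46:54]),        # columns 47-54
    }


def format_atom_record(serial, name, res_name, chain_id, res_seq, x, y, z) -> str:
    """Hypothetical writer that places the same fields at their columns."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain_id}"
        f"{res_seq:>4}    {x:8.3f}{y:8.3f}{z:8.3f}"
    )


# A one-letter chain fits the layout and parses cleanly.
good = format_atom_record(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614)
record = parse_atom_record(good)
print(record["chain_id"], record["res_seq"])

# A two-letter chain written naively pushes every later field one column right:
# column 22 holds only the first letter, and the second letter lands in the
# residue-number field.
bad = format_atom_record(1, "N", "MET", "AB", 1, 27.340, 24.430, 2.614)
print(repr(bad[21]), repr(bad[22:26]))
```

In this toy case the reader sees chain `A` and a residue-number field that begins with `B`, one plausible way for numbering to go wrong when a longer chain name is forced into the fixed-width format.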