ChemBioHTP / EnzyHTP

EnzyHTP is a Python library that automates the complete life cycle of enzyme modeling
https://enzyhtp-doc.readthedocs.io

Structure parser chain ID issue #98

Open KleinesMesser opened 1 year ago

KleinesMesser commented 1 year ago

If the structure parser has to assign more than 9 new chain IDs, the 10th unique chain ID becomes 10. The PDB format reserves a single column for the chain ID, so the two-character ID pushes all later fields on each line one column to the right, which causes problems when reading the resulting PDB file.
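To make the shift concrete, here is a minimal sketch of a writer that pastes the chain ID into an ATOM line without limiting its width. `format_atom_line` is a hypothetical helper written for this illustration, not EnzyHTP's actual writer; the column positions follow the fixed-column ATOM record layout (chain ID in column 22, residue number in columns 23-26, x coordinate in columns 31-38).

```python
def format_atom_line(chain_id, serial=1, name=" CA ", res_name="ALA",
                     res_seq=1, x=12.345, y=0.0, z=0.0):
    """Hypothetical, simplified ATOM line writer (not EnzyHTP code).

    The chain ID is inserted as-is, so anything wider than one
    character shifts every later field to the right.
    """
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} "
        f"{chain_id}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


good = format_atom_line("A")
bad = format_atom_line(10)

# A fixed-column reader slices by position (0-based slices below).
for label, line in (("chain A ", good), ("chain 10", bad)):
    print(label, "| chain:", repr(line[21]),
          "| resSeq:", repr(line[22:26]),
          "| x:", repr(line[30:38]))
```

With the two-character ID, the reader sees only the first character of the chain ID, picks up the second character as part of the residue number, and truncates the last digit of the x coordinate.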

Example: PDB ID 7tpt. See the screenshot in the attached figure.

Suggested change: Assign chain IDs in the order A-Z, a-z, then 0-9. This way, up to 62 unique single-character chain IDs can be generated. If the input structure has more than 62 unique chains (a very extreme case), the get_structure() function can assign integers starting from 10. However, because of the limitation of the PDB format, these chain IDs are not supposed to be printed out by the get_file_str() function.
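A minimal sketch of the suggested assignment scheme, for discussion only. The names `chain_id_candidates` and `assign_new_chain_ids` are hypothetical and do not exist in EnzyHTP; how get_structure() would actually call such a helper is an assumption.

```python
import string
from itertools import count

# Proposed order: A-Z, then a-z, then 0-9 (62 single-character IDs).
SINGLE_CHAR_CHAIN_IDS = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits
)


def chain_id_candidates():
    """Yield chain IDs in the proposed order, then integers from 10."""
    yield from SINGLE_CHAR_CHAIN_IDS
    # Overflow IDs are kept as integers. They are wider than one
    # column, so they should never be written to a PDB file.
    yield from count(10)


def assign_new_chain_ids(used_ids, n_new):
    """Return n_new chain IDs that are not already in used_ids."""
    new_ids = []
    for candidate in chain_id_candidates():
        if len(new_ids) == n_new:
            break
        if candidate not in used_ids:
            new_ids.append(candidate)
    return new_ids


# Chains A and B already exist in the input; three more are needed.
print(assign_new_chain_ids({"A", "B"}, 3))
```

The 63rd and later IDs come out as the integers 10, 11, and so on, matching the overflow case described above.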

Besides, the current get_file_str() function outputs chains in the order 0-9, A-Z, a-z. A better order would consistently be A-Z, a-z, then 0-9. The later integer chain IDs should also be sorted as integers rather than as strings.

[Attached figure: 7tpt_output_example]
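One way to get the proposed output order is a custom sort key. This is only a sketch: `chain_sort_key` is a hypothetical name, and where get_file_str() actually sorts chains is an assumption. For reference, Python's default string comparison orders characters as 0-9, A-Z, a-z, which matches the current output order.

```python
import string

# Rank of each single-character ID in the proposed order A-Z, a-z, 0-9.
_RANK = {
    char: rank
    for rank, char in enumerate(
        string.ascii_uppercase + string.ascii_lowercase + string.digits
    )
}


def chain_sort_key(chain_id):
    """Sort single-character IDs first, then overflow IDs by value."""
    if isinstance(chain_id, str) and chain_id in _RANK:
        return (0, _RANK[chain_id])
    # Overflow IDs (10, 11, ...) compare numerically, so 9 < 10 < 100
    # instead of the string order "10" < "100" < "9".
    return (1, int(chain_id))


chain_ids = ["0", "a", "B", 100, "A", 11, 10, "z", "9"]
print(sorted(chain_ids, key=chain_sort_key))
```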

shaoqx commented 1 year ago

Great discussion today! Thank you for the issue!