If the structure parser has to assign more than 9 new chain IDs, the 10th unique chain ID is 10. This two-character ID shifts every later field on each line one column to the right, which causes problems when reading the resulting PDB file.
Example: PDB ID 7tpt. See the screenshot in the attached figure.
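To illustrate the column shift, here is a minimal sketch. `atom_line` is a hypothetical, deliberately naive formatter written for this issue, not the library's actual writer; it only follows the fixed-column layout of a PDB ATOM record (chain ID in column 22, x coordinate in columns 31-38).

```python
def atom_line(chain_id, serial=1, name="CA", res_name="ALA", res_seq=1,
              x=12.345, y=0.0, z=0.0):
    """Hypothetical naive ATOM formatter: the chain ID is written as-is,
    with no check that it fits into the single column reserved for it."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain_id}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


line_ok = atom_line("A")    # one-character chain ID fits column 22
line_bad = atom_line("10")  # two-character chain ID pushes everything right

# A fixed-column reader takes the x coordinate from columns 31-38.
x_ok = float(line_ok[30:38])
x_bad = float(line_bad[30:38])

print(len(line_ok), len(line_bad))
print(x_ok, x_bad)
```

With the two-character ID the line is one character longer, and a reader that slices columns 31-38 no longer recovers the original x coordinate.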
Suggested change: Assign chain IDs in the order A-Z, a-z, 0-9. This way, up to 62 unique single-character chain IDs can be generated. If the input structure has more than 62 unique chains (a very extreme case), the `get_structure()` function can assign integers starting from 10. However, because of the limitations of the PDB format, these integer chain IDs should not be written out by the `get_file_str()` function.
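A rough sketch of the proposed assignment scheme follows. The names `SINGLE_CHAR_IDS`, `assign_chain_ids`, and `is_pdb_writable` are made up for illustration and are not part of the existing code; the sketch also ignores chain IDs already present in the input, which a real implementation would need to skip.

```python
import string

# Proposed order: A-Z, then a-z, then 0-9 (62 single-character IDs in total).
SINGLE_CHAR_IDS = string.ascii_uppercase + string.ascii_lowercase + string.digits


def assign_chain_ids(n_chains):
    """Return n_chains new chain IDs (hypothetical helper).

    The first 62 are single characters; any further chains get
    integers starting from 10, as suggested above.
    """
    ids = list(SINGLE_CHAR_IDS[:n_chains])
    overflow = n_chains - len(ids)
    ids.extend(range(10, 10 + overflow))
    return ids


def is_pdb_writable(chain_id):
    """Only single-character string IDs fit the PDB chain ID column."""
    return isinstance(chain_id, str) and len(chain_id) == 1


ids = assign_chain_ids(64)
writable = [c for c in ids if is_pdb_writable(c)]
```

Under this sketch, a writer such as `get_file_str()` would emit only the chains in `writable` and skip (or warn about) the integer IDs.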
Besides, the current `get_file_str()` function outputs chains in the order 0-9, A-Z, a-z. A better order would consistently be A-Z, a-z, 0-9. Later integer chain IDs should be sorted numerically as integers rather than lexicographically as strings.
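The ordering could be expressed as a sort key along these lines. `chain_sort_key` is a hypothetical name, and the sketch assumes single-character IDs are stored as `str` and overflow IDs as `int`.

```python
import string

# Desired output order for single-character IDs: A-Z, a-z, 0-9.
CHAIN_ORDER = string.ascii_uppercase + string.ascii_lowercase + string.digits


def chain_sort_key(chain_id):
    """Sort single-character IDs by the proposed order, then integer IDs
    numerically (hypothetical helper, not the current implementation)."""
    if isinstance(chain_id, int):
        return (1, chain_id)
    return (0, CHAIN_ORDER.index(chain_id))


chains = ["0", "b", "A", 11, "a", 10, "Z", 100]
ordered = sorted(chains, key=chain_sort_key)
print(ordered)
```

Sorting the integer IDs as strings would place `100` before `11`; comparing them as integers keeps them in numeric order.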
![7tpt_output_example](https://user-images.githubusercontent.com/73961654/202046855-369eb1d9-aeec-44cb-91a8-828cb7534109.png)