Closed jacob-r-anderson closed 2 years ago
Can you please provide a PDB file (you will need to add .txt
suffix or compress as ZIP) so we can reproduce the error?
According to https://www.wwpdb.org/documentation/file-format-content/format33/sect9.html:
A model cannot have more than 99,999 atoms. Where the entry does not contain an ensemble of models, then the entry cannot have more than 99,999 atoms. Entries that go beyond this atom limit must be split into multiple entries, each containing no more than the limits specified above.
@intendo has a good point. @jacob-r-anderson do you have this structure in CIF format?
@intendo @sobolevnrm PDBs with over 99,999 atoms are the primary data object we have worked with for several years. One is attached. Other software packages (Coot, Chimera, ChimeraX, Phenix) handle these PDBs.
@sobolevnrm should we create a new issue (feature request) to handle PDB files that don't conform to the PDB 3.3 format? It would require a change to the parser and then propagate throughout the code base from there.
Yes, I'll create a low-priority issue for this. In the meantime, @jacob-r-anderson can you try CIF format?
pdb2pqr 3.4.1, Ubuntu 18.04.
I have a large protein I am hoping to input to pdb2pqr30. Oddly, this works on an older version of pdb2pqr that was packaged with pymol (--version doesn't give me a version number with my pymol version). When I run with pdb2pqr30 I get this error for the same input file.
ValueError: invalid literal for int() with base 10: 'A0000'
If I check the file for lines containing this 'A0000' I find this as the first hit.
I presume this is due to a problem handling the letter A in the second column.
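For anyone hitting the same error: a value like 'A0000' in the atom serial column is what the hybrid-36 convention produces once a file goes past serial 99999. This thread does not confirm which convention the attached file uses, so treat that as an assumption. Below is a minimal sketch of decoding such a field in Python. The function name `decode_serial` is hypothetical and is not part of pdb2pqr; it only illustrates why a plain `int()` call fails on this value and what a tolerant parser could do instead.

```python
def decode_serial(field: str, width: int = 5) -> int:
    """Decode a fixed-width PDB serial field, assuming hybrid-36 encoding.

    Hypothetical helper for illustration only; not pdb2pqr's parser.
    """
    text = field.strip()
    if not text:
        raise ValueError("empty serial field")

    first = text[0]
    # Plain decimal values (what the PDB 3.3 format allows) pass straight through.
    if first.isdigit() or first == "-":
        return int(text)

    # Hybrid-36 values always fill the whole field.
    if len(text) != width:
        raise ValueError(f"invalid hybrid-36 field: {field!r}")

    if first.isupper() and text == text.upper():
        # Upper-case range starts right after the largest decimal value.
        return int(text, 36) - 10 * 36 ** (width - 1) + 10 ** width
    if first.islower() and text == text.lower():
        # Lower-case range continues after the upper-case range is exhausted.
        return int(text, 36) + 16 * 36 ** (width - 1) + 10 ** width

    raise ValueError(f"invalid hybrid-36 field: {field!r}")


if __name__ == "__main__":
    for raw in ("99999", "A0000", "A0001"):
        print(raw, decode_serial(raw))
```

Under this assumption 'A0000' decodes to 100000, the serial that immediately follows 99999, which fits a file with more than 99,999 atoms.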