Open butlerpd opened 2 months ago
I did more digging on this and my initial guess was wrong. The issue is related to creating atomic connections to lines in files that seem to be individual atoms, but that aren't read in as atoms by the PDB reader.
To reproduce:
1. Load `1n04.pdb` as Nuclear Data.
2. `list index out of range` errors are thrown.

More information:
In the `1n04.pdb` file, after line 5801, the label for the atoms changes from `ATOM` to `TER` and then to `HETATM`. The `ATOM` lines are loaded as atomic points, but the later values are ignored. When creating atomic connections (`CONECT` lines), many connections link atoms labeled as `TER` and `HETATM`, but these are not in the list of atoms generated by the PDB reader. This is where the index out of range errors are coming from.
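The mismatch described above can be checked directly on a file. This is a minimal diagnostic sketch, not the SasView reader: `missing_conect_serials` is a hypothetical helper, and it assumes fixed-column records with the serial number in columns 7-11 and `CONECT` serials in five-character fields up to column 31.

```python
def missing_conect_serials(lines, atom_records=("ATOM",)):
    """Return CONECT serial numbers that have no matching coordinate record.

    Hypothetical helper for diagnosis only. `atom_records` lists the record
    names treated as atoms, mimicking what a reader chooses to load.
    """
    known = set()
    referenced = set()
    for line in lines:
        record = line[:6].strip()
        if record in atom_records:
            # Atom serial number: columns 7-11 (0-based slice 6:11).
            known.add(int(line[6:11]))
        elif record == "CONECT":
            # Up to five serial numbers in five-character fields, columns 7-31.
            for start in range(6, 31, 5):
                field = line[start:start + 5].strip()
                if field:
                    referenced.add(int(field))
    return sorted(referenced - known)


# Tiny made-up example, not taken from 1n04.pdb.
sample = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614",
    "TER       2      MET A   1",
    "HETATM    3 FE    FE A 201      10.000  11.000  12.000",
    "CONECT    1    3",
]

print(missing_conect_serials(sample))                      # [3]
print(missing_conect_serials(sample, ("ATOM", "HETATM")))  # []
```

Running the helper on the real file with only `ATOM` accepted, and then with `HETATM` added, would show how many of the `CONECT` references each choice leaves dangling.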
The real issue: This could be one of two things. The first is malformed data sets: the files might not have the correct labels on each line. The second is that the reader does not read all values from the file correctly. The first is harder to handle, and we would need to know what alternate nomenclature is used for atomic lines in PDB files.
To start the diagnosis, we need the PDB file specification to figure out which it is. If anyone has it, please link/attach it here.
Found it: https://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html
Reading through this, `HETATM` records represent "non-standard" chemical coordinates. These will likely need to be read in.
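One way to read these records in is sketched below. It is an illustration under assumptions, not the existing reader: `read_coordinates` and the dictionary keys are hypothetical names, and the column positions follow the fixed-width layout in the linked v3.3 format (serial in columns 7-11, atom name in 13-16, x/y/z in 31-38, 39-46, 47-54). Because `TER` lines also consume a serial number, the sketch maps each serial to its position in the atom list rather than using the serial as an index.

```python
COORD_RECORDS = ("ATOM", "HETATM")


def read_coordinates(lines):
    """Parse ATOM and HETATM records; return (atoms, index_of).

    `index_of` maps a PDB serial number to the atom's position in `atoms`,
    so CONECT serials can be resolved without assuming serial == index.
    """
    atoms = []
    index_of = {}
    for line in lines:
        record = line[:6].strip()
        if record not in COORD_RECORDS:
            continue  # TER, CONECT, REMARK, ... carry no coordinates
        serial = int(line[6:11])
        atoms.append({
            "serial": serial,
            "name": line[12:16].strip(),
            "hetero": record == "HETATM",
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
        index_of[serial] = len(atoms) - 1
    return atoms, index_of


# Tiny made-up example, not taken from a real structure.
example = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614",
    "TER       2      MET A   1",
    "HETATM    3 FE    FE A 201      10.000  11.000  12.000",
]

atoms, index_of = read_coordinates(example)
print(len(atoms))       # 2
print(index_of[3])      # 1
```

A `CONECT` serial can then be looked up with `index_of.get(serial)`, which returns `None` for a serial with no coordinate record instead of raising an error.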
Describe the bug When loading some PDB files, such as the Apoferritin one, a slew of errors gets thrown. @krzywon says these are bogus and are due to something in the faster reading of PDB files. The file eventually loads, can be drawn, and a curve can be generated.
To Reproduce Steps to reproduce the behavior:
Log Explorer
Expected behavior No errors thrown