Open zacharyrs opened 3 years ago
Unfortunately this is breaking cross compatibility with mmtf-python, which by default dumps everything as 64bit floats (doubles).
The msgpack-python implementation doesn't support packing a particular field (the transforms) as 64bit floats, and everything else as 32bit floats - see here.
I have a partial workaround, by making mmtf-python follow the same decisions as here (all 32bit except the transforms list) - https://github.com/rcsb/mmtf-python/issues/50.
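To make the "all 32bit except the transforms list" idea concrete, here is a rough sketch of what mixed-width packing looks like at the byte level. It uses only the standard `struct` module and the MessagePack float 32 (`0xca`) and float 64 (`0xcb`) type bytes. It is not the msgpack-python API and not the actual mmtf-python change; `pack_float` and `pack_float_array` are hypothetical helper names for illustration only.

```python
import struct


def pack_float(value: float, double: bool = False) -> bytes:
    """Encode one float as MessagePack float 32 (0xca) or float 64 (0xcb)."""
    if double:
        return b"\xcb" + struct.pack(">d", value)
    return b"\xca" + struct.pack(">f", value)


def pack_float_array(values, double: bool = False) -> bytes:
    """Encode a flat list of floats as a MessagePack array of one float width."""
    n = len(values)
    if n <= 15:
        header = bytes([0x90 | n])               # fixarray
    elif n <= 0xFFFF:
        header = b"\xdc" + struct.pack(">H", n)  # array 16
    else:
        header = b"\xdd" + struct.pack(">I", n)  # array 32
    return header + b"".join(pack_float(v, double) for v in values)


# Hypothetical data: a 4x4 identity transform, flattened to 16 values.
identity = [1.0, 0.0, 0.0, 0.0,
            0.0, 1.0, 0.0, 0.0,
            0.0, 0.0, 1.0, 0.0,
            0.0, 0.0, 0.0, 1.0]

transform_bytes = pack_float_array(identity, double=True)  # doubles
other_bytes = pack_float_array([1.5, 2.5, 3.5])            # singles
```

Each double costs 9 bytes on the wire and each single costs 5, so the width is chosen per value by the encoder. That is why a single global "use single floats" switch cannot express this mix.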
Good catch @zacharyrs! Thanks for the detailed report.
Changing the RCSB mmtf files is doable but as you say may cause quite some trouble. I like your python workaround as a solution. However, to be consistent the spec would have to officially acknowledge that ncsOperList uses doubles, right?
One important note: MMTF is now in minimal maintenance mode. The preferred compressed format for PDB data is BinaryCIF.
Thanks @josemduarte!
Yes, the python solution basically just means both implementations violate the specification in the same way. It avoids the hassle of breaking things.
I didn't realise mmtf had been dropped to maintenance... I assume BinaryCIF follows the CIF spec, it's just encoded?
I recall CIF not caring about bond information, which was what I liked about mmtf - I guess I'll have to read into it more.
> I assume BinaryCIF follows the CIF spec, it's just encoded?
Yes, that's correct
> I recall CIF not caring about bond information, which was what I liked about mmtf - I guess I'll have to read into it more.
Bond information is available but indirectly via the chemical component dictionary
> Bond information is available but indirectly via the chemical component dictionary
Is that guaranteed for all molecules or is it optional?
The chemical component dictionary contains all intra-residue bond information. But it is not embedded within the structure BCIF files. We will consider offering the whole chemical component dictionary as one BCIF bundle that should make it more convenient to use.
When unpacking an mmtf file, this implementation expects doubles for the transformation matrices. The specification outlines the float type as 32bit, and says this field is populated with floats. Not sure if this should be changed - I suspect it might break parsing existing mmtf files, so maybe it needs to accept both types?
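As a rough illustration of "accept both types", a decoder can branch on the MessagePack type byte rather than assume one width. The sketch below uses only the standard `struct` module; `unpack_float` is a hypothetical helper, not part of this implementation or of any msgpack library.

```python
import struct


def unpack_float(buf: bytes, offset: int = 0):
    """Decode one MessagePack float at `offset`; return (value, next_offset)."""
    tag = buf[offset]
    if tag == 0xCA:  # float 32: 4 bytes, big-endian IEEE 754
        (value,) = struct.unpack_from(">f", buf, offset + 1)
        return value, offset + 5
    if tag == 0xCB:  # float 64: 8 bytes, big-endian IEEE 754
        (value,) = struct.unpack_from(">d", buf, offset + 1)
        return value, offset + 9
    raise ValueError(f"expected a MessagePack float, got type byte 0x{tag:02x}")


# The same value written by a 32bit writer and by a 64bit writer.
single = b"\xca" + struct.pack(">f", 0.5)
double = b"\xcb" + struct.pack(">d", 0.5)

value_from_single, _ = unpack_float(single)
value_from_double, _ = unpack_float(double)
```

Both calls return a Python float, so code downstream of the decoder does not need to know which width the file used. Values that are not exactly representable in 32 bits will still differ slightly between the two encodings.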