rcsb / mmtf

The specification of the MMTF format for biological structures
http://mmtf.rcsb.org/

[feature request] Please support the secondary structure information and other essential information supported by the pdb format #41

Closed yurivict closed 5 years ago

yurivict commented 5 years ago

Continuing from https://github.com/rcsb/mmtf-cpp/issues/28

The recently added extra fields might be a solution for any additional data you would like to store.

I was downloading data from the PDB database in the MMTF format, but it lacks the secondary structure information.

This also makes the "PDB archive size comparison" graph on https://mmtf.rcsb.org/ invalid since the PDB format has more information in it.

pwrose commented 5 years ago

The MMTF files do contain secondary structure information; however, it has been recalculated using the DSSP implementation in BioJava. This was done because the original secondary structure information was not assigned consistently: it was produced with different software and different versions, and even manually for older structures. To provide consistency, we have recalculated the secondary structure assignments (see: https://github.com/rcsb/mmtf/blob/master/spec.md#secstructlist).
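For readers who want to look at the `secStructList` field directly, here is a minimal sketch of decoding it in Python. It assumes the field has already been pulled out of the MessagePack container as raw bytes, that it uses the spec's codec type 2 (a 12-byte header followed by signed 8-bit integers), and that the code table below matches the spec section linked above; please check both against the spec version you target. The names `decode_int8_field`, `describe`, and `SEC_STRUCT_NAMES` are made up for this illustration and are not part of any MMTF library.

```python
import struct

# Code table as read from the secStructList section of the MMTF spec
# (DSSP-style classes). Treat it as an assumption and verify it.
SEC_STRUCT_NAMES = {
    0: "pi helix",
    1: "bend",
    2: "alpha helix",
    3: "extended",
    4: "3-10 helix",
    5: "bridge",
    6: "turn",
    7: "coil",
    -1: "undefined",
}


def decode_int8_field(raw):
    """Decode a binary field assumed to use codec type 2 (Int8 array).

    Layout assumed here: three big-endian int32 values
    (codec, element count, parameter) followed by the int8 payload.
    """
    codec, length, _param = struct.unpack(">iii", raw[:12])
    if codec != 2:
        raise ValueError("expected codec 2, got %d" % codec)
    return list(struct.unpack(">%db" % length, raw[12:12 + length]))


def describe(codes):
    """Map numeric secondary structure codes to readable names."""
    return [SEC_STRUCT_NAMES.get(code, "unknown") for code in codes]


# Hand-built sample field: header (codec 2, 4 elements, parameter 0)
# followed by four codes, one per residue (group).
sample = struct.pack(">iii", 2, 4, 0) + struct.pack(">4b", 2, 2, 7, -1)
codes = decode_int8_field(sample)
names = describe(codes)
```

In a real file the list has one entry per group (residue), so it can be walked in parallel with the group-level arrays.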

Formula information is not included since it can be easily calculated. The MMTF format is designed to avoid storing redundant information and contains only the most often used data; it is not a replacement for the other formats. On the other hand, MMTF contains additional information not found in PDB or PDBx/mmCIF files, e.g., bonds and bond orders for the entire structure, including ligands.
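As a rough illustration of the "easily calculated" point, a chemical formula can be derived by counting element symbols, such as those listed per group in an MMTF file. The sketch below assumes the element symbols have already been collected into a plain Python list and uses Hill ordering (carbon, then hydrogen, then the rest alphabetically); `hill_formula` is a hypothetical helper, not an MMTF API. Note that the result only reflects the atoms actually present, so hydrogens missing from the deposited model will be missing from the formula too.

```python
from collections import Counter


def hill_formula(elements):
    """Build a formula string in Hill order from a list of element symbols."""
    counts = Counter(elements)
    if "C" in counts:
        # Carbon first, hydrogen second, everything else alphabetically.
        order = ["C"]
        if "H" in counts:
            order.append("H")
        order += sorted(e for e in counts if e not in ("C", "H"))
    else:
        # Without carbon, all elements are listed alphabetically.
        order = sorted(counts)
    parts = []
    for element in order:
        n = counts[element]
        parts.append(element if n == 1 else "%s%d" % (element, n))
    return " ".join(parts)


# Element symbols for ethanol, written out atom by atom.
ethanol = ["C", "C", "O"] + ["H"] * 6
formula = hill_formula(ethanol)
```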

For more details on what data are included in MMTF files, please see: https://doi.org/10.1371/journal.pcbi.1005575


yurivict commented 5 years ago

@pwrose Thank you for the clarification.