Open jsoerensen opened 4 years ago
Looking at it, translating the revision records would be problematic.
For example, in 6LU7:
loop_
_pdbx_audit_revision_history.ordinal
_pdbx_audit_revision_history.data_content_type
_pdbx_audit_revision_history.major_revision
_pdbx_audit_revision_history.minor_revision
_pdbx_audit_revision_history.revision_date
1 'Structure model' 1 0 2020-02-05
2 'Structure model' 2 0 2020-02-12
3 'Structure model' 2 1 2020-02-19
4 'Structure model' 2 2 2020-02-26
5 'Structure model' 2 3 2020-03-11
#
loop_
_pdbx_audit_revision_details.ordinal
_pdbx_audit_revision_details.revision_ordinal
_pdbx_audit_revision_details.data_content_type
_pdbx_audit_revision_details.provider
_pdbx_audit_revision_details.type
_pdbx_audit_revision_details.description
_pdbx_audit_revision_details.details
1 1 'Structure model' repository 'Initial release' ? ?
2 2 'Structure model' author 'Coordinate replacement' 'Ligand geometry' ?
#
loop_
_pdbx_audit_revision_group.ordinal
_pdbx_audit_revision_group.revision_ordinal
_pdbx_audit_revision_group.data_content_type
_pdbx_audit_revision_group.group
1 2 'Structure model' Advisory
2 2 'Structure model' 'Atomic model'
3 2 'Structure model' 'Data collection'
4 2 'Structure model' 'Database references'
5 2 'Structure model' 'Derived calculations'
6 2 'Structure model' 'Refinement description'
7 2 'Structure model' 'Structure summary'
8 3 'Structure model' 'Database references'
9 3 'Structure model' 'Structure summary'
10 4 'Structure model' 'Data collection'
11 5 'Structure model' 'Source and taxonomy'
12 5 'Structure model' 'Structure summary'
#
loop_
_pdbx_audit_revision_category.ordinal
_pdbx_audit_revision_category.revision_ordinal
_pdbx_audit_revision_category.data_content_type
_pdbx_audit_revision_category.category
1 2 'Structure model' atom_site
2 2 'Structure model' citation
3 2 'Structure model' entity
4 2 'Structure model' pdbx_nonpoly_scheme
5 2 'Structure model' pdbx_struct_assembly_prop
6 2 'Structure model' pdbx_struct_sheet_hbond
7 2 'Structure model' pdbx_struct_special_symmetry
8 2 'Structure model' pdbx_validate_rmsd_bond
9 2 'Structure model' pdbx_validate_symm_contact
10 2 'Structure model' pdbx_validate_torsion
11 2 'Structure model' refine
12 2 'Structure model' refine_hist
13 2 'Structure model' refine_ls_shell
14 2 'Structure model' software
15 2 'Structure model' struct
16 2 'Structure model' struct_conn
17 2 'Structure model' struct_site
18 2 'Structure model' struct_site_gen
19 3 'Structure model' citation
20 3 'Structure model' struct
21 4 'Structure model' diffrn_detector
22 5 'Structure model' entity
23 5 'Structure model' entity_src_gen
24 5 'Structure model' struct
#
loop_
_pdbx_audit_revision_item.ordinal
_pdbx_audit_revision_item.revision_ordinal
_pdbx_audit_revision_item.data_content_type
_pdbx_audit_revision_item.item
1 2 'Structure model' '_citation.title'
2 2 'Structure model' '_entity.pdbx_number_of_molecules'
3 2 'Structure model' '_pdbx_struct_assembly_prop.value'
4 2 'Structure model' '_pdbx_struct_sheet_hbond.range_1_auth_comp_id'
...
would need to be translated to:
REVDAT 5 11-MAR-20 6LU7 1 COMPND SOURCE
REVDAT 4 26-FEB-20 6LU7 1 REMARK
REVDAT 3 19-FEB-20 6LU7 1 TITLE JRNL
REVDAT 2 12-FEB-20 6LU7 1 TITLE COMPND JRNL REMARK
REVDAT 2 2 1 SHEET LINK SITE ATOM
REVDAT 1 05-FEB-20 6LU7 0
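The fixed part of each REVDAT line (modification number, date, ID code, modification type) can be produced mechanically from `_pdbx_audit_revision_history`. A minimal sketch, assuming the column layout of the PDB format specification (modNum in columns 8-10, date in 14-22, ID code in 24-27, modType in 32); `pdb_date` and `revdat_line` are hypothetical helper names, not existing functions in to_pdb.hpp, and the trailing record-type columns are deliberately left out:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Convert an ISO date "YYYY-MM-DD" (as stored in mmCIF) to the PDB "DD-MMM-YY" form.
std::string pdb_date(const std::string& iso) {
  static const char* months[12] = {"JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                                   "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"};
  assert(iso.size() == 10 && iso[4] == '-' && iso[7] == '-');
  int month = std::stoi(iso.substr(5, 2));
  assert(month >= 1 && month <= 12);
  return iso.substr(8, 2) + "-" + months[month - 1] + "-" + iso.substr(2, 2);
}

// Format one REVDAT line without the record-type columns (40 and beyond).
// mod_type: 0 for the initial release, 1 for later revisions (as in the example above).
std::string revdat_line(int mod_num, const std::string& iso_date,
                        const std::string& id_code, int mod_type) {
  char buf[81];
  std::snprintf(buf, sizeof buf, "REVDAT %3d   %9s %4s    %d",
                mod_num, pdb_date(iso_date).c_str(), id_code.c_str(), mod_type);
  return buf;
}
```

For the first 6LU7 revision, `revdat_line(1, "2020-02-05", "6LU7", 0)` gives `REVDAT   1   05-FEB-20 6LU7    0`.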
Out of curiosity, what do you need it for?
I think the code I posted does something close to that, although I've left out the last column with the revision reasons; I could add that in. Mainly, we store the original date and the last revision number and date as metadata when we deposit these structures in a database. Since structures can be revised, it's important for us to know which revision we currently have.
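For that metadata use case, only the `_pdbx_audit_revision_history` rows are needed. A rough sketch, assuming the rows have already been read into a plain struct; `Revision`, `RevisionSummary` and `summarize` are hypothetical names for illustration, not part of any existing API:

```cpp
#include <string>
#include <vector>

// One row of _pdbx_audit_revision_history (only the fields used here).
struct Revision {
  int ordinal;
  int major;
  int minor;
  std::string date;  // ISO form, e.g. "2020-02-05"
};

struct RevisionSummary {
  std::string initial_date;  // date of the lowest ordinal
  int last_ordinal;          // highest ordinal present
  std::string last_date;     // date of the highest ordinal
};

// Fill `out` from the history rows; returns false if the history is empty.
// Rows are not assumed to be sorted by ordinal.
bool summarize(const std::vector<Revision>& history, RevisionSummary& out) {
  if (history.empty())
    return false;
  const Revision* first = &history[0];
  const Revision* last = &history[0];
  for (const Revision& r : history) {
    if (r.ordinal < first->ordinal) first = &r;
    if (r.ordinal > last->ordinal) last = &r;
  }
  out = {first->date, last->ordinal, last->date};
  return true;
}
```

With the five 6LU7 rows shown earlier, this yields an initial date of 2020-02-05 and a last revision of 5 dated 2020-03-11.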
I don't mind posting the above as a PR with the extra column for the revision reason added, if that helps.
I meant that the last columns of REVDAT would be difficult to generate. How would you do it? From category?
Ah that is a fair point - I'm not sure if the RCSB has a conversion table. I'll look.
If I may ask - why do you switch between mmCIF and PDB?
It's a fair question. We convert the mmCIF header to a PDB-style header for historical reasons. Switching our current codebase to parse each format natively would take significant work, and we only need the conversion for structures that exist only as mmCIF, with no corresponding PDB file.
Sadly, there doesn't seem to be a proper mapping between the PDB and MMCIF notations. http://mmcif.wwpdb.org/docs/pdb_to_pdbx_correspondences.html#REVDAT
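To illustrate the difficulty, here is a naive lookup from changed mmCIF categories to PDB record names. The table is only a guess inferred from the 6LU7 example above (it is not an official wwPDB mapping), and `guessed_record_table` and `guess_records` are hypothetical names:

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical category -> PDB record table, guessed from the 6LU7 example.
// It is NOT an official wwPDB mapping and covers only four categories.
const std::map<std::string, std::string>& guessed_record_table() {
  static const std::map<std::string, std::string> table = {
    {"struct", "TITLE"},
    {"citation", "JRNL"},
    {"entity", "COMPND"},
    {"entity_src_gen", "SOURCE"},
  };
  return table;
}

// Collect the guessed record names for the categories changed in one revision.
// Categories missing from the table are silently skipped.
std::set<std::string> guess_records(const std::vector<std::string>& categories) {
  std::set<std::string> records;
  for (const std::string& cat : categories) {
    auto it = guessed_record_table().find(cat);
    if (it != guessed_record_table().end())
      records.insert(it->second);
  }
  return records;
}
```

For revision 3 (citation, struct) the guess gives JRNL and TITLE, which agrees with the REVDAT line. For revision 5 (entity, entity_src_gen, struct) it gives COMPND, SOURCE and TITLE, while the actual record lists only COMPND and SOURCE. A category-level table alone is therefore not enough to reproduce the last columns.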
I'm inclined to leave REVDAT out, at least for now. Maybe you can find a workaround to store the last revision number and date in the database.
Given the limited mappings from the wwPDB, I agree with you. The code above does what I need, so I keep it on a fork. I'd be much happier prioritizing the DBREF data instead.
In the write_remarks function in to_pdb.hpp, could you add the REVDAT entry? I've pasted some code that should work below.
and in mmcif.hpp