I've got this working on tang. The tags are showing up, but the data in the
columns isn't. Is there something else I should be doing?
Original comment by schulte....@gmail.com
on 18 Nov 2009 at 3:29
I had to modify the original NMR-STAR files to avoid having to hack in all
kinds of exceptions in my code - have a look at the 1tut and 2otk original
files. There are PDB tags in there that can come straight from the PDBx files -
this might be an issue for Jurgen? Not sure how you generate the 'joint
restraint coordinate' NMR-STAR files.
Original comment by wfvran...@gmail.com
on 18 Nov 2009 at 4:29
OK.
Then we need Jurgen on this list. I'll take a look at the code.
Original comment by schulte....@gmail.com
on 18 Nov 2009 at 4:41
Ok, just echoing here for the sake of discussion what Chris mailed me.
"""
Basically, Jurgen, the list:
_Atom_site.Label_asym_ID
_Atom_site.Model_ID
_Atom_site.ID
_Atom_site.Label_entity_assembly_ID
_Atom_site.Label_entity_ID
_Atom_site.Label_comp_index_ID
_Atom_site.Label_comp_ID
_Atom_site.Label_atom_ID
_Atom_site.Type_symbol
_Atom_site.Cartn_x
_Atom_site.Cartn_y
_Atom_site.Cartn_z
_Atom_site.Occupancy
_Atom_site.Uncertainty
_Atom_site.PDB_ins_code
_Atom_site.Auth_asym_ID
_Atom_site.Auth_seq_ID
_Atom_site.Auth_comp_ID
_Atom_site.Auth_atom_ID
_Atom_site.Entry_ID
_Atom_site.Conformer_family_coord_set_ID
needs to be expanded to include other things for the pdbx files:
_Atom_site.Model_ID
_Atom_site.Model_site_ID
_Atom_site.ID
_Atom_site.Assembly_atom_ID
_Atom_site.Label_entity_assembly_ID
_Atom_site.Label_entity_ID
_Atom_site.Label_comp_index_ID
_Atom_site.Label_comp_ID
_Atom_site.Label_atom_ID
_Atom_site.Type_symbol
_Atom_site.Cartn_x
_Atom_site.Cartn_y
_Atom_site.Cartn_z
_Atom_site.Cartn_x_esd
_Atom_site.Cartn_y_esd
_Atom_site.Cartn_z_esd
_Atom_site.Occupancy
_Atom_site.Occupancy_esd
_Atom_site.Uncertainty
_Atom_site.Ordered_flag
_Atom_site.Footnote_ID
_Atom_site.PDBX_label_asym_ID
_Atom_site.PDBX_label_seq_ID
_Atom_site.PDBX_label_comp_ID
_Atom_site.PDBX_label_atom_ID
_Atom_site.PDBX_formal_charge
_Atom_site.PDBX_label_entity_ID
_Atom_site.PDB_record_ID
_Atom_site.PDB_model_num
_Atom_site.PDB_strand_id
_Atom_site.PDB_residue_no
_Atom_site.PDB_ins_code
_Atom_site.PDB_residue_name
_Atom_site.PDB_atom_name
_Atom_site.Auth_asym_ID
_Atom_site.Auth_chain_ID
_Atom_site.Auth_entity_assembly_ID
_Atom_site.Auth_seq_ID
_Atom_site.Auth_comp_ID
_Atom_site.Auth_atom_ID
_Atom_site.Auth_atom_name
_Atom_site.Details
_Atom_site.Entry_ID
_Atom_site.Conformer_family_coord_set_ID
"""
We remove: _Atom_site.Label_asym_ID and I will need to add the below and only
the below. Please confirm.
Also, please confirm we don't need tag changes in the restraints as the issue
title might suggest.
Sorry but I need detailed info like this in order to do it in one blow.
_Atom_site.Assembly_atom_ID
_Atom_site.Auth_atom_name
_Atom_site.Auth_chain_ID
_Atom_site.Auth_entity_assembly_ID
_Atom_site.Cartn_x_esd
_Atom_site.Cartn_y_esd
_Atom_site.Cartn_z_esd
_Atom_site.Details
_Atom_site.Footnote_ID
_Atom_site.Model_site_ID
_Atom_site.Occupancy_esd
_Atom_site.Ordered_flag
_Atom_site.PDBX_formal_charge
_Atom_site.PDBX_label_asym_ID
_Atom_site.PDBX_label_atom_ID
_Atom_site.PDBX_label_comp_ID
_Atom_site.PDBX_label_entity_ID
_Atom_site.PDBX_label_seq_ID
_Atom_site.PDB_atom_name
_Atom_site.PDB_model_num
_Atom_site.PDB_record_ID
_Atom_site.PDB_residue_name
_Atom_site.PDB_residue_no
_Atom_site.PDB_strand_id
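For anyone wanting to double-check the request above, the difference between the two lists in Chris's mail can be computed mechanically. This is only a minimal Python sketch: the names `OLD`, `NEW` and `diff_tags` are made up for illustration, the `_Atom_site.` prefix is dropped, and none of it is Wattos code.

```python
# Current tag list (first list in the quoted mail), without the "_Atom_site." prefix.
OLD = """Label_asym_ID Model_ID ID Label_entity_assembly_ID Label_entity_ID
Label_comp_index_ID Label_comp_ID Label_atom_ID Type_symbol Cartn_x Cartn_y
Cartn_z Occupancy Uncertainty PDB_ins_code Auth_asym_ID Auth_seq_ID
Auth_comp_ID Auth_atom_ID Entry_ID Conformer_family_coord_set_ID""".split()

# Expanded tag list (second list in the quoted mail).
NEW = """Model_ID Model_site_ID ID Assembly_atom_ID Label_entity_assembly_ID
Label_entity_ID Label_comp_index_ID Label_comp_ID Label_atom_ID Type_symbol
Cartn_x Cartn_y Cartn_z Cartn_x_esd Cartn_y_esd Cartn_z_esd Occupancy
Occupancy_esd Uncertainty Ordered_flag Footnote_ID PDBX_label_asym_ID
PDBX_label_seq_ID PDBX_label_comp_ID PDBX_label_atom_ID PDBX_formal_charge
PDBX_label_entity_ID PDB_record_ID PDB_model_num PDB_strand_id PDB_residue_no
PDB_ins_code PDB_residue_name PDB_atom_name Auth_asym_ID Auth_chain_ID
Auth_entity_assembly_ID Auth_seq_ID Auth_comp_ID Auth_atom_ID Auth_atom_name
Details Entry_ID Conformer_family_coord_set_ID""".split()


def diff_tags(old, new):
    """Return (removed, added) tag names, each sorted alphabetically."""
    removed = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    return removed, added


removed, added = diff_tags(OLD, NEW)
```

On these two lists, `removed` holds only Label_asym_ID and `added` holds 24 tags, which agrees with the list in this comment.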
Original comment by jurge...@gmail.com
on 19 Nov 2009 at 9:40
Eldon, please confirm so I can push this in.
Original comment by jurge...@gmail.com
on 16 Dec 2009 at 11:03
Jurgen,
I am sorry I completely missed this request. I started to look at it, but the
answer is more complicated than I expected. I will get back to you tonight or
over the weekend.
Eldon
Original comment by webmas...@bmrb.wisc.edu
on 17 Dec 2009 at 9:11
Jurgen,
The goal is to extract the information from the 'auth' tags and the other tags,
if they are populated in the pdbx files, and in the end associate this
information with the 'PDB' tags in NMR-STAR. I believe what is needed in the
first step is the mapping of the values from the pdbx tags to the corresponding
NMR-STAR tags as shown in Table I below. The final step would be to move the
data from the 'Auth' tags in NMR-STAR to the 'PDB' tags in NMR-STAR. Either
Wim's software would do this final step when the restraint nomenclature and PDB
nomenclature are made consistent, or your software would do it. The final
mapping of the information from the pdbx files to the NMR-STAR files is shown
in Table II below.
Chris and Wim's comments on this may be needed.
Eldon
Table I.
pdbx                              NMR-STAR
_atom_site.auth_asym_id           _Atom_site.Auth_asym_ID
_atom_site.auth_atom_id           _Atom_site.Auth_atom_ID
_atom_site.auth_comp_id           _Atom_site.Auth_comp_ID
_atom_site.auth_seq_id            _Atom_site.Auth_seq_ID
_atom_site.pdbx_auth_alt_id       _Atom_site.Auth_alt_ID
_atom_site.pdbx_auth_atom_name    _Atom_site.Auth_atom_name
_atom_site.pdbx_PDB_atom_name     _Atom_site.PDB_atom_name
_atom_site.pdbx_PDB_ins_code      _Atom_site.PDB_ins_code
_atom_site.pdbx_PDB_model_num     _Atom_site.PDB_model_num
_atom_site.pdbx_PDB_residue_name  _Atom_site.PDB_residue_name
_atom_site.pdbx_PDB_residue_no    _Atom_site.PDB_residue_no
_atom_site.pdbx_PDB_strand_id     _Atom_site.PDB_strand_ID
Table II.
pdbx                              NMR-STAR
_atom_site.auth_asym_id           _Atom_site.PDB_strand_ID
_atom_site.auth_atom_id           _Atom_site.PDB_atom_name
_atom_site.auth_comp_id           _Atom_site.PDB_residue_name
_atom_site.auth_seq_id            _Atom_site.PDB_residue_no
_atom_site.pdbx_auth_alt_id       _Atom_site.Auth_alt_ID
_atom_site.pdbx_auth_atom_name    _Atom_site.Auth_atom_name
_atom_site.pdbx_PDB_ins_code      _Atom_site.PDB_ins_code
_atom_site.pdbx_PDB_model_num     _Atom_site.PDB_model_num
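The two-step mapping described above (Table I first, then moving 'Auth' values to 'PDB' tags) could be sketched in Python roughly as follows. The names `TABLE_I`, `AUTH_TO_PDB` and `final_star_tag` are hypothetical, the `_atom_site.` / `_Atom_site.` prefixes are dropped, and the Auth-to-PDB pairs are read off by comparing the two tables, so treat this as an illustration rather than what either program actually does.

```python
# Step 1 (Table I): pdbx item -> NMR-STAR tag.
TABLE_I = {
    "auth_asym_id": "Auth_asym_ID",
    "auth_atom_id": "Auth_atom_ID",
    "auth_comp_id": "Auth_comp_ID",
    "auth_seq_id": "Auth_seq_ID",
    "pdbx_auth_alt_id": "Auth_alt_ID",
    "pdbx_auth_atom_name": "Auth_atom_name",
    "pdbx_PDB_atom_name": "PDB_atom_name",
    "pdbx_PDB_ins_code": "PDB_ins_code",
    "pdbx_PDB_model_num": "PDB_model_num",
    "pdbx_PDB_residue_name": "PDB_residue_name",
    "pdbx_PDB_residue_no": "PDB_residue_no",
    "pdbx_PDB_strand_id": "PDB_strand_ID",
}

# Step 2: NMR-STAR 'Auth' tag -> NMR-STAR 'PDB' tag (assumed from Table I vs. Table II).
AUTH_TO_PDB = {
    "Auth_asym_ID": "PDB_strand_ID",
    "Auth_atom_ID": "PDB_atom_name",
    "Auth_comp_ID": "PDB_residue_name",
    "Auth_seq_ID": "PDB_residue_no",
}


def final_star_tag(pdbx_item):
    """Apply step 1, then step 2 where a 'PDB' counterpart exists."""
    star_tag = TABLE_I[pdbx_item]
    return AUTH_TO_PDB.get(star_tag, star_tag)


# The eight pdbx items that appear in Table II.
TABLE_II_ITEMS = [
    "auth_asym_id", "auth_atom_id", "auth_comp_id", "auth_seq_id",
    "pdbx_auth_alt_id", "pdbx_auth_atom_name",
    "pdbx_PDB_ins_code", "pdbx_PDB_model_num",
]
table_ii = {item: final_star_tag(item) for item in TABLE_II_ITEMS}
```

Composing the two steps over those eight items reproduces the rows of Table II.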
Original comment by webmas...@bmrb.wisc.edu
on 17 Dec 2009 at 9:52
For clarification: my code currently does the (final) mapping in Table II - I
modified the original files to reflect this in the examples I generated.
I figured it should be easy enough to regenerate the input files (with PDBx
coordinates and original restraint info), plus that way I can avoid having to
put in hacks to deal with the Table I mappings.
Original comment by wfvran...@gmail.com
on 4 Jan 2010 at 10:57
A further note on comment 8: my code currently handles the NMR-STAR tags from
Table II correctly, so these should be in the 'joined' coordinate/restraint
file generated by Wattos!
Original comment by wfvran...@gmail.com
on 4 Jan 2010 at 12:49
Cool, I'll get on this.
Happy New Year everyone!
Original comment by jurge...@gmail.com
on 4 Jan 2010 at 12:56
Ok, I'm remapping the 5 tag names and adding the 2 new ones according to
(abbreviated after period):
pdbx                  NMR-STAR 3.1                    NMR-STAR 3.x
---------------------------------------------------------------------------------
                      Label_asym_ID
                      Model_ID                        PDB_model_num
                      ID
                      Label_entity_assembly_ID
                      Label_entity_ID
                      Label_comp_index_ID
                      Label_comp_ID
                      Label_atom_ID
                      Type_symbol
                      Cartn_x
                      Cartn_y
                      Cartn_z
                      Occupancy
                      Uncertainty
                      PDB_ins_code
                      Auth_asym_ID                    PDB_strand_ID
                      Auth_seq_ID                     PDB_residue_no
                      Auth_comp_ID                    PDB_residue_name
                      Auth_atom_ID                    PDB_atom_name
                      Entry_ID
                      Conformer_family_coord_set_ID
pdbx_auth_alt_id                                      Auth_alt_ID
pdbx_auth_atom_name                                   Auth_atom_name
However, I'm not finding the new tags (like pdbx_auth_alt_id) in the mmCIF
files I get from rsync.wwpdb.org::ftp/
Are these new tags to come?
Original comment by jurge...@gmail.com
on 4 Jan 2010 at 3:26
OK, got this in but the sorting is slightly off in Wattos output as it uses
Data/validict.20080404.1.str
Wim, can you check the attached output for 1brv? It doesn't contain the
restraints but they are left untouched.
Code checked in under Wattos revision 130. I don't know how to link between
these 2 projects though sorry.
Original comment by jurge...@gmail.com
on 4 Jan 2010 at 7:21
Model_ID is now missing from Conformer_family_coord_set... it should be there
as far as I'm aware.
Original comment by wfvran...@gmail.com
on 5 Jan 2010 at 11:10
I remapped it according to the table in my comment 11.
It's in the table II of comment 7.
I don't suppose we need to duplicate it do we?
Original comment by jurge...@gmail.com
on 5 Jan 2010 at 12:37
From what I can tell it's an obligatory value in the NMR-STAR, so yes. Maybe the
PDB_model_num is not necessary then, not sure.
Original comment by wfvran...@gmail.com
on 5 Jan 2010 at 12:46
Eldon, can you clarify this point please?
Original comment by jurge...@gmail.com
on 5 Jan 2010 at 12:48
The Model_ID value is located in the '_Atom_site' table.
'_Conformer_family_coord_set' describes the save frame for the 'family' and the
save frame contains the coordinates for all models as listed in the 'Atom_site'
table. There has never been a '_Conformer_family_coord_set.Model_ID' tag. If
there were such a tag at the save frame level then there would have to be a
'Conformer_family_coord_set' save frame for every reported model.
I have corrected the errors in the dictionary (hopefully) and put the new
version in svn.
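To illustrate the layout described above, where one save frame carries an Atom_site table with rows for every model and Model_ID is a per-row value, here is a small Python sketch. The rows and the `atoms_by_model` helper are invented for illustration and do not come from any real entry or from the existing software.

```python
from collections import defaultdict

# Hypothetical Atom_site rows from a single save frame; values are made up.
atom_site_rows = [
    {"Model_ID": 1, "ID": 1, "Label_atom_ID": "N"},
    {"Model_ID": 1, "ID": 2, "Label_atom_ID": "CA"},
    {"Model_ID": 2, "ID": 1, "Label_atom_ID": "N"},
    {"Model_ID": 2, "ID": 2, "Label_atom_ID": "CA"},
]


def atoms_by_model(rows):
    """Group Atom_site rows by their per-row Model_ID value."""
    models = defaultdict(list)
    for row in rows:
        models[row["Model_ID"]].append(row)
    return dict(models)


models = atoms_by_model(atom_site_rows)
```

All models are recovered from the one table, so no model number is needed at the save frame level.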
Original comment by Eldon.Ul...@gmail.com
on 5 Jan 2010 at 2:43
>The Model_ID value is located in the '_Atom_site' table.
Are you asking me to keep it in and add another column
_Atom_site.PDB_model_num with the same values?
That wouldn't make sense right? Except for issue 168 they're the same...
Original comment by jurge...@gmail.com
on 5 Jan 2010 at 3:08
Ah I did mean Atom_site in Comment 13... apologies for any confusion caused!
Jurgen's last point still stands...
Original comment by wfvran...@gmail.com
on 5 Jan 2010 at 3:31
Wim, do you need the _Atom_site.PDB_model_num tag to be populated? Do you
always use PDB model '1' when the atom nomenclature is made consistent? For
completeness, I would go with putting in the _Atom_site.PDB_model_num tag and
populating it with the redundant values. Overall, I feel this whole mess is
redundant.
Original comment by Eldon.Ul...@gmail.com
on 5 Jan 2010 at 4:49
No I don't need (or particularly want) PDB_model_num, I'm using Model_ID at the
moment. Some model number indication is necessary.
Original comment by wfvran...@gmail.com
on 5 Jan 2010 at 4:53
It is fine with me, then, to leave out the PDB_model_num.
Original comment by Eldon.Ul...@gmail.com
on 5 Jan 2010 at 5:33
Cool, I'll keep the original on that one.
Can somebody point me to an example of an mmCIF file with the
_atom_site.pdbx_auth_alt_id tag? Is it safe to leave it out for this iteration?
Original comment by jurge...@gmail.com
on 5 Jan 2010 at 6:46
I have never seen one. I think you can leave it out.
Original comment by Eldon.Ul...@gmail.com
on 5 Jan 2010 at 7:17
Attached is the new version. Code committed in Wattos revision 131.
Original comment by jurge...@gmail.com
on 6 Jan 2010 at 8:46
Two more issues to resolve:
1. It is now 'PDB_strand_ID', where it used to be 'PDB_strand_id'? The casing
makes a difference in my code... also did we want to follow the exact mmCIF
naming?
2. The tag '_Atom_site.Label_asym_ID' (in Jurgen's file) does not exist as far
as I can tell. Should this be '_Atom_site.PDBX_label_asym_ID' in the NMR-STAR?
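On point 1, one way a reader could become insensitive to casing is to resolve incoming tag names against the dictionary's spelling before using them. A minimal Python sketch, assuming a list of dictionary tags is available; `build_lookup` and `canonical_tag` are hypothetical helpers, not part of any existing code:

```python
def build_lookup(dictionary_tags):
    """Map lower-cased tag names to the dictionary's own spelling."""
    return {tag.lower(): tag for tag in dictionary_tags}


def canonical_tag(tag, lookup):
    """Return the dictionary spelling of a tag, or None if it is unknown."""
    return lookup.get(tag.lower())


# Two tags taken from this thread, used here as a stand-in dictionary.
lookup = build_lookup([
    "_Atom_site.PDB_strand_ID",
    "_Atom_site.PDBX_label_asym_ID",
])
```

With this lookup, `_Atom_site.PDB_strand_id` resolves to `_Atom_site.PDB_strand_ID`, while `_Atom_site.Label_asym_ID` comes back as None, which would also flag point 2.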
Original comment by wfvran...@gmail.com
on 6 Jan 2010 at 9:10
Eldon, please resolve Wim's issues. I'm happy to change them any way you like.
Thanks
Original comment by jurge...@gmail.com
on 6 Jan 2010 at 9:21
Issues raised by Wim:
1. In NMR-STAR, the convention has been to capitalize 'ID' and I would like to
stick to this. I do not feel there is a need to follow exact mmCIF naming. If
their conventions were followed, some tags would become
'Atom_site.NMR-STAR_PDBX_PDB...'. I think all of this redundant nomenclature
should be captured in a separate table and not in the Atom_site table, but that
is another discussion.
2. Yes, the '_Atom_site.Label_asym_ID' tag should be changed to
'_Atom_site.PDBX_label_asym_ID'. Sorry, I missed this one in the past.
Original comment by Eldon.Ul...@gmail.com
on 6 Jan 2010 at 4:00
_Atom_site.PDBX_label_asym_ID or:
_Atom_site.PDB_label_asym_ID without the X? It's the only one with an X now.
Original comment by jurge...@gmail.com
on 6 Jan 2010 at 4:11
I verified against:
http://www.bmrb.wisc.edu/dictionary/3.1html_frame/frame_AtomSite.html#_Atom_site.PDBX_label_asym_ID
and it seems to be with X
Committed in Wattos revision 132.
Original comment by jurge...@gmail.com
on 6 Jan 2010 at 4:23
Did I really mess up somewhere? Below are the tags used to record data from pdbx
files and from PDB files. The asym_ID is a PDBX construct and does not exist in
the PDB set. I think way back when, 'asym_ID' was useful in mapping to
'entity_assembly_ID'. The 'asym_ID' tag also maps to the
'_Entity_assembly.Asym_ID' tag and so is probably useful to keep. The other
'PDBX' tags are equivalent to the NMR-STAR tags and I do not feel they need to
be included. I could be wrong, but I think John Westbrook's request pertained
to the 'PDB' tags, where 'PDB_strand_ID' is usually, but I am not sure always,
equivalent to the 'asym_ID' value.
_Atom_site.PDBX_label_asym_ID
_Atom_site.PDBX_label_seq_ID
_Atom_site.PDBX_label_comp_ID
_Atom_site.PDBX_label_atom_ID
_Atom_site.PDBX_formal_charge
_Atom_site.PDBX_label_entity_ID
_Atom_site.PDB_record_ID
_Atom_site.PDB_model_num
_Atom_site.PDB_strand_ID
_Atom_site.PDB_ins_code
_Atom_site.PDB_residue_no
_Atom_site.PDB_residue_name
_Atom_site.PDB_atom_name
Original comment by Eldon.Ul...@gmail.com
on 6 Jan 2010 at 4:28
If all here are happy with the current file I would like to close the issue.
Original comment by jurge...@gmail.com
on 6 Jan 2010 at 6:38
Is it ready to be tested?
Original comment by schulte....@gmail.com
on 6 Jan 2010 at 7:07
As far as Wattos is concerned I believe so. The code is in the last revision.
Original comment by jurge...@gmail.com
on 7 Jan 2010 at 8:41
And Wattos revision 133 now makes use of the latest dictionary for sorting as
well.
Wim, what do we need to update in FC to test?
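On the sorting mentioned above: ordering columns by their position in the dictionary can be sketched like this in Python. It is only an illustration of the idea; `sort_by_dictionary` is a made-up name and this is not how Wattos itself implements it.

```python
def sort_by_dictionary(tags, dictionary_order):
    """Order tags by their position in the dictionary; unknown tags go last."""
    rank = {tag: index for index, tag in enumerate(dictionary_order)}
    unknown = len(rank)
    # sorted() is stable, so unknown tags keep their original relative order.
    return sorted(tags, key=lambda tag: rank.get(tag, unknown))


# Hypothetical dictionary order and an unsorted set of columns.
dictionary_order = ["Model_ID", "ID", "Cartn_x", "Cartn_y", "Cartn_z"]
columns = ["Cartn_z", "ID", "Cartn_x", "Model_ID", "Cartn_y"]
sorted_columns = sort_by_dictionary(columns, dictionary_order)
```

An outdated dictionary would simply push any tag it does not know to the end, which may be the kind of "slightly off" ordering seen earlier.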
Original comment by jurge...@gmail.com
on 7 Jan 2010 at 8:49
There are likely to be some problems - I was using the auth_ tags for mapping,
and since they're now PDB_ I have to make sure my code still works correctly.
One question to Eldon: are the auth_ tags still necessary? If not (and they can
go), then it's easy enough to update my stuff, otherwise it'll take a while.
Anyway the following steps are necessary (1-2 can happen regardless of what I
still have to do):
1. Remake all joined coordinate/restraint NMR-STAR files
2. Let me know where I can find them when done
3. I'll run some tests and update my code to make sure it's all working
4. I'll give you the update procedure when it's ready
Original comment by wfvran...@gmail.com
on 7 Jan 2010 at 10:50
Since Chris has the up to date versions it might be best to do 1 and 2 in
Madison. Chris can you 'volunteer'?
Original comment by jurge...@gmail.com
on 7 Jan 2010 at 10:53
The 'Auth' tags would not be populated now in '_Atom_site'. Just to be clear,
they are still needed and would be populated in the restraints tables, as I
think they actually represent the original author nomenclature as extracted
from the restraints files.
Original comment by Eldon.Ul...@gmail.com
on 7 Jan 2010 at 2:19
Yes. I'll start testing on tang. Is Wattos up to date there, or do I need to do
an svn update?
Original comment by schulte....@gmail.com
on 7 Jan 2010 at 2:21
OK Eldon, in that case I'll wait for the new files and make sure everything
still works.
Original comment by wfvran...@gmail.com
on 7 Jan 2010 at 2:29
Let me know when to update and get going.
Original comment by schulte....@gmail.com
on 7 Jan 2010 at 2:47
Chris can you 'volunteer'?
Everything seems to need updating.
Original comment by jurge...@gmail.com
on 21 Jan 2010 at 2:40
ccpn and recoord should be recent, but I can do another update if need be.
I will update wattos and the NRG on tang today and test it all.
Original comment by schulte....@gmail.com
on 21 Jan 2010 at 3:37
I've updated everything on tang and these are the latest versions of the test
files we've been using.
Note: I had to comment out all chainMapping directives for 1brv in the
presetDict in order to get it to convert.
Please confirm that these are what we want. I'm going to test some more on tang.
Original comment by schulte....@gmail.com
on 21 Jan 2010 at 7:42
I am not able to upload 3 files at once, so here they are, separately.
Original comment by schulte....@gmail.com
on 21 Jan 2010 at 7:44
Last one for now. This was too big to upload uncompressed.
Original comment by schulte....@gmail.com
on 21 Jan 2010 at 7:48
These are the wrong files - I need the ones prior to the linking process (so
what comes out of wattos, see comment 37).
Original comment by wfvran...@gmail.com
on 26 Jan 2010 at 10:58
Sorry about that. All joined coordinates/restraints files for the new update
are located in
/big/docr/ccpn_tmp/data/archives/bmrb/nmrRestrGrid
The number of files I was able to produce was limited by the fact that /big has
run out of memory, but anything after Jan 21 was done with the update.
You will have to ssh onto lionfish@bmrb.wisc.edu, and then to tang.
If you need other files now, I can send them. Let me know if we need to get
Dimitiri to fix anything to let you log on.
Original comment by schulte....@gmail.com
on 26 Jan 2010 at 2:40
OK I have now tested and checked in the code, it is giving me the same results
as before.
Update:
CCPN SourceForge repository, python/ directory.
Note that I have included a new bit of format/CCPN sequence alignment code in
case the original code cannot find a mapping - this should only affect new
entries.
There might be issues with format chain codes that are labelled '_' for mapping
purposes - this was a mistake that crept in a while ago. I don't think it was a
problem for you, but just so you know.
Original comment by wfvran...@gmail.com
on 28 Jan 2010 at 11:22
Original issue reported on code.google.com by
wfvran...@gmail.com
on 16 Nov 2009 at 3:58