google-code-export / nmrrestrntsgrid

Automatically exported from code.google.com/p/nmrrestrntsgrid

Update on chemComp handling #219

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
A message to let you know that I've updated the chemComp information in the
CcpForge repository (ccpn-chemcomp directory). They should now be
up-to-date until the end of July.

I won't have time to implement the automatic weekly update, but it's
getting closer. I'll do this in September.

I've also implemented a change so that missing chemComps in a
protein/DNA/RNA sequence are replaced by a Xxx residue, so the sequence
numbering does not get messed up. For this, update the following SF
repository directories:

data/ccp/
python/ccpnmr/format/
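
To illustrate why the placeholder matters, here is a minimal Python sketch. It is not the FormatConverter code: `KNOWN_CHEM_COMPS`, `drop_unknown` and `substitute_unknown` are hypothetical names, and the set of known codes is made up for the example.

```python
# Hypothetical illustration only; none of these names come from the CCPN code base.
KNOWN_CHEM_COMPS = {"GLU", "THR", "SER", "VAL"}  # assumed: codes that have a chemComp
PLACEHOLDER = "Xxx"


def drop_unknown(seq):
    """Old behaviour: unknown residues vanish, so later residues shift down by one."""
    kept = [code for code in seq if code in KNOWN_CHEM_COMPS]
    return list(enumerate(kept, start=1))


def substitute_unknown(seq):
    """New behaviour: unknown residues become a placeholder, numbering is preserved."""
    return [
        (number, code if code in KNOWN_CHEM_COMPS else PLACEHOLDER)
        for number, code in enumerate(seq, start=1)
    ]


seq = ["GLU", "THR", "TPO", "SER", "VAL"]
print(drop_unknown(seq))        # SER ends up as residue 3 instead of 4
print(substitute_unknown(seq))  # SER stays residue 4, residue 3 is Xxx
```

With the placeholder in place, anything that refers to residues by number (restraints, for instance) still points at the intended residue.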

Original issue reported on code.google.com by wfvran...@gmail.com on 30 Jul 2009 at 10:42

GoogleCodeExporter commented 9 years ago
JFD adding some correspondence on this issue from Chris/Wim

if you look at the _Entity_comp_index loop here:

http://restraintsgrid.bmrb.wisc.edu/NRG/MRGridServlet?db_username=wattos1&mrblock_id=180834&pdb_id=2kb3&request_type=block

and compare it with the seqres, you will see that residue 15 was
thrown out. I tried to skip over the residue in the presets, but that
didn't work (see below). This mismatch makes the maximum violation go
to nearly 8.5.

chris
A   1  MET    2  SER    3  ASP    4  ASN    5  ASN    6  GLY    7  THR    8  PRO    9  GLU   10  PRO
A  11  GLN   12  VAL   13  GLU   14  THR   15  TPO   16  SER   17  VAL   18  PHE   19  ARG   20  ALA
A  21  ASP   22  LEU   23  LEU   24  LYS   25  GLU   26  MET   27  GLU   28  SER   29  SER   30  THR
A  31  GLY   32  THR   33  ALA   34  PRO   35  ALA   36  SER   37  THR   38  GLY   39  ALA   40  GLU
A  41  ASN   42  LEU   43  PRO   44  ALA   45  GLY   46  SER   47  ALA   48  LEU   49  LEU   50  VAL
A  51  VAL   52  LYS   53  ARG   54  GLY   55  PRO   56  ASN   57  ALA   58  GLY   59  ALA   60  ARG
A  61  PHE   62  LEU   63  LEU   64  ASP   65  GLN   66  PRO   67  THR   68  THR   69  THR   70  ALA
A  71  GLY   72  ARG   73  HIS   74  PRO   75  GLU   76  SER   77  ASP   78  ILE   79  PHE   80  LEU
A  81  ASP   82  ASP   83  VAL   84  THR   85  VAL   86  SER   87  ARG   88  ARG   89  HIS   90  ALA
A  91  GLU   92  PHE   93  ARG   94  ILE   95  ASN   96  GLU   97  GLY   98  GLU   99  PHE  100  GLU
A 101  VAL  102  VAL  103  ASP  104  VAL  105  GLY  106  SER  107  LEU  108  ASN  109  GLY  110  THR
A 111  TYR  112  VAL  113  ASN  114  ARG  115  GLU  116  PRO  117  ARG  118  ASN  119  ALA  120  GLN
A 121  VAL  122  MET  123  GLN  124  THR  125  GLY  126  ASP  127  GLU  128  ILE  129  GLN  130  ILE
A 131  GLY  132  LYS  133  PHE  134  ARG  135  LEU  136  VAL  137  PHE  138  LEU  139  ALA  140  GLY
A 141  PRO  142  ALA  143  GLU

PRESET

'2kb3': {
    'authors': ['Chris Schulte'],
    'comment': """
    Final mapping: [['A', 'A', 3, 1]]
    """,
    'linkResonances': {
        'keywds': {
            # 'forceChainMappings': [['A', 'A', 1, 1], ['A', 'A', 16, 2]],
            # did not work - made it worse
            'forceChainMappings': [['A', 'A', 1, 1]],
            'addNameMappings': {
                'ALL': [["HN", "H"]],
            },
        },
    },
},

On Wed, Jul 29, 2009 at 7:58 AM, Wim Vranken<wim@ebi.ac.uk> wrote:
> OK. I've already updated the chemComp repository with what should be all the
> new ones since the March update - let me know if that works for you!
>
> The weekly automatic update won't be before September though, 

Original comment by jurge...@gmail.com on 4 Aug 2009 at 2:42

GoogleCodeExporter commented 9 years ago

Original comment by jurge...@gmail.com on 4 Aug 2009 at 2:46

GoogleCodeExporter commented 9 years ago
Checking the new code base with entry 2kfu from issue 215

I had to add this in DataFormat.py:
    molType = None  # JFD adds bogus def for 2kfu; it wasn't defined on line 6005

to prevent it from crashing, but the issue remains.
The sequence was still missing the uncommon residue.

Wim, could you check the code with this entry please?

Original comment by jurge...@gmail.com on 4 Aug 2009 at 3:57

GoogleCodeExporter commented 9 years ago
Issue 215 has been merged into this issue.

Original comment by jurge...@gmail.com on 4 Aug 2009 at 3:59

GoogleCodeExporter commented 9 years ago
OK, remove that change, Jurgen - I've fixed it correctly (I hope!). The change you made will definitely break the fix for inserting the Xxx chemComp.

So check out the new version of DataFormat.py!

Original comment by wfvran...@gmail.com on 5 Aug 2009 at 8:41

GoogleCodeExporter commented 9 years ago
I might have messed up the linking between this issue 219 and issue 215.

Issue 215, originally on entry 2kb3, is still a problem. Residue 15 (TPO) is thrown out by FC with the attached input. I did just update the DataFormat.py file for sf.net.

Original comment by jurge...@gmail.com on 5 Aug 2009 at 4:40

Attachments:

GoogleCodeExporter commented 9 years ago
This works for me on my laptop; not sure what the issue is. Can you send me the .log files?

Original comment by wfvran...@gmail.com on 6 Aug 2009 at 11:11

GoogleCodeExporter commented 9 years ago
Here is the linkNmrStarData.log. Let me know if you need something else.

Original comment by schulte....@gmail.com on 6 Aug 2009 at 1:50

Attachments:

GoogleCodeExporter commented 9 years ago
Sigh. I don't know. For me it says:

  Selecting naming system PDB_REMED.
  Molecule set to type protein from sequence file information.
  ChemComp file protein+Tpo+msd_ccpnRef_2007-12-11-10-20-25_00002.xml copied to /Users/wim/workspace/stable/all/data/recoord/2kb3/linkNmrStarData/ccp/molecule/ChemComp...

Is there no such protein+Tpo+.. file on your disk anywhere?

Original comment by wfvran...@gmail.com on 6 Aug 2009 at 7:41

GoogleCodeExporter commented 9 years ago
Yes, there is. That's what's so frustrating.

/big/docr/ccpn-chemcomp/data/pdbe/chemComp/archive/ChemComp/protein/protein+Tpo+msd_ccpnRef_2007-12-11-10-20-25_00002.xml

Original comment by schulte....@gmail.com on 6 Aug 2009 at 7:54

GoogleCodeExporter commented 9 years ago
I have it:

cd '/Users/jd/ccpn-chemcomp/data/pdbe/chemComp/archive/ChemComp/protein/'
l protein+Tpo*
-rw-r--r--  1 jd  jd  40160 Mar 12 10:58 protein+Tpo+msd_ccpnRef_2007-12-11-10-20-25_00002.xml

but the merged star file still skips it:
        14 . THR . rr_2kb3 1
        15 . SER . rr_2kb3 1
        16 . VAL . rr_2kb3 1
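
A quick way to spot this kind of skipped residue is to align the seqres against the merged sequence. The following is only a sketch using Python's standard `difflib`; the function name `dropped_residues` and the short hand-typed sequence excerpts are assumptions for illustration, not part of FC or the NRG scripts.

```python
from difflib import SequenceMatcher


def dropped_residues(seqres, merged, first=1):
    """Return (seqres_number, code) for seqres residues with no counterpart in merged."""
    matcher = SequenceMatcher(a=seqres, b=merged, autojunk=False)
    missing = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # 'delete' and 'replace' both mean seqres[i1:i2] was not matched in merged.
        if tag in ("delete", "replace"):
            missing.extend((first + i, seqres[i]) for i in range(i1, i2))
    return missing


# Residues 11-18 of 2kb3 from the seqres, and the same stretch as it appears after merging.
seqres = ["GLN", "VAL", "GLU", "THR", "TPO", "SER", "VAL", "PHE"]
merged = ["GLN", "VAL", "GLU", "THR", "SER", "VAL", "PHE"]
print(dropped_residues(seqres, merged, first=11))  # [(15, 'TPO')]
```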

Original comment by jurge...@gmail.com on 6 Aug 2009 at 7:55

GoogleCodeExporter commented 9 years ago
BTW: There is a symbolic link to
/big/docr/ccpn-chemcomp
in
/big/docr/workspace/ccpn

Original comment by schulte....@gmail.com on 6 Aug 2009 at 7:55

GoogleCodeExporter commented 9 years ago
Right and in msd/adatah/localConstants.py (or Constants.py) you're setting the
chemCompArchiveDataDir?

Original comment by wfvran...@gmail.com on 6 Aug 2009 at 8:08

GoogleCodeExporter commented 9 years ago
I never touched anything in ccpn or recoord on tang.

There doesn't seem to be any reference to chem-comp or chem_comp in
/big/docr/workspace/ccpn/msd/adatah/localConstants.py

Where is it set otherwise? Sometimes chem-comps are there. Sometimes they're not.

Original comment by schulte....@gmail.com on 6 Aug 2009 at 8:18

GoogleCodeExporter commented 9 years ago
Try the following in /big/docr/workspace/ccpn/msd/adatah/localConstants.py

chemCompArchiveDataDir = '/big/docr/ccpn-chemcomp/data/pdbe/chemComp/archive/ChemComp'

see if it works.

Alternatively make a link from /big/docr/ccpn/data/pdbe to
/big/docr/ccpn-chemcomp/data/pdbe, see what that gives.
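
To check independently of FC whether the archive directory actually resolves to the Tpo file, something like the sketch below could be run on the server. It assumes only the layout seen in this thread (files named `<molType>+<ccpCode>+*.xml` under `ChemComp/<molType>/`); the helper name `find_chem_comp_files` is hypothetical.

```python
import glob
import os


def find_chem_comp_files(archive_dir, mol_type, ccp_code):
    """List chemComp XML files named <molType>+<ccpCode>+*.xml under archive_dir/<molType>."""
    real_dir = os.path.realpath(archive_dir)  # resolve symbolic links first
    pattern = os.path.join(
        glob.escape(real_dir), mol_type, f"{mol_type}+{ccp_code}+*.xml"
    )
    return sorted(glob.glob(pattern))


# Path taken from the suggestion above; an empty list means the file is not reachable.
archive = "/big/docr/ccpn-chemcomp/data/pdbe/chemComp/archive/ChemComp"
print(find_chem_comp_files(archive, "protein", "Tpo"))
```

An empty result would point at the directory setup (path or symbolic link) rather than at the conversion code.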

Original comment by wfvran...@gmail.com on 6 Aug 2009 at 8:53

GoogleCodeExporter commented 9 years ago
I created a localConstants.py in 
/big/docr/workspace/ccpn/python/msd/adatah/
and added the line.

ERROR 2kb3 in /big/docr/workspace/recoord/python/recoord2/msd/linkNmrStarData.py
please check the log: /big/docr/NRG/link/2kb3/2kb3_merge.log (it hasn't been copied to: /big/docr/ccpn_tmp/data/recoord/2kb3/linkNmrStarData.summary yet because aborting entry at this point.

Original comment by schulte....@gmail.com on 6 Aug 2009 at 10:25

Attachments:

GoogleCodeExporter commented 9 years ago
So does the directory

/big/docr/workspace/ccpn/data/recoord/

exist or not? These problems are mostly because we're all using a different setup to connect the repositories... different behaviour!

Original comment by katekeme...@gmail.com on 7 Aug 2009 at 9:04

GoogleCodeExporter commented 9 years ago
And that was me, the last comment, of course...

Original comment by wfvran...@gmail.com on 7 Aug 2009 at 9:04

GoogleCodeExporter commented 9 years ago

> So does the directory
> /big/docr/workspace/ccpn/data/recoord/
> exist or not?

no. We have this
/big/docr/workspace/ccpn/data/
and these are the directories in it.  
ccp  ccpnmr  CVS

This is the same on tang (old server) and grunt (new)

Original comment by schulte....@gmail.com on 7 Aug 2009 at 1:21

GoogleCodeExporter commented 9 years ago
Can we close this issue?

Original comment by jurge...@gmail.com on 23 Sep 2009 at 8:57

GoogleCodeExporter commented 9 years ago
Probably, let me check a few things first.

Original comment by schulte....@gmail.com on 23 Sep 2009 at 1:25

GoogleCodeExporter commented 9 years ago
As a follow-up, there is now an option to 'autocreate' chemComps when they're found in the coordinates and they don't exist (yet).

This option is now turned on in linkNmrStarData.py; you have to update:

python/recoord2/msd/linkNmrStarData.py
python/ccpnmr/format/converters/NmrStarFormat.py

The option itself is tested on PDB files, but I haven't tried it for this particular case (no test project).

Original comment by katekeme...@gmail.com on 16 Nov 2009 at 1:13

GoogleCodeExporter commented 9 years ago
And again, this last comment is by me.

Original comment by wfvran...@gmail.com on 16 Nov 2009 at 1:56

GoogleCodeExporter commented 9 years ago
Chris, can you update the code and try again?
Currently, the problem is still showing.

Original comment by jurge...@gmail.com on 19 Nov 2009 at 9:16

GoogleCodeExporter commented 9 years ago
I have the new updates loaded on tang and it has "autoCreateChemComps = True" in linkNmrStarData.py. However, that update also produces the newly formatted structures and constraints saveframes.

I'm going to hold off on implementing this on grunt for a few days until there is a little more clarity on what all we need to update.

Original comment by schulte....@gmail.com on 19 Nov 2009 at 3:42

GoogleCodeExporter commented 9 years ago
Yes, the changes are all in there. For the chemComp stuff, you only need to update the

ccp.format.general.formatIO
ccpnmr.format.converters.DataFormat

files. That should work. Anyway, make sure you back up the old files before doing the update of these two files.

Original comment by wfvran...@gmail.com on 20 Nov 2009 at 9:54

GoogleCodeExporter commented 9 years ago
I updated 
ccp.format.general.formatIO
ccpnmr.format.converters.DataFormat
Then ran makePython.py

Then I updated linkNmrStarData.py and reprocessed 2kb3. There was no difference in the conversion rate.

Original comment by schulte....@gmail.com on 20 Nov 2009 at 3:22

GoogleCodeExporter commented 9 years ago
Well, this one is still a mystery to me. First of all, Tpo should be picked up. It's not, and we still haven't figured out why. If it didn't exist, then it should now work anyway - as long as this 'autoCreateChemComps = True' is activated when reading the NMR-STAR file in linkNmrStarData.py:

self.readNmrStarFile(inNmrStarFile, version = self.originalNmrStarVersion,
                     maxNum = self.numModelsToRead, autoCreateChemComps = True)

Original comment by wfvran...@gmail.com on 23 Nov 2009 at 1:26

GoogleCodeExporter commented 9 years ago
Sounds like this can be tried on tang as a development issue.
Chris, I'm lowering the priority so you can pick your battle.
Thanks Wim!

Original comment by jurge...@gmail.com on 23 Nov 2009 at 2:36

GoogleCodeExporter commented 9 years ago
I updated python/ccpnmr/format/converters/NmrStarFormat.py and reprocessed 2kb3. It now crashes.

Triplet matches
Restraint    SEQRS Offset
   A 120 GLN .   .   .
Start guessing.
ERROR 2kb3 in /raid/docr/workspace/recoord/python/recoord2/msd/linkNmrStarData.py
please check the log: /raid/docr/NRG/link/2kb3/2kb3_merge.log (it hasn't been copied to: /raid/docr/ccpn_tmp/data/recoord/2kb3/linkNmrStarData.summary yet because aborting entry at this point.
Finished

Original comment by schulte....@gmail.com on 23 Nov 2009 at 3:50

GoogleCodeExporter commented 9 years ago
ccp.format.general.formatIO, ccpnmr.format.converters.DataFormat, and linkNmrStarData.py have already been updated.

The log file is attached.

Original comment by schulte....@gmail.com on 23 Nov 2009 at 5:03

Attachments:

GoogleCodeExporter commented 9 years ago
When I run
python linkNmrStarData.py -raise -force -noGui 2kb3 > & temp
on the command line, I get a different error stack (see attached). I am trying to place debug print statements to figure out what is going on, but they aren't showing up.

Original comment by schulte....@gmail.com on 23 Nov 2009 at 5:07

Attachments:

GoogleCodeExporter commented 9 years ago
Nothing is processing anymore and I'm having trouble tracing past startup.

Original comment by schulte....@gmail.com on 23 Nov 2009 at 6:04

GoogleCodeExporter commented 9 years ago
Is this on tang or is grunt messed up?

Original comment by jurge...@gmail.com on 23 Nov 2009 at 6:57

GoogleCodeExporter commented 9 years ago
both

Original comment by schulte....@gmail.com on 23 Nov 2009 at 6:59

GoogleCodeExporter commented 9 years ago
Can you try reverting to a previous state (revision) on grunt so production is ok again?
I'll look tomorrow on tang to see if I can make sense of the FC log, ok?

Original comment by jurge...@gmail.com on 23 Nov 2009 at 7:03

GoogleCodeExporter commented 9 years ago
That's what I was planning on if I couldn't trace the problem. It seemed to only happen after I updated NmrStarFormat.py.

Original comment by schulte....@gmail.com on 23 Nov 2009 at 7:09

GoogleCodeExporter commented 9 years ago
Set NmrStarFormat.py back to -r 1.79 and processing works again on grunt.

Original comment by schulte....@gmail.com on 23 Nov 2009 at 8:07

GoogleCodeExporter commented 9 years ago
The error in the temp file is because you forgot the PDB ID before the -raise when running linkNmrStarData.py

The other error occurs because it is actually trying to create a new chemComp and fails. I'll look into it.

Original comment by wfvran...@gmail.com on 24 Nov 2009 at 8:10

GoogleCodeExporter commented 9 years ago
OK, done. Again though, I had to change the 2kb3 file to be able to test this - I do automatically get the Tpo file. That should really be working.

Anyway for this to work you'll have to update:

ccpnmr/format/converters/DataFormat.py
ccpnmr/format/process/linkResonances.py
msd/nmrStar/IO/Ccpn_To_NmrStar.py

Original comment by wfvran...@gmail.com on 24 Nov 2009 at 11:12

GoogleCodeExporter commented 9 years ago
This works on tang now. The new saveframe, CNS/XPLOR_distance_constraints_2, has the PDB tags (e.g. PDB_strand_id_1, PDB_residue_no_1, etc.) whereas grunt does not. Should we wait until this is finalized before we update grunt?

Original comment by schulte....@gmail.com on 24 Nov 2009 at 7:04

GoogleCodeExporter commented 9 years ago
So it makes it all the way through, including the remediated and FRED?
If so, go ahead.
Cheers!

Original comment by jurge...@gmail.com on 24 Nov 2009 at 7:11

GoogleCodeExporter commented 9 years ago
I had asked a week or more ago that Grunt not be updated until everything was finalized. This is the point of having a production system. Grunt was working fine.
Please do not update grunt until Wim has finalized his code and the tags are all set.

Thanks,
Eldon

Original comment by webmas...@bmrb.wisc.edu on 24 Nov 2009 at 8:53

GoogleCodeExporter commented 9 years ago
This has been fixed for a while. I'm going to close it now.

Original comment by schulte....@gmail.com on 28 Apr 2010 at 8:06