OK the code is downloadable from:
http://www.ebi.ac.uk/~wim/vascoCing.tgz
You also need to check out python/pdbe/software from the SF CVS.
Couple of notes:
- You need to change some variables in the top part of the
VascoCingReferenceCheck class...
- cingDataDir: the directory path where the CING projects live
- dsspDataDirName: the directory (relative to cingDataDir/pdbCode) where the DSSP data lives
- whatIfDataDirName: the directory (relative to cingDataDir/pdbCode) where the WHATIF data lives
- check the setupDirectories method, this is where the paths to the data
relative to the pdbcode are set.
- I don't save the CCPN project at the end of the script, you'll have to build
this in or do it later as part of the pipeline
- The corrections are stored as application data with the shift list(s), see
the tagProject method. Note that I'm just saving the H, N and aliphatic C
corrections - the corrections for carbons at high ppm are not so dependable.
So you run it as:
python vascoCingRefCheck.py <pdbcode>
Original comment by wfvran...@gmail.com
on 1 Mar 2011 at 9:36
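The path settings described in the comment above can be pictured roughly as follows. This is only a sketch, not the actual VascoCingReferenceCheck code: the helper name `data_dirs_for_entry` and the example directory values are assumptions, and only the three variable names come from the comment.

```python
import os

# Placeholder values; set these for your own installation.
cingDataDir = "/path/to/cing/projects"  # where the CING projects live
dsspDataDirName = "Dssp"                # relative to cingDataDir/<pdbCode> (assumed name)
whatIfDataDirName = "Whatif"            # relative to cingDataDir/<pdbCode> (assumed name)


def data_dirs_for_entry(pdbCode, baseDir=cingDataDir):
    """Return (dsspDir, whatIfDir) for one entry. Hypothetical helper."""
    entryDir = os.path.join(baseDir, pdbCode)
    return (os.path.join(entryDir, dsspDataDirName),
            os.path.join(entryDir, whatIfDataDirName))
```

For example, `data_dirs_for_entry("1brv")` would give the two per-entry directories that a method like setupDirectories has to point at.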
Thanks Wim, I'll finish the SSA issue first and then continue with this one.
Original comment by jurge...@gmail.com
on 1 Mar 2011 at 9:42
Missing:
from pdbe.adatah.WhatIf import getWhatIfInfo
Looking at
http://ccpn.cvs.sourceforge.net/viewvc/ccpn/ccpn/python/pdbe/adatah/?pathrev=stable
I think you might have forgotten to check that one in.
Original comment by jurge...@gmail.com
on 4 Mar 2011 at 9:48
Do we need analysis' CCPN setup for this instead? Cause the following are also
missing:
from pdbe.analysis.Util import getPickledDict
from pdbe.analysis.external.stride.Util import StrideInfo
from pdbe.analysis.Constants import protonToHeavyAtom
from pdbe.analysis.shifts.reref.estimateReferences import estimate_reference_single, make_selection, make_sel3, select_entries
Original comment by jurge...@gmail.com
on 4 Mar 2011 at 9:51
Nope not analysis, but some of my code to analyse data...
So try the following steps:
1. Check out pdbe/software/vascoReferenceCheck.py again from SF CVS
2. Unpack attached .tgz in the pdbe/ directory - an analysis/ directory should
be created therein
3. Make sure numpy is installed
That should work...
Original comment by wfvran...@gmail.com
on 4 Mar 2011 at 1:41
Attachments:
Still got a couple of imports missing as per attached.
E.g.
from pdbe.adatah.WhatIf import getWhatIfInfo
from pdbe.analysis.external.stride.Util import StrideInfo
from pdbe.analysis.shifts.DataCreation import CombinePdbBmrbData
Also, can you commit in the future to ccpn at sf.net? Otherwise I suggest I add
it to the CING google code so we can keep track. The CCPN setup is difficult
enough as it is.
Original comment by jurge...@gmail.com
on 4 Mar 2011 at 2:53
Attachments:
OK sorry about that, again... but yes I should check it in, but I need to
reorganise the code first - not all of this should be on SF. I'll look into it.
Original comment by wfvran...@gmail.com
on 7 Mar 2011 at 11:50
OK the WhatIf and Stride imports are not relevant - these are in methods that
are overwritten in the CING-specific code I sent you. I've checked in a new
version of pdbe.software.vascoReferenceCheck that removes the other dependency,
try again!
Original comment by wfvran...@gmail.com
on 7 Mar 2011 at 12:33
Ok, I've updated the CCPN sf.net again.
I get the strangest error below when I execute:
Read pdbe.adatah.localConstants.py version 4.5.7
Warning: Project file has moved from
"/Users/jd/Desktop/Werk/1brv_cs_pk_2mdl"
to
"/Users/jd/Desktop/1brv/1brv"
Backup is being changed from
"/Users/jd/Desktop/Werk/1brv_cs_pk_2mdl_backup"
to
"/Users/jd/Desktop/1brv/1brv_backup"
refData has been changed from
"/Users/jd/workspace35/ccpnmr/ccpnmr2.1/data"
to
"/Users/jd/workspace35/ccpn/data"
########################################
# VASCO: calculating rereferencing... #
########################################
Traceback (most recent call last):
File "/Users/jd/workspace35/cing/python/cing/Scripts/FC/vascoCingRefCheck.py", line 225, in <module>
vascoReferenceCheck.checkAllShiftLists()
File "/Users/jd/workspace35/cing/python/cing/Scripts/FC/vascoCingRefCheck.py", line 189, in checkAllShiftLists
self.checkProject(ccpnProject=ccpnProject,shiftListSerial=shiftList.serial)
File "/Users/jd/workspace35/ccpn/python/pdbe/software/vascoReferenceCheck.py", line 79, in checkProject
self.selectShiftList(shiftListSerial=shiftListSerial)
File "/Users/jd/workspace35/ccpn/python/pdbe/software/vascoReferenceCheck.py", line 144, in selectShiftList
self.shiftList = self.ccpnProject.currentNmrProject.findFirstMeasurementLists(className='ShiftList',serial=shiftListSerial)
AttributeError: 'NmrProject' object has no attribute 'findFirstMeasurementLists'
Is this because of an old project?
Also please note I had to add several __init__.py files, all the way to
pdbe.analysis.shifts.reref
Original comment by jurge...@gmail.com
on 7 Mar 2011 at 12:59
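Adding the missing __init__.py files by hand is easy to get wrong for a deep package such as pdbe.analysis.shifts.reref. A small helper along these lines could do it in one go. The function name `ensure_init_files` is made up for this sketch, and it also creates any missing directories, which may or may not be what you want.

```python
from pathlib import Path


def ensure_init_files(root, dotted_package):
    """Create an empty __init__.py at every level of dotted_package under root.

    Missing directories are created too. Returns the newly created files.
    """
    created = []
    current = Path(root)
    for part in dotted_package.split("."):
        current = current / part
        current.mkdir(parents=True, exist_ok=True)
        init_file = current / "__init__.py"
        if not init_file.exists():
            init_file.touch()
            created.append(init_file)
    return created
```

It would be called with the directory that is on the Python path as `root`, for example `ensure_init_files("/Users/jd/workspace35/ccpn/python", "pdbe.analysis.shifts.reref")`.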
Indeed, I forgot to include the __init__.py files.
As for the error, strange that I never got that... anyway fixed, should've been
without the trailing 's'.
Original comment by wfvran...@gmail.com
on 8 Mar 2011 at 7:56
We're getting a bit further now.
Can you send me some output for input from:
http://nmr.cmbi.ru.nl/NRG-CING/data/br/1brv/ or another entry you prefer?
I've embedded your code to:
http://code.google.com/p/cing/source/browse/trunk/cing/python/cing/Scripts/FC/vascoCingRefCheck.py?spec=svn945&r=945
Using $CINGROOT/python/cing/Scripts/FC/vascoCingRefCheck.py 1brv
I still get a bug:
Using vascoRefDataPath
/Users/jd/workspace35/ccpn/python/pdbe/analysis/shifts/reref/data
In CING using vascoRefDataPath vascoRefData
In CING using vascoRefDataPath vascoRefData
==> Restoring <Project 1brv> ...
==> Restoring Wattos results
==> Restoring whatif results
==> Restoring DSSP results
==> Restoring talos+ results
==> Restoring procheck results
==> Restoring queeny results
==> Restoring shiftx results
==> CheckForSaltbridges distant: 0 skipped: 7 below cutoff 1 present 1 total considered 9
==> Found assigned/overall/fraction for spins: 13C 73/152/0.48 15N 0/44/0.00 1H 182/199/0.91
==> Only spins with fraction >= 0.85 will be flagged when missing: {'13C': False, '1H': True, '15N': False}
==> Generating Macros
Finished restoring project <Project 1brv>
Warning: Project file has moved from
"/Volumes/tera4/NRG-CING/prep/S/br/1brv/1brv"
to
"/Users/jd/Desktop/1brv/1brv"
Backup is being changed from
"/Volumes/tera4/NRG-CING/prep/S/br/1brv/1brv_backup"
to
"/Users/jd/Desktop/1brv/1brv_backup"
########################################
# VASCO: calculating rereferencing... #
########################################
Fetching DSSP secondary structure info...
Fetching WHATIF per-atom surface accessibility info...
Traceback (most recent call last):
File "/Users/jd/workspace35/cing/python/cing/Scripts/FC/vascoCingRefCheck.py", line 227, in <module>
vascoReferenceCheck.checkAllShiftLists()
File "/Users/jd/workspace35/cing/python/cing/Scripts/FC/vascoCingRefCheck.py", line 191, in checkAllShiftLists
self.checkProject(ccpnProject=ccpnProject,shiftListSerial=shiftList.serial)
File "/Users/jd/workspace35/ccpn/python/pdbe/software/vascoReferenceCheck.py", line 99, in checkProject
self.getVascoRerefInfo()
File "/Users/jd/workspace35/ccpn/python/pdbe/software/vascoReferenceCheck.py", line 494, in getVascoRerefInfo
useBounds = bounds[atom_type]
KeyError: 'H'
I attach a revision marking another problem with structureEnsembleId usage on
line 113 in the superclass. It doesn't seem to be important and it gets defined
outside its scope at:
def checkProject(self,ccpnProject=None,ccpnDir=None,structureEnsembleId=None,shiftListSerial=None):
Is it important?
Original comment by jurge...@gmail.com
on 10 Mar 2011 at 10:02
Attachments:
It's probably not finding the dictionary with the reference information, is
there (from the place you're running the script) a vascoRefData/ directory with
a bounds and a stats .pp file?
Otherwise hardcode the location of the vascoRefData/ directory in the
vascoRefDataPath variable at the top of the VascoCingReferenceCheck class in
vascoCingRefCheck.py
Original comment by morebrus...@gmail.com
on 10 Mar 2011 at 10:45
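A KeyError like the one above is hard to trace back to a missing data directory. A check along the following lines, run before the reference data is loaded, would fail earlier and with a clearer message. This is a sketch under assumptions: the helper name `find_vasco_ref_files` is made up, and the file patterns bounds*.pp and stats*.pp are a guess based on "a bounds and a stats .pp file".

```python
import glob
import os


def find_vasco_ref_files(vascoRefDataPath):
    """Return (boundsFile, statsFile), or raise IOError with a readable message.

    Assumes the files match bounds*.pp and stats*.pp; if several match,
    the last one in sorted order is used.
    """
    found = []
    for prefix in ("bounds", "stats"):
        pattern = os.path.join(vascoRefDataPath, prefix + "*.pp")
        matches = sorted(glob.glob(pattern))
        if not matches:
            raise IOError("No %s*.pp file in %r (current directory: %r)"
                          % (prefix, vascoRefDataPath, os.getcwd()))
        found.append(matches[-1])
    return tuple(found)
```

Including the current directory in the message helps when vascoRefDataPath is a relative path such as vascoRefData, since the result then depends on where the script is run from.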
Thanks for pointing out the structureEnsembleId problem, now fixed.
Original comment by morebrus...@gmail.com
on 10 Mar 2011 at 10:48
By the way both above are mine, obviously, ignore the email address, keep on
using this one!
Original comment by wfvran...@gmail.com
on 10 Mar 2011 at 10:49
On comment 13: Thanks I got the revision in.
Original comment by jurge...@gmail.com
on 10 Mar 2011 at 12:00
Great runs fine now. Questions still:
- How to run without gui?
- I'm missing the code that decides which corrections to apply. My notes say we
would do that when some value reached -3 for poor and -4 for bad, but that was
per atom.
Comparing 1ieh with 4969 (See
http://nmr.cmbi.ru.nl/NRG-CING/prep/S/ie/1ieh/1ieh_starCS2Ccpn.log)
Vasco (http://www.ebi.ac.uk/pdbe-apps/nmr/data/vasco/bmr4969.1ieh.vasco):
# CORRECTION C (aliphatic) 1.569 +/- 0.079 APPLIED
# CORRECTION C (high ppm, proton attached) -0.000 +/- 0.000 NOT CALCULATED - NOT ENOUGH DATA
# CORRECTION C (high ppm, no proton) -0.000 +/- 0.000 NOT CALCULATED - NOT ENOUGH DATA
# CORRECTION N 0.317 +/- 0.284 NOT APPLIED - UNCERTAIN
# CORRECTION H -0.000 +/- 0.010 ORIGINAL CORRECT
CING integrated:
('C', 1) (None, None)
('C', 2) (None, None)
('C', 3) (-1.6413276878937664, 0.058340678594975978)
('C', 4) (None, None)
('H', None) (-0.015514588071506744, 0.0091816571211084108)
('N', None) (-0.3314271261198754, 0.24086810646836765)
Looking good. Should I presume the original CS for C (aliphatic) needs to be
subtracted by 1.64?
What do the warnings below mean?
jd:stella/1ieh/ $CINGROOT/python/cing/Scripts/FC/vascoCingRefCheck.py 1ieh
==> Restoring <Project 1ieh> ...
==> Restoring Wattos results
==> Restoring whatif results
==> Restoring DSSP results
==> Restoring talos+ results
==> Restoring procheck results
==> Restoring queeny results
==> Restoring shiftx results
==> CheckForSaltbridges distant: 100 skipped: 0 below cutoff 71 present 21 total considered 192
==> Found assigned/overall/fraction for spins: 13C 360/637/0.57 15N 140/182/0.77 1H 731/858/0.85
==> Only spins with fraction >= 0.85 will be flagged when missing: {'13C': False, '1H': True, '15N': False}
==> Generating Macros
Finished restoring project <Project 1ieh>
Warning: Project file has moved from
"/Volumes/tera4/NRG-CING/prep/S/ie/1ieh/1ieh"
to
"/Users/jd/Desktop/1ieh/1ieh"
Backup is being changed from
"/Volumes/tera4/NRG-CING/prep/S/ie/1ieh/1ieh_backup"
to
"/Users/jd/Desktop/1ieh/1ieh_backup"
########################################
# VASCO: calculating rereferencing... #
########################################
Fetching DSSP secondary structure info...
Fetching WHATIF per-atom surface accessibility info...
Warning: resetting final class to include all values for Gln, C, ('HE21',)
Warning: resetting final class to include all values for Gln, E, ('HE21',)
Warning: resetting final class to include all values for Gln, E, ('HG2',)
Warning: resetting final class to include all values for Gln, E, ('H',)
Warning: resetting final class to include all values for Gln, C, ('HG3',)
Warning: resetting final class to include all values for Gln, C, ('HE22',)
Warning: resetting final class to include all values for Gln, E, ('HE22',)
Warning: resetting final class to include all values for Pro, T, ('HA',)
Warning: resetting final class to include all values for Lys, C, ('HG2',)
Warning: resetting final class to include all values for Lys, C, ('HG3',)
Warning: resetting final class to include all values for Ile, C, ('H',)
Warning: resetting final class to include all values for Ala, E, ('HB1', 'HB2', 'HB3')
Warning: resetting final class to include all values for Asp, C, ('HA',)
Warning: resetting final class to include all values for Leu, E, ('HG',)
Warning: resetting final class to include all values for Leu, E, ('HD21', 'HD22', 'HD23')
Warning: resetting final class to include all values for Leu, E, ('H',)
Warning: resetting final class to include all values for Arg, E, ('HD3',)
Warning: resetting final class to include all values for Arg, E, ('HD2',)
Warning: resetting final class to include all values for Trp, E, ('HB2',)
Warning: resetting final class to include all values for Trp, C, ('HA',)
Warning: resetting final class to include all values for Trp, C, ('HZ3',)
Warning: resetting final class to include all values for Trp, E, ('HZ3',)
Warning: resetting final class to include all values for Trp, C, ('HH2',)
Warning: resetting final class to include all values for Trp, E, ('HH2',)
Warning: resetting final class to include all values for Trp, C, ('HZ2',)
Warning: resetting final class to include all values for Trp, E, ('HB3',)
Warning: resetting final class to include all values for Trp, C, ('HE1',)
Warning: resetting final class to include all values for Trp, E, ('H',)
Warning: resetting final class to include all values for Trp, C, ('HE3',)
Warning: resetting final class to include all values for Trp, E, ('HE3',)
Warning: resetting final class to include all values for Trp, C, ('HD1',)
Warning: resetting final class to include all values for Glu, E, ('HA',)
Warning: resetting final class to include all values for Tyr, E, ('HA',)
Warning: resetting final class to include all values for Tyr, E, ('H',)
Warning: resetting final class to include all values for Tyr, E, ('HE1',)
Warning: resetting final class to include all values for Tyr, E, ('HD1',)
Warning: resetting final class to include all values for Asn, E, ('CB',)
Warning: resetting final class to include all values for Pro, T, ('CA',)
Warning: resetting final class to include all values for Ala, E, ('CB',)
Warning: resetting final class to include all values for Ile, C, ('CD1',)
Warning: resetting final class to include all values for Ile, C, ('CG2',)
Warning: resetting final class to include all values for Leu, E, ('CD2',)
Warning: resetting final class to include all values for Leu, E, ('CB',)
Warning: resetting final class to include all values for Leu, E, ('CG',)
Warning: resetting final class to include all values for Arg, E, ('CD',)
Warning: resetting final class to include all values for Trp, C, ('CA',)
Warning: resetting final class to include all values for Trp, C, ('CB',)
Warning: resetting final class to include all values for Trp, E, ('CB',)
Warning: resetting final class to include all values for Glu, E, ('CA',)
Warning: resetting final class to include all values for Asp, C, ('CA',)
Warning: resetting final class to include all values for Tyr, E, ('CA',)
Warning: resetting final class to include all values for Gln, E, ('N',)
Warning: resetting final class to include all values for Gln, C, ('NE2',)
Warning: resetting final class to include all values for Gln, E, ('NE2',)
Warning: resetting final class to include all values for Ile, C, ('N',)
Warning: resetting final class to include all values for Leu, E, ('N',)
Warning: resetting final class to include all values for Arg, E, ('NE',)
Warning: resetting final class to include all values for Trp, E, ('N',)
Warning: resetting final class to include all values for Trp, C, ('NE1',)
Warning: resetting final class to include all values for Tyr, E, ('N',)
('C', 1) (None, None)
('C', 2) (None, None)
('C', 3) (-1.6413276878937664, 0.058340678594975978)
('C', 4) (None, None)
('H', None) (-0.015514588071506744, 0.0091816571211084108)
('N', None) (-0.3314271261198754, 0.24086810646836765)
[<memops.Implementation.AppDataFloat {application='VASCO',
keyword='correction_H', value=0.015514588071506744}>,
<memops.Implementation.AppDataFloat {application='VASCO',
keyword='correctionError_H', value=0.0091816571211084108}>,
<memops.Implementation.AppDataFloat {application='VASCO',
keyword='correction_N', value=0.3314271261198754}>,
<memops.Implementation.AppDataFloat {application='VASCO',
keyword='correctionError_N', value=0.24086810646836765}>,
<memops.Implementation.AppDataFloat {application='VASCO',
keyword='correction_C_aliphatic', value=1.6413276878937664}>,
<memops.Implementation.AppDataFloat {application='VASCO',
keyword='correctionError_C_aliphatic', value=0.058340678594975978}>]
Original comment by jurge...@gmail.com
on 10 Mar 2011 at 12:33
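The relation between the rerefInfo tuples and the stored application data in the output above can be written down as a small mapping. This sketch only reproduces what that one log shows: the keyword names are copied from it, the sign flip is read off the printed values at this revision, and the function name `reref_to_app_data` is hypothetical. It does not touch the CCPN API.

```python
# Keyword suffix per rerefInfo key, read off the output above.
KEYWORD_SUFFIX = {("H", None): "H", ("N", None): "N", ("C", 3): "C_aliphatic"}


def reref_to_app_data(rerefInfo):
    """Return {keyword: value} for the corrections that would be stored."""
    appData = {}
    for key, suffix in KEYWORD_SUFFIX.items():
        value, error = rerefInfo.get(key, (None, None))
        if value is None:
            continue  # nothing was calculated for this group
        appData["correction_" + suffix] = -value  # sign flipped, as in the log
        appData["correctionError_" + suffix] = error
    return appData
```

Groups such as ('C', 1) that come back as (None, None) produce no keywords, which matches the six AppDataFloat objects printed for 1ieh.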
It looks like His CG is included with the aliphatics from the code because it
isn't excluded from either group1 or:
group2 = {'his': ('cd2', 'ce1'),
Is that correct or should it simply be added to group2 still?
Original comment by jurge...@gmail.com
on 11 Mar 2011 at 2:45
Ah it belongs to group1, in any case, but probably don't have enough data for
it anyway so shouldn't matter much.
Original comment by wfvran...@gmail.com
on 11 Mar 2011 at 3:55
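The grouping discussed above appears to work by exclusion: any carbon not listed in group1 or group2 falls through to the aliphatic set. A rough sketch of that logic follows. Only the two His entries come from this thread; the real tables cover more residues and atoms, and the function name `carbon_group` is made up.

```python
# Truncated tables: only the His entries are taken from this thread.
group1 = {"his": ("cg",)}          # per the reply above, His CG belongs here
group2 = {"his": ("cd2", "ce1")}   # fragment quoted in the question


def carbon_group(resName, atomName):
    """Classify a carbon atom; anything not listed defaults to 'aliphatic'."""
    res, atom = resName.lower(), atomName.lower()
    if atom in group1.get(res, ()):
        return "group1"
    if atom in group2.get(res, ()):
        return "group2"
    return "aliphatic"
```

With a default like this, any atom missing from the tables is silently treated as aliphatic, which is how His CG ended up there.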
Below are my numbered notes:
- Because of the dependency on CING generated data from What If accessibility
and DSSP I had to
move this step to after these are normally run instead of as a prep. step.
- Please update cing to r953 or higher.
cd $WS/cing
svn update .
- I fixed bugs:
- In $CINGROOT/python/cing/Scripts/FC/vascoCingRefCheck.py Median of single element array got out of bounds.
medianIndex = int((len(asaList) / 2.0) + 0.5) # fails with round off on single element lists.
- None can't be negated.
File "/Users/jd/workspace35/ccpn/python/pdbe/software/vascoReferenceCheck.py", line 503, in getVascoRerefInfo
self.rerefInfo[(atom_type,i)] = (-rerefValue,rerefError)
TypeError: bad operand type for unary -: 'NoneType'
This fix is committed to CCPN version 1.1.2.8 of above file.
- Run the test case for 1ieh (takes ~1 minute). All required input lives in the
updated cing distribution.
python -u $CINGROOT/python/cing/PluginCode/test/test_Vasco.py
The vasco results of the web site are nicely reproduced even though just 1 model was used.
('C', 3) (1.6303584069470283, 0.059139812898406918)
('H', None) (0.0035894710534362662, 0.0094755686074762049)
('N', None) (0.47356908698086786, 0.25395166326170021)
For debugging purposes I always apply the reref values.
- Why not exclude hydrogen atom ASAs? Or is that to come in the code I didn't
see?
- Please fix bug:
For Chris' entry 1cjg with BMRB 4813 the Vasco corrections seem to be the same for the 2 different
CS sets inside the entry. One is for protein and the other for DNA even.
Entry is included in the new cing distribution for testing with test_Vasco.py
Original comment by jurge...@gmail.com
on 24 Mar 2011 at 2:34
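The two bugs listed above are easy to reproduce in isolation. For a single-element list, int((1 / 2.0) + 0.5) evaluates to 1, which is past the end of the list, and unary minus on None raises the TypeError shown. The sketch below shows one way to guard against both. It is not necessarily what was committed, and the function names are made up.

```python
def median_index(values):
    """Index of the median element of a sorted, non-empty list.

    Uses integer division, so a one-element list gives index 0. For an
    even number of elements this is the upper of the two middle indices.
    """
    if not values:
        raise ValueError("median of an empty list is undefined")
    return len(values) // 2


def negate_reref(rerefValue, rerefError):
    """Return (-value, error), passing None through when nothing was calculated."""
    if rerefValue is None:
        return (None, rerefError)
    return (-rerefValue, rerefError)
```

The asaList would still need to be sorted before `median_index` is used on it.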
We also need to decide how to include the code now uncommited in:
$CCPNROOT/python pdbe.analysis.Util etc.
so that I can push this from development to production.
Original comment by jurge...@gmail.com
on 28 Mar 2011 at 8:49
Replies to points from comment 19:
- Why not exclude hydrogen atom ASAs?
For the hydrogens I use the heavy atom ASA - there is no such thing calculated for the hydrogens
- Please fix bug:
The main issue is that VASCO cannot re-reference DNA shifts (at the moment) because there's not enough statistical data available, so in this case the problem is probably that the values from the first (protein) VASCO check are retained?
Also note (I think this was a question in another issue) that I double-checked whether all shift lists are read in, and this is definitely the case.
Original comment by wfvran...@gmail.com
on 28 Mar 2011 at 10:53
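Using the heavy atom ASA for hydrogens amounts to a name lookup before the accessibility lookup. A minimal sketch, assuming a simple name-to-name table: the real table is protonToHeavyAtom in pdbe.analysis.Constants, whose layout is not shown in this thread, so the excerpt and the function name below are hypothetical.

```python
# Hypothetical excerpt; the real mapping lives in pdbe.analysis.Constants.
PROTON_TO_HEAVY = {"H": "N", "HA": "CA", "HB2": "CB", "HB3": "CB"}


def asa_for_atom(atomName, heavyAtomAsa):
    """Return the ASA for an atom, using the bonded heavy atom for protons.

    heavyAtomAsa maps heavy atom names to accessibility values for one
    residue. Returns None when no value is available.
    """
    heavyName = PROTON_TO_HEAVY.get(atomName, atomName)
    return heavyAtomAsa.get(heavyName)
```

For instance, `asa_for_atom("HA", {"CA": 12.5})` returns the CA value, since no ASA is calculated for HA itself.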
On comment 20, true. I will have a think and see how I can arrange this. Some
code might just have to stay within the CING repository.
Original comment by wfvran...@gmail.com
on 28 Mar 2011 at 10:56
On comment 22, this is no problem. We can (temporarily) host the
pdbe.analysis.Util package etc. under CING. Just let me know.
Original comment by jurge...@gmail.com
on 28 Mar 2011 at 10:59
OK, I've got the following dependencies not in SF CVS, matches with yours?
vascoRefData/bounds_20100225.pp
vascoRefData/stats_20100225.pp
analysis/Util.py
analysis/Constants.py
analysis/shifts/reref/estimateReferences.py
analysis/shifts/reref/statpack.py
Original comment by wfvran...@gmail.com
on 28 Mar 2011 at 11:09
The bounds are already in CING. I've also added the 3 required __init__.py
files in the analysis package to CCPN. Should I go ahead and put them in CING
or do you want to?
Original comment by jurge...@gmail.com
on 28 Mar 2011 at 11:14
Go ahead! Just thinking about it, the analysis/ stuff isn't for the SF CVS
anyway, so just put it in CING only for the moment.
Original comment by wfvran...@gmail.com
on 28 Mar 2011 at 11:17
Wait, the code in vascoReferenceCheck (in CCPN) requires the code in
pdbe.analysis
from pdbe.analysis.Util import getPickledDict
from pdbe.analysis.Constants import protonToHeavyAtom
from pdbe.analysis.shifts.reref.estimateReferences import estimate_reference_single, make_selection, make_sel3, select_entries
Can I move vascoReferenceCheck.py ? It's not referenced elsewhere in CCPN.
Original comment by jurge...@gmail.com
on 28 Mar 2011 at 11:33
Take a look at r957. I had to rename pdbe to pdbe2 to avoid the name conflict
in CING.
I do get a non-interesting difference between my develop & production versions
for 1ieh. I don't have time to pursue it further now.
Develop:
Skipping uncertain correction for H_None of rerefNTvalue 0.013 (+- 0.010)
Skipping uncertain correction for N_None of rerefNTvalue 0.299 (+- 0.263)
Applying Vasco correction for atomId C_3 and rerefTuple 1.617 (+- 0.061) to resonance in 360 atoms
Prod:
Skipping uncertain correction for H_None of rerefNTvalue 0.014 (+- 0.009)
Skipping uncertain correction for N_None of rerefNTvalue 0.372 (+- 0.252)
Applying Vasco correction for atomId C_3 and rerefTuple 1.623 (+- 0.057) to resonance in 360 atoms
Wim can you look into: 'the problem is probably that the values from the first
(protein) VASCO check are retained?'
Just run your code on 1cjg or another entry with 2 lists.
Btw, I store the Vasco meta data to RDB so we can compare with your paper at
some point.
Original comment by jurge...@gmail.com
on 28 Mar 2011 at 12:38
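The "Skipping uncertain correction" and "Applying Vasco correction" lines above imply some test of the correction against its error. The actual criterion is not stated in this thread. The sketch below uses a plain value-to-error ratio with a cutoff of 2.0, both of which are assumptions chosen only because they happen to reproduce the six log lines.

```python
def should_apply(value, error, minRatio=2.0):
    """Decide whether a correction is large enough relative to its error.

    The ratio test and the default minRatio are assumptions, not the
    criterion used by VASCO or CING.
    """
    if value is None or not error:
        return False  # nothing calculated, or no usable error estimate
    return abs(value) / error >= minRatio
```

Under this rule the two runs agree on every decision even though the numbers differ slightly, which fits the description of the difference as non-interesting.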
For the 2 shiftlist issue, there was a bug in vascoReferenceCheck in the
selection of Shift values, now fixed - check out of SF CVS.
Then feel free to put vascoReferenceCheck in the CING rep for now... as for the
non-interesting difference, could be a precision issue with floats.
Original comment by wfvran...@gmail.com
on 28 Mar 2011 at 3:14
Tested well, committed fix to r958. I'll close this issue as fixed.
Thanks Wim.
Original comment by jurge...@gmail.com
on 29 Mar 2011 at 11:23
Can I remove the copy of vascoReferenceCheck in CCPN?
Original comment by jurge...@gmail.com
on 29 Mar 2011 at 1:28
No - I might change the location later, but it needs to live in a repository
somewhere other than CING.
Original comment by wfvran...@gmail.com
on 29 Mar 2011 at 1:30
Ok, please inform me when I need to manually keep it in sync with the one we
use for CING.
Original comment by jurge...@gmail.com
on 29 Mar 2011 at 1:51
Original issue reported on code.google.com by
jurge...@gmail.com
on 25 Feb 2011 at 9:19