google-code-export / nmrrestrntsgrid

Automatically exported from code.google.com/p/nmrrestrntsgrid

Get correspondence between BMRB and PDB NMR entries of the same molecule/study #135

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
From BMRB:

The current file that has similar information on the BMRB ftp site is:
ftp://ftp.bmrb.wisc.edu/pub/webdata/dbmatch.csv
However, this file may have far more hits than you want.

-----------------------

But the file doesn't contain e.g. the 4020 1brv link. In fact there are no PDB IDs in there.

Original issue reported on code.google.com by jurge...@gmail.com on 15 Nov 2008 at 6:56

GoogleCodeExporter commented 9 years ago
I guess what went wrong is that my browser hadn't finished loading the file; strange. Sorry for spamming.

Anyway, I do find it now. There seem to be 51542 hits between PDB and BMRB based on: ??blast only??.
I found the number by running: grep '^\"PDB' dbmatch.csv | wc
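
For anyone without grep at hand, the same count can be sketched in Python. This is a minimal illustration, not part of the original procedure: it only assumes, as the grep pattern does, that the rows of interest begin with a double-quoted field starting with PDB. The function name and the sample rows are hypothetical. (With grep itself, the -c option prints the count directly instead of piping to wc.)

```python
import io


def count_pdb_rows(lines):
    """Count lines that start with a double quote followed by PDB,
    the same lines the pattern '^"PDB' selects."""
    return sum(1 for line in lines if line.startswith('"PDB'))


# Hypothetical rows for illustration only; the real dbmatch.csv
# columns are not shown in this thread.
sample = io.StringIO('"PDB","x"\n"SP","y"\n"PDB","z"\n')
print(count_pdb_rows(sample))  # prints 2
```

For the real file, pass an open file handle instead: count_pdb_rows(open("dbmatch.csv")).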

In my earlier run on 2005-05-17 I had fewer than 2,000 correspondences, based on the procedure described in the attached. Basically: selecting only NMR PDB entries, doing a BLAST, and selecting for date and author names.

Clearly the current BMRB file contains links to e.g. X-ray structures (e.g. 9rat). Human lysozyme, e.g. BMRB entry 5142, has 178 hits for this one entry.

What's needed is a file that has the correspondence as reported by the authors, where for the same structure determination a PDB file was deposited and a BMRB entry also exists. In the previous iteration of the NRG we had only one BMRB entry linked to from each PDB entry, just like the PDB itself and other sites link to only 1 BMRB entry from their sites.

Eldon, can I ask you to provide such a file? It will help more than just the NRG project, I would think.

Cheers!

Original comment by jurge...@gmail.com on 16 Nov 2008 at 9:26

Attachments:

GoogleCodeExporter commented 9 years ago
I don't have a backup of the file score_many2one.csv

Dmitri could you try to restore a copy of the tang directory:
/big/jurgen/BMRB/Matches_BMRB_PDB/results or
/var/www/servlet_data/viavia/bmrb_pdb_match

from a date somewhere around August 2007?

Original comment by jurge...@gmail.com on 20 Nov 2008 at 10:09

GoogleCodeExporter commented 9 years ago
As you recall we re-use backup tapes. We never had > 1 year worth of backups, and lately it's been down to about 8 months. In other words, no, I can't restore a file from Aug 2007.

Original comment by dmitri.m...@gmail.com on 20 Nov 2008 at 9:28

GoogleCodeExporter commented 9 years ago
How about bucky backup?

Original comment by jurge...@gmail.com on 21 Nov 2008 at 8:57

GoogleCodeExporter commented 9 years ago
They never were in bucky backup. /share/jurgen/BMRB is, so if you have a copy in there I can restore it.

Original comment by dmitri.m...@gmail.com on 21 Nov 2008 at 7:35

GoogleCodeExporter commented 9 years ago
I can pull dbmatch.csv out of CVS and you can filter out the rows where the "blast flag" column is a number (not "Y" or "N"). The numbers are from score_one2one.csv, not many2one.csv, so I don't know if they're of any use.
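
A rough sketch of that filter in Python, under stated assumptions: the file is taken to have a header row with a column literally named "blast flag", which this thread does not confirm, and the function names and sample rows are hypothetical. It keeps the rows whose flag parses as a number and drops those marked "Y" or "N".

```python
import csv
import io


def is_number(text):
    """Return True if text parses as a number."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def rows_with_numeric_flag(csv_file, flag_column="blast flag"):
    """Keep rows whose flag column holds a number rather than Y or N.

    The column name is an assumption; adjust it to the real header.
    """
    kept = []
    for row in csv.DictReader(csv_file):
        value = (row.get(flag_column) or "").strip()
        if value not in ("Y", "N") and is_number(value):
            kept.append(row)
    return kept


# Hypothetical layout for illustration only.
sample = io.StringIO("bmrb_id,pdb_id,blast flag\n1,aaaa,Y\n2,bbbb,0.95\n")
for row in rows_with_numeric_flag(sample):
    print(row["bmrb_id"], row["pdb_id"], row["blast flag"])
```

If the real file has no header row, csv.reader with a positional index for the flag column would be the simpler choice.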

I can also generate a bunch of csv files from blast search, but you'll have to run them through your code and probably fix a few things to make it work (we talked about that when you were here).

Original comment by dmitri.m...@gmail.com on 21 Nov 2008 at 7:48

GoogleCodeExporter commented 9 years ago
Perhaps I can query the old NRG for it... Nope, it's not in there anymore.

Ok, found a March 2006 backup, as attached. It's good to use.

Dmitri, I see that there still is a /bmrb/admin/ETS_text_exports/ETS-Entry_log.txt file that's up to date.
I used to use that to complement the hits using the script:
/big/jurgen/BMRB/Matches_BMRB_PDB/scripts/getEtsData.csh
It's broken now, probably because the ETS file changed. If you could rewrite it to extract all the ETS matches between released BMRB/PDB entries then I will use that data to keep NRG up to date every week.

Thanks

Original comment by jurge...@gmail.com on 25 Nov 2008 at 9:25

Attachments:

GoogleCodeExporter commented 9 years ago
Written integration code for new setup regarding this issue.

All that's needed are updates from ETS. Closing this issue anyway.

To update the links, issue:
java -Xmx128m Wattos.Episode_II.MRUpdateLinksToExternalDBs $WS/nmrrestrntsgrid/bmrb_pdb_match/results

Original comment by jurge...@gmail.com on 25 Nov 2008 at 2:34

GoogleCodeExporter commented 9 years ago
Actually processed now. Very nice to see this working again!

Original comment by jurge...@gmail.com on 25 Nov 2008 at 4:48

GoogleCodeExporter commented 9 years ago
I think I only put in NRG those correspondences I had a long time ago. 
It sure would be nice to have updates since 2007 when I left the USA.

The problem is that my old script querying the ETS source (main one needed) is failing.
From
http://code.google.com/p/nmrrestrntsgrid/source/browse/trunk/nmrrestrntsgrid/bmrb_pdb_match/scripts/getEtsData.csh

Even the first line fails:
/~/ cat $ETS_file_loc | gawk '{if((NR>6)&&($3=="rel")) print}'
/~/ echo $ETS_file_loc 
/bmrb/admin/ETS_text_exports/ETS-Entry_log.txt

Even though the file exists and seems to be current.
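
To check the filter logic independently of the shell environment, the gawk one-liner above can be mirrored in Python. This is only a sketch of what the expression does (skip the first 6 lines, then keep lines whose third whitespace-separated field is "rel"); the function name is hypothetical and nothing is assumed about the ETS file beyond what the gawk expression implies.

```python
def released_lines(lines, header_lines=6, status="rel"):
    """Mirror gawk '{if((NR>6)&&($3=="rel")) print}'.

    Line numbers are 1-based like NR, and fields are split on
    runs of whitespace like awk's default field splitting.
    """
    kept = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if line_number > header_lines and len(fields) >= 3 and fields[2] == status:
            kept.append(line.rstrip("\n"))
    return kept


def released_lines_from_file(path):
    """Apply the same filter to a file on disk."""
    with open(path) as handle:
        return released_lines(handle)
```

If released_lines_from_file on the ETS file returns nothing, the third column no longer holds "rel" in the current export, which would point to a format change rather than a problem with the file itself.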

D., could you look into this? If you give me a csv file updated from ETS I'll make sure it will get into NRG.
It should simply look like:
bmrb_id,pdb_id
4020,1brv

etc.
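
As a minimal sketch of how a file in this two-column layout could be consumed (not the actual NRG integration code, which is the Java class mentioned earlier in this thread), the following reads it into a mapping from BMRB ID to PDB IDs. The function name is hypothetical, and lower-casing the PDB ID is an assumption made here for consistency.

```python
import csv
import io


def read_links(csv_file):
    """Read bmrb_id,pdb_id rows into {bmrb_id: [pdb_id, ...]}.

    A list is used per BMRB ID in case one entry links to
    more than one PDB entry.
    """
    links = {}
    for row in csv.DictReader(csv_file):
        bmrb_id = row["bmrb_id"].strip()
        pdb_id = row["pdb_id"].strip().lower()
        links.setdefault(bmrb_id, []).append(pdb_id)
    return links


# The single example row given above.
sample = io.StringIO("bmrb_id,pdb_id\n4020,1brv\n")
print(read_links(sample))  # prints {'4020': ['1brv']}
```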

Original comment by jurge...@gmail.com on 30 Jan 2009 at 3:52

GoogleCodeExporter commented 9 years ago

Original comment by jurge...@gmail.com on 30 Jan 2009 at 3:54

GoogleCodeExporter commented 9 years ago
Inactivating this issue for now.

Original comment by jurge...@gmail.com on 31 Mar 2009 at 9:05