ofanoyi / pygr

Automatically exported from code.google.com/p/pygr

Excessive memory consumption in megatests #103

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
See Namshin's posts in this thread:

http://groups.google.com/group/pygr-dev/browse_thread/thread/f4e98e20632794ab

Original issue reported on code.google.com by mare...@gmail.com on 26 Aug 2009 at 1:52

GoogleCodeExporter commented 8 years ago
Here are some comments on this issue.

Pygr uses excessive memory when the SQLTable or SQLTableClustered classes are used for NLMSA building and querying.

To reproduce this, you have to run the full version of the megatest on your machine. First, you need to set up a megatest machine; check the necessary files at http://biodb.bioinformatics.ucla.edu/MEGATEST/. Some of the bugs have been fixed, and the fixed files are placed in the July2009_BugFix directory.

Then you have to run annotation_hg18_megatest.py with protest.py or nose. Because you don't have to run the Collection megatest, you can place sys.exit() after def collectionannot_test(self): The function you have to test is def mysqlannot_test(self):

If you want to change the output directory for debugging, you can set testDir after def __init__(self, testDir = None): of class PygrBuildNLMSAMegabase(object):

testDir = '/tmp/mytest'

The symptom is that it uses more than 2 GB of memory when building or querying an NLMSA. Memory usage never decreases, even if you set autoGC = True.
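
For anyone trying to confirm the symptom, here is a minimal sketch for logging the process's peak memory around one test step. It uses only the standard library on Unix; run_and_report and peak_rss_mb are hypothetical helper names, not part of pygr or the megatest scripts. Note that ru_maxrss is a high-water mark, so it can show how far memory grew during a step but not whether it was released afterwards.

```python
import gc
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor


def run_and_report(step, label):
    """Run one callable and print how much the peak RSS grew (hypothetical helper)."""
    before = peak_rss_mb()
    result = step()
    gc.collect()  # rule out uncollected Python garbage as the cause
    after = peak_rss_mb()
    print("%s: peak RSS %.1f MB -> %.1f MB" % (label, before, after))
    return result, after - before
```

Wrapping a call such as the mysqlannot test in run_and_report(...) would show whether the peak climbs past the 2 GB mark during that step.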

Original comment by deepr...@gmail.com on 28 Aug 2009 at 5:57

GoogleCodeExporter commented 8 years ago
This is a modified version of annotation_hg18_megatest.py.
You need to create a .pygrrc in the directory you run it from. Please prepare all the necessary files for the full megatest.

[deepreds@s137 Megatest]$ cat .pygrrc 
[megatests]
expectedRunningTime = 12
logDir = /data4/deepreds/projects/Pygr_Project/Megatest/megatest_logs
mailFrom = deepreds@s137.rna.kr
mailTo_failed = deepreds@yahoo.com
mailTo_ok = deepreds@yahoo.com
runningTimeAllowedDelay = 9
testInputDB = PYGR_MEGATESTS
testInputDir = /data4/deepreds/projects/Pygr_Project/Megatest/input_and_results
testOutputBaseDir = /data4/deepreds/projects/Pygr_Project/Megatest/megatest_tmp

[megatests_dm2]
#smallSampleKey = chrYh
#smallSampleKey_nlmsa = chr4h
mafDir = /data4/deepreds/projects/Pygr_Project/Megatest/maf_data
msaDir = /data4/deepreds/projects/Pygr_Project/Megatest/maf_test
seqDir = /data4/deepreds/projects/Pygr_Project/Megatest/seq_data

[megatests_hg18]
#smallSampleKey = chrY
axtDir = /data4/deepreds/projects/Pygr_Project/Megatest/axt_data3
mafDir = /data4/deepreds/projects/Pygr_Project/Megatest/maf_data3
msaDir = /data4/deepreds/projects/Pygr_Project/Megatest/maf_test3
seqDir = /data4/deepreds/projects/Pygr_Project/Megatest/seq_data3
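
As a quick sanity check of a .pygrrc like the one above, the sketch below parses the same INI layout with Python's standard configparser. This is only an illustration and may not be how the megatest scripts load the file; SAMPLE is a shortened stand-in with a placeholder path, and load_megatest_config and small_sample_key are hypothetical names.

```python
import configparser

# Shortened stand-in for the .pygrrc above; the seqDir path is a placeholder.
SAMPLE = """\
[megatests]
expectedRunningTime = 12
runningTimeAllowedDelay = 9
testInputDB = PYGR_MEGATESTS

[megatests_hg18]
#smallSampleKey = chrY
seqDir = /path/to/seq_data3
"""


def load_megatest_config(text):
    """Parse .pygrrc-style text into a ConfigParser (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def small_sample_key(parser, section):
    """Return smallSampleKey, or None when the line is commented out."""
    return parser.get(section, "smallSampleKey", fallback=None)


cfg = load_megatest_config(SAMPLE)
print(cfg.getint("megatests", "expectedRunningTime"))
print(small_sample_key(cfg, "megatests_hg18"))
```

With the smallSampleKey lines commented out, as in the file above, the lookup returns None, which matches running the full megatest rather than a small sample.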

Just run this script without nose/protest.py
$ python annotation_hg18_megatest.py megatest_tmp/

You can comment out a test at the bottom of the script.
myBase = PygrBuildNLMSAMegabase()
myTestBase = Build_Test()
myTestBase.test_seqdb()
myTestBase.test_collectionannot() # COMMENT OUT IF YOU DON'T WANT THIS
myTestBase.test_mysqlannot() # COMMENT OUT IF YOU DON'T WANT THIS

Original comment by deepr...@gmail.com on 31 Aug 2009 at 11:08

GoogleCodeExporter commented 8 years ago
Fixed a few bugs in the annotation_hg18_megatest.py above.

Original comment by deepr...@gmail.com on 31 Aug 2009 at 12:58

GoogleCodeExporter commented 8 years ago
The problem seems to be related to the way the SQL iterator currently works. Chris has identified the problem and is working on a fix right now so that it can make it into 0.8.0.

Original comment by mare...@gmail.com on 4 Sep 2009 at 8:58

GoogleCodeExporter commented 8 years ago
Switched __iter__ to use generic_iterator() with a new cursor, so it won't invoke keys() unless it's unable to allocate a new iterator (to guarantee query isolation).
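
To illustrate the general idea (this is not the code from the actual commit), here is a minimal sketch using the standard sqlite3 module. EagerTable and StreamingTable are hypothetical classes, not pygr's SQLTable: the first builds the full key list before iterating, while the second streams keys in batches from its own cursor and falls back to keys() only if no cursor can be allocated.

```python
import sqlite3


class EagerTable:
    """Hypothetical wrapper whose __iter__ materializes every key first."""

    def __init__(self, conn, name):
        self.conn = conn
        self.name = name  # trusted table name; interpolated for brevity only

    def keys(self):
        # fetchall() pulls the whole key column into one Python list.
        cur = self.conn.execute("SELECT id FROM %s ORDER BY id" % self.name)
        return [row[0] for row in cur.fetchall()]

    def __iter__(self):
        return iter(self.keys())


class StreamingTable(EagerTable):
    """Same table, but __iter__ streams keys from a dedicated cursor."""

    def __iter__(self):
        try:
            cursor = self.conn.cursor()  # separate cursor isolates this query
        except sqlite3.Error:
            return iter(self.keys())  # fallback: the old, memory-hungry path
        return self._stream(cursor)

    def _stream(self, cursor, batch=1000):
        cursor.execute("SELECT id FROM %s ORDER BY id" % self.name)
        try:
            while True:
                rows = cursor.fetchmany(batch)  # at most `batch` rows in memory
                if not rows:
                    break
                for row in rows:
                    yield row[0]
        finally:
            cursor.close()


def make_demo(n=5000):
    """Build an in-memory table with n integer keys for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE annot (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO annot VALUES (?)", ((i,) for i in range(n)))
    return conn
```

Both classes yield the same keys; the difference is that the streaming version holds only one batch of rows at a time, which is what matters when the table has millions of annotation rows.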

Original comment by cjlee...@gmail.com on 5 Sep 2009 at 5:54

GoogleCodeExporter commented 8 years ago
branch/repo/commit info?

Original comment by the.good...@gmail.com on 7 Sep 2009 at 12:08

GoogleCodeExporter commented 8 years ago
Ahh!  found it:
http://github.com/cjlee112/pygr/commit/62e1669aa88563085df70f33b81d6eeac6c17037

I still can't figure out what branch it's in, tho.  What's a good way to do that?

Original comment by the.good...@gmail.com on 7 Sep 2009 at 12:12

GoogleCodeExporter commented 8 years ago
Sorry this was not clear.  Everything had to be committed to master in order to force it to be included in our nightly megatests (which run on master), which I dearly wanted to see before tagging the final 0.8 release.  Marek is supposed to test the fix on the specific megatest that Namshin originally reported, but I haven't heard anything from him yet.

Original comment by cjlee...@gmail.com on 8 Sep 2009 at 6:35

GoogleCodeExporter commented 8 years ago

Original comment by cjlee...@gmail.com on 17 Dec 2010 at 11:41