HenryLeongStat closed this issue 7 years ago
After the first run, I interrupted it with Ctrl + C and then ran it again in the same folder. This time it started processing without error messages like the following:
It works the second time (after the error message showed up the first time), and the result should be the same as in the sqlite test:
According to the output, quite a lot of rows in the results differ between hdf5 and sqlite.
It looks like the sqlite data has fewer variants than the hdf5 data. Perhaps the hdf5 version did not implement the variant filters by missingness or something?
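To narrow down where the hdf5 and sqlite results diverge, something like the following sketch might help. It assumes each run's result table was saved as a tab-separated text file with a header row and the group name in the first column. The function names are made up for illustration and are not part of vtools.

```python
import csv


def load_results(path, key_column=0):
    """Read a tab-separated result table into a {group_name: row} dict.

    Assumption: the first line is a header and the key column holds the
    group name (e.g. the gene name used for --group_by).
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])
        rows = {row[key_column]: row for row in reader if row}
    return header, rows


def compare_results(path_a, path_b):
    """Return (groups only in A, groups only in B, shared groups whose rows differ)."""
    _, rows_a = load_results(path_a)
    _, rows_b = load_results(path_b)
    only_a = sorted(set(rows_a) - set(rows_b))
    only_b = sorted(set(rows_b) - set(rows_a))
    # Rows are compared as raw strings, so a formatting-only difference
    # in a number is also reported as a difference.
    differing = sorted(
        name for name in set(rows_a) & set(rows_b) if rows_a[name] != rows_b[name]
    )
    return only_a, only_b, differing
```

Calling `compare_results` on the two saved tables would list the groups that appear in only one result and the groups whose rows differ, which may show whether the mismatch comes from a filter being skipped or from something else.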
Error messages still show up after I compiled successfully on the Linux server:
mleong@q1prpfs04:~/VariantTools/j10htestVarianttools$ vtools associate variant smoking --discard_variants "%(NA)>0.1" --HDF --method "BurdenBt --name BurdenTest --alternative 2" --group_by refGene.name2 -j 10 --force -v 2 > asso_test_j10_hdf5.log
DEBUG:
DEBUG: associate variant smoking --discard_variants %(NA)>0.1 --HDF --method "BurdenBt --name BurdenTest --alternative 2" --group_by refGene.name2 -j 10 --force -v 2
DEBUG: Using temporary directory /tmp/tmpek6917wr/_tmp_628095
DEBUG: Select phenotype and covariates using query SELECT sample_id, sample_name, smoking FROM sample LEFT OUTER JOIN filename ON sample.file_id = filename.file_id WHERE smoking IS NOT NULL
INFO: 2504 samples are found
DEBUG: Running query INSERT INTO __asso_tmp SELECT DISTINCT variant.variant_id, 0, refGene.refGene.name2 FROM variant, refGene.__rng_refGene_hg19_chr_txStart_txEnd, refGene.refGene WHERE (variant.bin = refGene.__rng_refGene_hg19_chr_txStart_txEnd.bin AND variant.chr = refGene.__rng_refGene_hg19_chr_txStart_txEnd.chr AND variant.pos >= refGene.__rng_refGene_hg19_chr_txStart_txEnd.start AND variant.pos <= refGene.__rng_refGene_hg19_chr_txStart_txEnd.end ) AND (refGene.refGene.rowid = refGene.__rng_refGene_hg19_chr_txStart_txEnd.range_id);
INFO: Grouping variants by 'refGene.name2', please be patient ...
INFO: 573 groups are found
Process GroupHDFGenerator-3:
Process GroupHDFGenerator-9:
Process GroupHDFGenerator-4:
Process GroupHDFGenerator-5:
Process GroupHDFGenerator-8:
Traceback (most recent call last):
File "/home/mleong/anaconda3/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/home/mleong/anaconda3/lib/python3.6/site-packages/variant_tools-3.0.0.dev0-py3.6-linux-x86_64.egg/variant_tools/association_hdf5.py", line 78, in run
for row in cur.execute(select_group):
Traceback (most recent call last):
vt_sqlite3.OperationalError: unable to open database file
File "/home/mleong/anaconda3/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/home/mleong/anaconda3/lib/python3.6/site-packages/variant_tools-3.0.0.dev0-py3.6-linux-x86_64.egg/variant_tools/association_hdf5.py", line 78, in run
for row in cur.execute(select_group):
Traceback (most recent call last):
vt_sqlite3.OperationalError: unable to open database file
File "/home/mleong/anaconda3/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/home/mleong/anaconda3/lib/python3.6/site-packages/variant_tools-3.0.0.dev0-py3.6-linux-x86_64.egg/variant_tools/association_hdf5.py", line 78, in run
for row in cur.execute(select_group):
Traceback (most recent call last):
vt_sqlite3.OperationalError: unable to open database file
File "/home/mleong/anaconda3/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/home/mleong/anaconda3/lib/python3.6/site-packages/variant_tools-3.0.0.dev0-py3.6-linux-x86_64.egg/variant_tools/association_hdf5.py", line 78, in run
for row in cur.execute(select_group):
vt_sqlite3.OperationalError: unable to open database file
Traceback (most recent call last):
File "/home/mleong/anaconda3/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/home/mleong/anaconda3/lib/python3.6/site-packages/variant_tools-3.0.0.dev0-py3.6-linux-x86_64.egg/variant_tools/association_hdf5.py", line 78, in run
for row in cur.execute(select_group):
vt_sqlite3.OperationalError: unable to open database file
^CTraceback (most recent call last):
File "/home/mleong/anaconda3/lib/python3.6/site-packages/variant_tools-3.0.0.dev0-py3.6-linux-x86_64.egg/variant_tools/association.py", line 1202, in associate
generateHDFbyGroup(asso,nJobs)
File "/home/mleong/anaconda3/lib/python3.6/site-packages/variant_tools-3.0.0.dev0-py3.6-linux-x86_64.egg/variant_tools/association_hdf5.py", line 128, in generateHDFbyGroup
groupHDFGenerator.join()
File "/home/mleong/anaconda3/lib/python3.6/multiprocessing/process.py", line 121, in join
res = self._popen.wait(timeout)
File "/home/mleong/anaconda3/lib/python3.6/multiprocessing/popen_fork.py", line 51, in wait
return self.poll(os.WNOHANG if timeout == 0.0 else 0)
File "/home/mleong/anaconda3/lib/python3.6/multiprocessing/popen_fork.py", line 29, in poll
pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
ref: #50
For the test where I used -j10 to import, the error messages show up when I use -j10 to do the association test. However, it works OK when I use -j4 to do the association test.
And when I ran the association test with -j4 again, the error showed up this time:
mleong@q1prpfs04:~/VariantTools/j10htestVarianttools$ vtools associate variant smoking --discard_variants "%(NA)>0.1" --HDF --method "BurdenBt --name BurdenTest --alternative 2" --group_by refGene.name2 -j 4 --force -v 2 > asso_test_j10_hdf5.log
DEBUG:
DEBUG: associate variant smoking --discard_variants %(NA)>0.1 --HDF --method "BurdenBt --name BurdenTest --alternative 2" --group_by refGene.name2 -j 4 --force -v 2
DEBUG: Using temporary directory /tmp/tmpsbdvovy5/_tmp_546906
DEBUG: Select phenotype and covariates using query SELECT sample_id, sample_name, smoking FROM sample LEFT OUTER JOIN filename ON sample.file_id = filename.file_id WHERE smoking IS NOT NULL
INFO: 2504 samples are found
DEBUG: Running query INSERT INTO __asso_tmp SELECT DISTINCT variant.variant_id, 0, refGene.refGene.name2 FROM variant, refGene.__rng_refGene_hg19_chr_txStart_txEnd, refGene.refGene WHERE (variant.bin = refGene.__rng_refGene_hg19_chr_txStart_txEnd.bin AND variant.chr = refGene.__rng_refGene_hg19_chr_txStart_txEnd.chr AND variant.pos >= refGene.__rng_refGene_hg19_chr_txStart_txEnd.start AND variant.pos <= refGene.__rng_refGene_hg19_chr_txStart_txEnd.end ) AND (refGene.refGene.rowid = refGene.__rng_refGene_hg19_chr_txStart_txEnd.range_id);
INFO: Grouping variants by 'refGene.name2', please be patient ...
INFO: 573 groups are found
Process GroupHDFGenerator-3:
Traceback (most recent call last):
File "/home/mleong/anaconda3/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/home/mleong/anaconda3/lib/python3.6/site-packages/variant_tools-3.0.0.dev0-py3.6-linux-x86_64.egg/variant_tools/association_hdf5.py", line 78, in run
for row in cur.execute(select_group):
vt_sqlite3.OperationalError: unable to open database file
It shows up randomly!?
Fixed by Dr. Ma's patch 7d163e2.
I will keep testing with different numbers of jobs.
I was trying to run the association test, but the error showed up...
Note: the error only shows up on the Linux server; it doesn't show up on my laptop.
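For anyone hitting the same intermittent `unable to open database file` error from parallel worker processes, one generic workaround is to retry opening the database a few times before giving up. The sketch below uses Python's standard `sqlite3` module rather than `vt_sqlite3`. It is not what patch 7d163e2 does, and `execute_with_retry` is a hypothetical helper, not a vtools function.

```python
import sqlite3
import time


def execute_with_retry(db_path, query, retries=5, delay=0.5):
    """Open db_path, run query, and return all rows.

    Retries on sqlite3.OperationalError (the error class raised for
    "unable to open database file"), waiting a little longer after
    each failed attempt. Re-raises the last error if every attempt fails.
    """
    last_error = None
    for attempt in range(retries):
        try:
            conn = sqlite3.connect(db_path)
            try:
                return conn.execute(query).fetchall()
            finally:
                conn.close()
        except sqlite3.OperationalError as exc:
            last_error = exc
            time.sleep(delay * (attempt + 1))
    raise last_error
```

A retry only hides a transient failure. If the error persists across retries, the cause is more likely a path, permission, or resource-limit problem on the server than a timing issue.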