bcbio / bcbio-nextgen

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
https://bcbio-nextgen.readthedocs.io
MIT License

gemini error #1533

Closed MengNiu closed 8 years ago

MengNiu commented 8 years ago

Hi,

I updated gemini as suggested in the related posts, but I am still getting the following error:

Thanks for the help!


Traceback (most recent call last):
  File "/home/mn/local/share/bcbio/anaconda/bin/gemini", line 6, in <module>
    gemini.gemini_main.main()
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1227, in main
    args.func(parser, args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 198, in load_fn
    gemini_load.load(parser, args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 48, in load
    load_multicore(args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 92, in load_multicore
    chunks = load_chunks_multicore(grabix_file, args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 258, in load_chunks_multicore
    wait_until_finished(procs)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 349, in wait_until_finished
    raise ValueError("Processing failed on GEMINI chunk load")
ValueError: Processing failed on GEMINI chunk load
Uncaught exception occurred
Traceback (most recent call last):
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
    _do_run(cmd, checks, log_stdout)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; /home/mn/local/share/bcbio/anaconda/bin/gemini load --passonly --skip-gerp-bp -v /home/mn/test/work/gemini/test11-varscan-decompose-effects.vcf.gz
Traceback (most recent call last):
  File "/home/mn/local/share/bcbio/anaconda/bin/gemini", line 6, in <module>
    gemini.gemini_main.main()
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1227, in main
    args.func(parser, args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 198, in load_fn
    gemini_load.load(parser, args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 48, in load
    load_multicore(args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 92, in load_multicore
    chunks = load_chunks_multicore(grabix_file, args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 258, in load_chunks_multicore
    wait_until_finished(procs)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 349, in wait_until_finished
    raise ValueError("Processing failed on GEMINI chunk load")
ValueError: Processing failed on GEMINI chunk load
' returned non-zero exit status 1

chapmanb commented 8 years ago

Thanks for the report and sorry about the issues. It looks like there is not enough information in the traceback you posted to diagnose the underlying issue. GEMINI is failing for some reason but the causative error must be higher up in the log file. You can try diagnosing by running the failing command:

 /home/mn/local/share/bcbio/anaconda/bin/gemini load --passonly --skip-gerp-bp -v /home/mn/test/work/gemini/test11-varscan-decompose-effects.vcf.gz test.db

This will hopefully provide some details around why GEMINI failed to help with debugging.
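Since the causative error tends to sit well above the final traceback, a small script can help locate the first occurrence of each exception type in a long log. The following is only a sketch: the function name `first_exceptions` and the regular expression are illustrative assumptions, not part of bcbio or GEMINI, and the pattern simply looks for names ending in `Error` or `Exception` followed by a colon, so it also copes with lines prefixed by a timestamp and host name.

```python
import re

# Matches exception names such as "KeyError: ..." anywhere on a line, so it
# also works when each log line starts with a timestamp and host name.
EXC_RE = re.compile(r"\b([A-Za-z_][\w.]*(?:Error|Exception)): (.*)")


def first_exceptions(lines):
    """Return (line_number, exception_name, message) tuples, one per
    exception type, for the first line on which each type appears."""
    seen = {}
    for number, line in enumerate(lines, start=1):
        match = EXC_RE.search(line)
        if match and match.group(1) not in seen:
            seen[match.group(1)] = (number, match.group(1), match.group(2).strip())
    # Sort by line number so the earliest error comes first.
    return sorted(seen.values())


if __name__ == "__main__":
    import sys

    # Usage: python first_exceptions.py bcbio-nextgen.log
    with open(sys.argv[1]) as handle:
        for number, name, message in first_exceptions(handle):
            print("line %d: %s: %s" % (number, name, message))
```

Lines such as `raise ValueError("...")` are skipped because the exception name there is followed by a parenthesis rather than a colon, which keeps the output focused on the lines where an exception was actually reported.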

MengNiu commented 8 years ago

Hi Brad,

Thanks for the reply! The output of the command listed is:


CADD scores are being loaded (to skip use:--skip-cadd).
pid 55207: 4462 variants processed.
pid 55207: 126902 skipped due to having the FILTER field set.
storing version, header, etc.
storing gene-detailed
storing gene-summary
updating gene-table
building indices


The very first error in bcbio-nextgen.log is below (it came from the job on node 3, ica3):

[2016-08-26T15:40Z] ica3: Uncaught exception occurred
Traceback (most recent call last):
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
    _do_run(cmd, checks, log_stdout)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; /home/mn/local/share/bcbio/anaconda/bin/gemini load --passonly --skip-gerp-bp -v /home/mn/Ccw/test/work/gemini/5961-ensemble-decompose-effects.vcf.gz -t snpEff --cores 16 --tempdir /home/mn/Ccw/test/work/gemini/tx/tmphiAnup /home/mn/Ccw/test/work/gemini/tx/tmphiAnup/5961-ensemble.db
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 201, in populate_from_vcf
    (variant, variant_impacts, extra_fields) = self._prepare_variation(var, anno_keys)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 489, in _prepare_variation
    gt_bases = var.gt_bases
  File "cyvcf2/cyvcf2.pyx", line 725, in cyvcf2.cyvcf2.Variant.gt_bases.get (cyvcf2/cyvcf2.c:19337)
IndexError: list index out of range
Traceback (most recent call last):
  File "/home/mn/local/bin/gemini", line 6, in <module>
    gemini.gemini_main.main()
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1227, in main
    args.func(parser, args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 298, in loadchunk_fn
    gemini_load_chunk.load(parser, args)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 843, in load
    gemini_loader.populate_from_vcf()
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 201, in populate_from_vcf
    (variant, variant_impacts, extra_fields) = self._prepare_variation(var, anno_keys)
  File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 489, in _prepare_variation
    gt_bases = var.gt_bases
  File "cyvcf2/cyvcf2.pyx", line 725, in cyvcf2.cyvcf2.Variant.gt_bases.get (cyvcf2/cyvcf2.c:19337)
IndexError: list index out of range

The first error in test1.log (test1.slurm is the SLURM file) shows that the job was on node 2 (ica2):

[2016-08-26T16:21Z] ica2:   File "/home/mn/local/bin/gemini", line 6, in <module>
[2016-08-26T16:21Z] ica2:     gemini.gemini_main.main()
[2016-08-26T16:21Z] ica2:   File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1227, in main
[2016-08-26T16:21Z] ica2:     args.func(parser, args)
[2016-08-26T16:21Z] ica2:   File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 298, in loadchunk_fn
[2016-08-26T16:21Z] ica2:     gemini_load_chunk.load(parser, args)
[2016-08-26T16:21Z] ica2:   File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 843, in load
[2016-08-26T16:21Z] ica2:     gemini_loader.populate_from_vcf()
[2016-08-26T16:21Z] ica2:   File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 201, in populate_from_vcf
[2016-08-26T16:21Z] ica2:     (variant, variant_impacts, extra_fields) = self._prepare_variation(var, anno_keys)
[2016-08-26T16:21Z] ica2:   File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 525, in _prepare_variation
[2016-08-26T16:21Z] ica2:     impact_so=impact.so, impact_severity=impact.effect_severity,
[2016-08-26T16:21Z] ica2:   File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 331, in effect_severity
[2016-08-26T16:21Z] ica2:     return self.impact_severity
[2016-08-26T16:21Z] ica2:   File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 365, in impact_severity
[2016-08-26T16:21Z] ica2:     return ['xxx', 'LOW', 'MED', 'HIGH'][self.severity]
[2016-08-26T16:21Z] ica2:   File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 356, in severity
[2016-08-26T16:21Z] ica2:     v = max(lookup[sev[csq]] for csq in self.consequences)
[2016-08-26T16:21Z] ica2:   File "/home/mn/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 356, in <genexpr>
[2016-08-26T16:21Z] ica2:     v = max(lookup[sev[csq]] for csq in self.consequences)
[2016-08-26T16:21Z] ica2: KeyError: 'gene_fusion'

chapmanb commented 8 years ago

Thanks for the details. To fix these, could you try doing:

bcbio_conda install -c bioconda geneimpacts cyvcf2

That should get you geneimpacts 0.1.4_1 and cyvcf2 0.5.3_0. I just pushed a fix for the geneimpacts problem and the cyvcf2 problem was fixed a while back (https://github.com/brentp/cyvcf2/pull/14) but you might still have an old version.
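For readers wondering why an unfamiliar annotation term can abort the whole load, the `KeyError: 'gene_fusion'` traceback points at a dictionary lookup inside `max(...)`. The sketch below reproduces that failure mode in isolation. The `SEVERITY` table, its contents, and both function names are hypothetical stand-ins; the real table in geneimpacts is not shown in this thread, and this is not necessarily how the geneimpacts fix was implemented.

```python
# Hypothetical severity table; the real one in geneimpacts is not shown here.
SEVERITY = {"synonymous_variant": 1, "missense_variant": 2, "stop_gained": 3}

# Label list as it appears in the traceback above.
LABELS = ["xxx", "LOW", "MED", "HIGH"]


def strict_severity(consequences):
    """Plain dict lookup inside max(): any unknown term raises KeyError."""
    return LABELS[max(SEVERITY[csq] for csq in consequences)]


def tolerant_severity(consequences, default=1):
    """Same idea, but unknown terms fall back to `default` instead of failing."""
    return LABELS[max(SEVERITY.get(csq, default) for csq in consequences)]


if __name__ == "__main__":
    try:
        strict_severity(["gene_fusion"])
    except KeyError as exc:
        print("strict lookup failed on", exc)
    print(tolerant_severity(["gene_fusion", "stop_gained"]))
```

Whether a silent default or an explicit table entry is the better choice depends on the tool; a default keeps the load running, while an explicit entry avoids misclassifying a new term.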

After this you should be able to re-start your analysis from where it finished and hopefully things will finish successfully.
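Before restarting, it may be worth confirming which versions of the two packages the bcbio Python actually sees. This is a minimal sketch using the standard library's `importlib.metadata`, which needs Python 3.8 or later; the installation in this thread runs Python 2.7, where `pkg_resources.get_distribution(name).version` would be the rough equivalent. The helper name `installed_version` is illustrative, and the trailing `_1` and `_0` in the versions quoted above are assumed to be conda build numbers that will not appear in the reported version string.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string for a distribution, or None if missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the install command in this thread.
    for name in ("geneimpacts", "cyvcf2"):
        print(name, installed_version(name))
```

Run it with the Python interpreter from the bcbio anaconda directory rather than the system Python, otherwise it reports on a different environment.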

MengNiu commented 8 years ago

Hi Brad,

Thanks for the fix, it finished successfully!

Meng