brentp / geneimpacts

prioritize effects of variant annotations from VEP, SnpEff, et al.
MIT License

gene_fusion added as new impact from snpEff #10

Closed anastazie closed 7 years ago

anastazie commented 7 years ago

Hi, thanks for the tool! This PR adds one impact, 'gene_fusion'. The missing entry caused a KeyError during gemini database creation with bcbio-nextgen.

Cheers, Nastia

brentp commented 7 years ago

thanks for the PR.

can you post the full traceback that you saw? and if you have an example variant or vcf, that'd be great. I want an unknown impact to produce a prominent error message rather than an exception.
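A minimal sketch of that idea, not the actual geneimpacts code: the table below is a small hypothetical stand-in for the real severity lookup, and falling back to LOW for an unknown consequence is an assumption made only for illustration. An unrecognised consequence is reported on stderr and the load continues instead of dying with a KeyError.

```python
import sys

# Hypothetical, heavily shortened severity table (assumption for illustration;
# the real table in geneimpacts has many more entries).
SEVERITY = {
    "stop_gained": "HIGH",
    "missense_variant": "MED",
    "synonymous_variant": "LOW",
}
RANK = {"LOW": 1, "MED": 2, "HIGH": 3}
LABELS = ["xxx", "LOW", "MED", "HIGH"]


def severity(consequences, default="LOW", stream=sys.stderr):
    """Return the highest severity rank among the given consequences.

    Unknown consequences are reported on `stream` and ranked as `default`
    (an assumed fallback) instead of raising KeyError.
    """
    ranks = []
    for csq in consequences:
        label = SEVERITY.get(csq)
        if label is None:
            stream.write(
                "WARNING: unknown consequence %r; treating as %s\n" % (csq, default)
            )
            label = default
        ranks.append(RANK[label])
    # An empty consequence list also falls back to the default rank.
    return max(ranks) if ranks else RANK[default]


def impact_severity(consequences):
    """Map the numeric rank back to its label."""
    return LABELS[severity(consequences)]
```

With this shape, `impact_severity(["gene_fusion"])` prints a warning and returns a label rather than aborting the whole chunk load. Whether LOW is the right fallback for an unseen effect is a separate decision.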

anastazie commented 7 years ago

Hi, sure, below is the part of the log with the error. I was running bcbio-nextgen on multiple samples and only the third one ended with an error.

[2016-08-14T15:37Z] tabix index 1266-16-freebayes-decompose.vcf.gz
[2016-08-14T15:37Z] snpEff effects : 1266-16
[2016-08-14T15:38Z] tabix index 1266-16-freebayes-decompose-effects.vcf.gz
[2016-08-14T15:38Z] Create gemini database for /stgdata/bcbio/output/FirstRun-merged/work/gemini/1266-16-freebayes-decompose-effects.vcf.gz : 1266-16
[2016-08-14T15:39Z] Traceback (most recent call last):
[2016-08-14T15:39Z]   File "/usr/local/bin/gemini", line 6, in <module>
[2016-08-14T15:39Z]     gemini.gemini_main.main()
[2016-08-14T15:39Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1227, in main
[2016-08-14T15:39Z]     args.func(parser, args)
[2016-08-14T15:39Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 298, in loadchunk_fn
[2016-08-14T15:39Z]     gemini_load_chunk.load(parser, args)
[2016-08-14T15:39Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 843, in load
[2016-08-14T15:39Z]     gemini_loader.populate_from_vcf()
[2016-08-14T15:39Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 201, in populate_from_vcf
[2016-08-14T15:39Z]     (variant, variant_impacts, extra_fields) = self._prepare_variation(var, anno_keys)
[2016-08-14T15:39Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 525, in _prepare_variation
[2016-08-14T15:39Z]     impact_so=impact.so, impact_severity=impact.effect_severity,
[2016-08-14T15:39Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 330, in effect_severity
[2016-08-14T15:39Z]     return self.impact_severity
[2016-08-14T15:39Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 364, in impact_severity
[2016-08-14T15:39Z]     return ['xxx', 'LOW', 'MED', 'HIGH'][self.severity]
[2016-08-14T15:39Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 355, in severity
[2016-08-14T15:39Z]     v = max(lookup[sev[csq]] for csq in self.consequences)
[2016-08-14T15:39Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 355, in <genexpr>
[2016-08-14T15:39Z]     v = max(lookup[sev[csq]] for csq in self.consequences)
[2016-08-14T15:39Z] KeyError: 'gene_fusion'
[2016-08-14T15:39Z] pid 71342: 9999 variants processed.
[2016-08-14T15:40Z] pid 71283: 9999 variants processed.
[2016-08-14T15:40Z] pid 71292: 9999 variants processed.
[2016-08-14T15:40Z] pid 71289: 9999 variants processed.
[2016-08-14T15:41Z] pid 71342: 17592 variants processed.
[2016-08-14T15:41Z] pid 71342: 124 skipped due to having the FILTER field set.
[2016-08-14T15:41Z] pid 71283: 17646 variants processed.
[2016-08-14T15:41Z] pid 71283: 66 skipped due to having the FILTER field set.
[2016-08-14T15:41Z] pid 71292: 17642 variants processed.
[2016-08-14T15:41Z] pid 71292: 70 skipped due to having the FILTER field set.
[2016-08-14T15:41Z] pid 71289: 17644 variants processed.
[2016-08-14T15:41Z] pid 71289: 68 skipped due to having the FILTER field set.
[2016-08-14T15:41Z] Indexing /stgdata/bcbio/output/FirstRun-merged/work/gemini/1266-16-freebayes-decompose-effects.vcf.gz with grabix.
[2016-08-14T15:41Z] Loading 88564 variants.
[2016-08-14T15:41Z] Breaking /stgdata/bcbio/output/FirstRun-merged/work/gemini/1266-16-freebayes-decompose-effects.vcf.gz into 5 chunks.
[2016-08-14T15:41Z] Loading chunk 0.
[2016-08-14T15:41Z] Loading chunk 1.
[2016-08-14T15:41Z] Loading chunk 2.
[2016-08-14T15:41Z] Loading chunk 3.
[2016-08-14T15:41Z] Loading chunk 4.
[2016-08-14T15:41Z] Traceback (most recent call last):
[2016-08-14T15:41Z]   File "/usr/local/share/bcbio/anaconda/bin/gemini", line 6, in <module>
[2016-08-14T15:41Z]     gemini.gemini_main.main()
[2016-08-14T15:41Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1227, in main
[2016-08-14T15:41Z]     args.func(parser, args)
[2016-08-14T15:41Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 198, in load_fn
[2016-08-14T15:41Z]     gemini_load.load(parser, args)
[2016-08-14T15:41Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 48, in load
[2016-08-14T15:41Z]     load_multicore(args)
[2016-08-14T15:41Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 92, in load_multicore
[2016-08-14T15:41Z]     chunks = load_chunks_multicore(grabix_file, args)
[2016-08-14T15:41Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 258, in load_chunks_multicore
[2016-08-14T15:41Z]     wait_until_finished(procs)
[2016-08-14T15:41Z]   File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 349, in wait_until_finished
[2016-08-14T15:41Z]     raise ValueError("Processing failed on GEMINI chunk load")
[2016-08-14T15:41Z] ValueError: Processing failed on GEMINI chunk load
[2016-08-14T15:41Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
    _do_run(cmd, checks, log_stdout)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; /usr/local/share/bcbio/anaconda/bin/gemini  load  --passonly --skip-cadd --skip-gerp-bp  -v /stgdata/bcbio/output/FirstRun-merged/work/gemini/1266-16-freebayes-decompose-effects.vcf.gz -t snpEff --cores 5 --tempdir /stgdata/bcbio/output/FirstRun-merged/work/gemini/tx/tmpPKQXIO /stgdata/bcbio/output/FirstRun-merged/work/gemini/tx/tmpPKQXIO/1266-16-freebayes.db
Traceback (most recent call last):
  File "/usr/local/bin/gemini", line 6, in <module>
    gemini.gemini_main.main()
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1227, in main
    args.func(parser, args)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 298, in loadchunk_fn
    gemini_load_chunk.load(parser, args)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 843, in load
    gemini_loader.populate_from_vcf()
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 201, in populate_from_vcf
    (variant, variant_impacts, extra_fields) = self._prepare_variation(var, anno_keys)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 525, in _prepare_variation
    impact_so=impact.so, impact_severity=impact.effect_severity,
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 330, in effect_severity
    return self.impact_severity
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 364, in impact_severity
    return ['xxx', 'LOW', 'MED', 'HIGH'][self.severity]
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 355, in severity
    v = max(lookup[sev[csq]] for csq in self.consequences)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/geneimpacts/effect.py", line 355, in <genexpr>
    v = max(lookup[sev[csq]] for csq in self.consequences)
KeyError: 'gene_fusion'
pid 71342: 9999 variants processed.
pid 71283: 9999 variants processed.
pid 71292: 9999 variants processed.
pid 71289: 9999 variants processed.
pid 71342: 17592 variants processed.
pid 71342: 124 skipped due to having the FILTER field set.
pid 71283: 17646 variants processed.
pid 71283: 66 skipped due to having the FILTER field set.
pid 71292: 17642 variants processed.
pid 71292: 70 skipped due to having the FILTER field set.
pid 71289: 17644 variants processed.
pid 71289: 68 skipped due to having the FILTER field set.
Indexing /stgdata/bcbio/output/FirstRun-merged/work/gemini/1266-16-freebayes-decompose-effects.vcf.gz with grabix.
Loading 88564 variants.
Breaking /stgdata/bcbio/output/FirstRun-merged/work/gemini/1266-16-freebayes-decompose-effects.vcf.gz into 5 chunks.
Loading chunk 0.
Loading chunk 1.
Loading chunk 2.
Loading chunk 3.
Loading chunk 4.
Traceback (most recent call last):
  File "/usr/local/share/bcbio/anaconda/bin/gemini", line 6, in <module>
    gemini.gemini_main.main()
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1227, in main
    args.func(parser, args)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 198, in load_fn
    gemini_load.load(parser, args)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 48, in load
    load_multicore(args)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 92, in load_multicore
    chunks = load_chunks_multicore(grabix_file, args)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 258, in load_chunks_multicore
    wait_until_finished(procs)
  File "/usr/local/share/bcbio/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 349, in wait_until_finished
    raise ValueError("Processing failed on GEMINI chunk load")
ValueError: Processing failed on GEMINI chunk load
' returned non-zero exit status 1

brentp commented 7 years ago

thank you.