broadinstitute / oncotator


Bio.Data.CodonTable.TranslationError #351

Open ycl6 opened 8 years ago

ycl6 commented 8 years ago

oncotator encounters an error when parsing a VCF file generated with the GATK pipeline.

oncotator v1.9.0.0

2016-06-03 15:35:58,631 INFO [oncotator.utils.SampleNameSelector:90] Sample name is in the sample_name column.
2016-06-03 15:36:19,675 INFO [oncotator.output.OutputDataManager:194] Wrote 1000 mutations to tsv.
2016-06-03 15:36:34,628 INFO [oncotator.output.OutputDataManager:194] Wrote 2000 mutations to tsv.
2016-06-03 15:36:50,829 INFO [oncotator.output.OutputDataManager:194] Wrote 3000 mutations to tsv.
2016-06-03 15:37:06,749 INFO [oncotator.output.OutputDataManager:194] Wrote 4000 mutations to tsv.
Traceback (most recent call last):
  File "/home/ycl6/.local/bin/oncotator", line 9, in <module>
    load_entry_point('Oncotator==v1.9.0.0', 'console_scripts', 'oncotator')()
  File "build/bdist.linux-x86_64/egg/oncotator/Oncotator.py", line 309, in main
  File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 437, in annotate
  File "build/bdist.linux-x86_64/egg/oncotator/output/VcfOutputRenderer.py", line 119, in renderMutations
  File "build/bdist.linux-x86_64/egg/oncotator/output/OutputDataManager.py", line 92, in __init__
  File "build/bdist.linux-x86_64/egg/oncotator/output/OutputDataManager.py", line 160, in _writeMuts2Tsv
  File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 448, in _applyManualAnnotations
  File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 456, in _applyDefaultAnnotations
  File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 519, in _annotate_mutations_using_datasources
  File "build/bdist.linux-x86_64/egg/oncotator/Annotator.py", line 88, in _annotate_mut
  File "build/bdist.linux-x86_64/egg/oncotator/datasources/EnsemblTranscriptDatasource.py", line 195, in annotate_mutation
  File "build/bdist.linux-x86_64/egg/oncotator/datasources/EnsemblTranscriptDatasource.py", line 253, in _choose_transcript
  File "build/bdist.linux-x86_64/egg/oncotator/datasources/EnsemblTranscriptDatasource.py", line 414, in _choose_canonical_transcript
  File "build/bdist.linux-x86_64/egg/oncotator/datasources/EnsemblTranscriptDatasource.py", line 276, in _select_best_with_multiple_criteria
  File "build/bdist.linux-x86_64/egg/oncotator/datasources/EnsemblTranscriptDatasource.py", line 258, in _get_best_scores
  File "build/bdist.linux-x86_64/egg/oncotator/datasources/EnsemblTranscriptDatasource.py", line 258, in <dictcomp>
  File "build/bdist.linux-x86_64/egg/oncotator/datasources/EnsemblTranscriptDatasource.py", line 412, in <lambda>
  File "build/bdist.linux-x86_64/egg/oncotator/datasources/EnsemblTranscriptDatasource.py", line 335, in _calculate_effect_score
  File "build/bdist.linux-x86_64/egg/oncotator/utils/VariantClassifier.py", line 445, in variant_classify
  File "build/bdist.linux-x86_64/egg/oncotator/utils/VariantClassifier.py", line 313, in _determine_vc_for_cds_overlap
  File "build/bdist.linux-x86_64/egg/oncotator/utils/MutUtils.py", line 460, in translate_sequence
  File "/home/ycl6/.local/lib/python2.7/site-packages/biopython-1.66-py2.7-linux-x86_64.egg/Bio/Seq.py", line 2149, in translate
    return _translate_str(sequence, codon_table, stop_symbol, to_stop, cds)
  File "/home/ycl6/.local/lib/python2.7/site-packages/biopython-1.66-py2.7-linux-x86_64.egg/Bio/Seq.py", line 2060, in _translate_str
    "Codon '{0}' is invalid".format(codon))
Bio.Data.CodonTable.TranslationError: Codon '*GG' is invalid

ycl6 commented 8 years ago

I found that this is caused by spanning deletions, which are denoted by an asterisk (*) in the ALT column. Is there a way around this besides manually removing every * allele from the VCF file?

LeeTL1220 commented 8 years ago

Unfortunately, there is not. Spanning deletions are not supported in Oncotator (even in 1.9.0.0).
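
Since the only option is to remove the * alleles before running Oncotator, the removal can at least be scripted rather than done by hand. The following is a minimal sketch, not part of Oncotator: the function names `drop_spanning_deletions` and `filter_vcf` are hypothetical, and it assumes an uncompressed, tab-delimited VCF. It drops every record whose ALT column lists `*`, which also discards any other alternate alleles at the same multi-allelic site.

```python
def drop_spanning_deletions(lines):
    """Yield VCF lines, skipping records whose ALT column lists the '*' allele.

    Hypothetical helper, not part of Oncotator. Header lines (starting
    with '#') are passed through unchanged.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # ALT is the fifth tab-separated column; multiple alleles are comma-separated.
        alts = fields[4].split(",") if len(fields) > 4 else []
        if "*" in alts:
            # Whole record is dropped, including any non-'*' alleles it carries.
            continue
        yield line


def filter_vcf(in_path, out_path):
    """Write a copy of the VCF at in_path to out_path without '*' records."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in drop_spanning_deletions(src):
            dst.write(line)
```

For example, `filter_vcf("input.vcf", "input.no_spanning_del.vcf")` (placeholder file names) would produce a file to pass to Oncotator. Keeping the remaining alleles at multi-allelic sites would require rewriting the genotype and per-allele INFO fields, which this sketch does not attempt.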
