openvax / topiary

Predict mutated T-cell epitopes from sequencing data
Apache License 2.0

--skip-variant-errors doesn't always work #10

Open kippakers opened 9 years ago

kippakers commented 9 years ago

I'm getting a crash when my VCF has an N in the ALT allele, even with --skip-variant-errors:

Command:

topiary --mhc-cons --vcf ./test.vcf --output-csv 12n_test.csv --mhc-alleles B46:01 --ic50-cutoff 500 --mhc-epitope-lengths 9 --skip-variant-errors

VCF line:

1   2524333 COSM425966  T   TN  .   .   GENE=MMEL1;STRAND=-;GENE=MMEL1;STRAND=-;CDS=c.1912_1913insG;AA=p.Q638fs*34;CNT=1

Output:

Topiary commandline arguments:
Namespace(ic50_cutoff=500.0, keep_wildtype_epitopes=False, maf=[], mhc_alleles='B46:01', mhc_alleles_file=None, mhc_cons=True, mhc_cons_iedb=False, mhc_epitope_lengths=[9], mhc_pan=False, mhc_pan_iedb=False, mhc_random=False, mhc_smm=False, mhc_smm_iedb=False, mhc_smm_pmbec=False, mhc_smm_pmbec_iedb=False, output_csv='12n_test.csv', output_html=None, padding_around_mutation=0, percentile_cutoff=None, rna_gene_fpkm_file=None, rna_min_gene_expression=0.0, rna_min_transcript_expression=0.0, rna_remap_novel_genes_onto_ensembl=False, rna_transcript_fpkm_file=None, self_filter_directory=None, skip_variant_errors=True, vcf=['./test.vcf'])
INFO:root:Building MHC binding prediction type for alleles ['HLA-B*46:01'] and epitope lengths [9]
INFO:root:netMHCcons finished with return code 0
INFO:root:netMHCcons took 0.0284 seconds
Traceback (most recent call last):
  File "/hpc/users/akersn01/.local/bin/topiary", line 5, in <module>
    pkg_resources.run_script('topiary==0.0.6', 'topiary')
  File "/hpc/packages/minerva-common/py_packages/2.7/lib/python2.7/site-packages/distribute-0.6.10-py2.7.egg/pkg_resources.py", line 461, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/hpc/packages/minerva-common/py_packages/2.7/lib/python2.7/site-packages/distribute-0.6.10-py2.7.egg/pkg_resources.py", line 1194, in run_script
    execfile(script_filename, namespace, namespace)
  File "/hpc/users/akersn01/.local/lib/python2.7/site-packages/topiary-0.0.6-py2.7.egg/EGG-INFO/scripts/topiary", line 99, in <module>
    main()
  File "/hpc/users/akersn01/.local/lib/python2.7/site-packages/topiary-0.0.6-py2.7.egg/EGG-INFO/scripts/topiary", line 59, in main
    variants = variant_collection_from_args(args)
  File "/hpc/users/akersn01/.local/lib/python2.7/site-packages/topiary-0.0.6-py2.7.egg/topiary/args.py", line 59, in variant_collection_from_args
    variant_collections.append(varcode.load_vcf(vcf_path))
  File "/hpc/users/akersn01/.local/lib/python2.7/site-packages/varcode/vcf.py", line 95, in load_vcf
    ensembl=ensembl)
  File "/hpc/users/akersn01/.local/lib/python2.7/site-packages/varcode/variant.py", line 106, in __init__
    allow_extended_nucleotides=allow_extended_nucleotides)
  File "/hpc/users/akersn01/.local/lib/python2.7/site-packages/varcode/nucleotides.py", line 91, in normalize_nucleotide_string
    "Invalid character in nucleotide string: %s" % letter)
ValueError: Invalid character in nucleotide string: N
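
The traceback points at varcode.load_vcf, so the ValueError is raised while the VCF is being parsed, presumably before --skip-variant-errors has any effect. One possible workaround is to pre-filter the VCF and drop records whose REF or ALT contains anything other than A, C, G or T. A minimal sketch, assuming an uncompressed, tab-delimited VCF; `filter_vcf_lines` is a hypothetical helper and not part of topiary or varcode:

```python
VALID_BASES = frozenset("ACGT")


def filter_vcf_lines(lines):
    """Yield header lines plus records whose REF and ALT use only A/C/G/T.

    Assumption: anything else (N, other IUPAC codes, '.', '*', symbolic
    alleles such as <DEL>) should be dropped rather than repaired.
    """
    for line in lines:
        if line.startswith("#"):
            yield line  # keep meta-information and the column header
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # too few columns to hold REF and ALT; skip
        ref, alt = fields[3], fields[4]
        alleles = [ref] + alt.split(",")  # ALT may list several alleles
        if all(a and VALID_BASES.issuperset(a.upper()) for a in alleles):
            yield line


if __name__ == "__main__":
    import sys

    # Reads a VCF on stdin and writes the filtered VCF to stdout.
    sys.stdout.writelines(filter_vcf_lines(sys.stdin))
```

Saved as a script, this could be run with the original VCF on stdin and the filtered output redirected to a new file, which is then passed to --vcf. It silently discards the offending records, so it is worth counting how many lines were removed.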

kippakers commented 9 years ago

Similarly, this error still crashes the run even with --skip-variant-errors:

netMHC-3.3 produced an error

ERROR: Illegal character 'U' in sequence Entry_8 

I've found a decent number of Us and Bs in the reference, and they cause netMHC to die outright.
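
One way to sidestep this is to drop any candidate peptide that contains a letter outside the 20 standard amino acids before it is handed to the predictor. The sketch below shows the idea on a single protein sequence. `valid_kmers` is a hypothetical helper, not a topiary function, and it assumes that every non-standard letter (U, B, X, Z and so on) should be rejected:

```python
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def valid_kmers(sequence, k=9):
    """Return the k-mers of `sequence` built only from standard amino acids.

    Windows that overlap a non-standard letter such as U or B are skipped,
    so the rest of the protein can still be scored.
    """
    sequence = sequence.upper()
    kmers = []
    for start in range(len(sequence) - k + 1):
        peptide = sequence[start:start + k]
        if STANDARD_AMINO_ACIDS.issuperset(peptide):
            kmers.append(peptide)
    return kmers


# The trailing U only invalidates the windows that contain it.
print(valid_kmers("ACDEFGHIKLU", k=9))  # ['ACDEFGHIK', 'CDEFGHIKL']
```

Filtering at the peptide level rather than discarding the whole protein keeps epitopes from the unaffected parts of the sequence.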