openvax / topiary

Predict mutated T-cell epitopes from sequencing data
Apache License 2.0

Topiary w/ NetMHC Fails #41

Open JPFinnigan opened 8 years ago

JPFinnigan commented 8 years ago

Hey Guys,

I came across what may or may not be a bug in Topiary. It shows up only when NetMHC itself is the predictor, not with other predictors that in turn call NetMHC (e.g. NetMHCcons). I ran:

jpf-mbp:test johnfinnigan$ python /Library/Frameworks/Python.framework/Versions/2.7/bin/topiary \
> --vcf ~/Desktop/TEMP/Results/WES/Tumor_B16_F10_0810/ISMMS/VCF/Strelka/results/passed.somatic.indels.vcf \
> --mhc-predictor netmhc \
> --mhc-alleles H-2-Kb,H-2-Db \
> --mhc-epitope-lengths 8,9,10,11 \
> --ic50-cutoff 500 \
> --rna-transcript-fpkm-gtf-file ~/Desktop/TEMP/Results/RNA/Tumor_B16.F10/ISMMS/Tumor_B16.F10_0810.127A/GTF/StringTie/HISAT2/Tumor_B16.F10_0810.127A.HISAT2.sorted.gtf \
> --rna-min-transcript-expression 0.1 \
> --output-csv ~/Desktop/TEMP/Results/MTA/Tumor_B16.F10_0810/RNA/Tumor_B16.F10_0810.strelka.passing.indels.vcf_Tumor_B16.F10_0810.127A.HISAT2.sorted.gtf_netmhc.csv

This is what I saw:

Topiary commandline arguments:
Namespace(ic50_cutoff=500.0, json_variant_files=[], maf=[], mhc_alleles='H-2-Kb,H-2-Db', mhc_alleles_file=None, mhc_epitope_lengths=[8, 9, 10, 11], mhc_predictor='netmhc', only_novel_epitopes=False, output_csv='/Users/johnfinnigan/Desktop/TEMP/Results/MTA/Tumor_B16.F10_0810/RNA/Tumor_B16.F10_0810.strelka.passing.indels.vcf_Tumor_B16.F10_0810.127A.HISAT2.sorted.gtf_netmhc.csv', output_html=None, padding_around_mutation=None, percentile_cutoff=None, reference_name=None, rna_gene_fpkm_tracking_file=None, rna_min_gene_expression=0.0, rna_min_transcript_expression=0.1, rna_transcript_fpkm_gtf_file='/Users/johnfinnigan/Desktop/TEMP/Results/RNA/Tumor_B16.F10/ISMMS/Tumor_B16.F10_0810.127A/GTF/StringTie/HISAT2/Tumor_B16.F10_0810.127A.HISAT2.sorted.gtf', rna_transcript_fpkm_tracking_file=None, skip_variant_errors=False, variant=[], vcf=['/Users/johnfinnigan/Desktop/TEMP/Results/WES/Tumor_B16_F10_0810/ISMMS/VCF/Strelka/results/passed.somatic.indels.vcf'], wildtype_ligandome_directory=None)
INFO:root:Building MHC binding prediction type for alleles ['H-2-Kb', 'H-2-Db'] and epitope lengths [8, 9, 10, 11]
WARNING:root:Failed to run netMHC -A
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/topiary", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/Users/johnfinnigan/Desktop/Utilities/Topiary/topiary/scripts/topiary", line 64, in <module>
    main()
  File "/Users/johnfinnigan/Desktop/Utilities/Topiary/topiary/scripts/topiary", line 46, in main
    epitopes = predict_epitopes_from_args(args)
  File "/Users/johnfinnigan/Desktop/Utilities/Topiary/topiary/topiary/predict_epitopes.py", line 275, in predict_epitopes_from_args
    mhc_model = mhc_binding_predictor_from_args(args)
  File "/Users/johnfinnigan/Desktop/Utilities/Topiary/topiary/topiary/commandline_args.py", line 228, in mhc_binding_predictor_from_args
    epitope_lengths=epitope_lengths)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/mhctools/netmhc.py", line 47, in __init__
    process_limit=1)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/mhctools/base_commandline_predictor.py", line 127, in __init__
    self.supported_alleles_flag)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/mhctools/base_commandline_predictor.py", line 163, in _determine_supported_alleles
    command, supported_allele_flag
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 566, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

I believe this may be because NetMHC, unlike NetMHCpan and NetMHCcons, uses both a -a and a -A option flag. NetMHC uses the -A flag to print a list of acceptable input alleles.

E.g.

jpf-mbp:test johnfinnigan$ netMHC -A
### Alleles with ANN predictors:
BoLA-D18.4
BoLA-HD6
BoLA-JSP.1
[...]
SLA-10401
SLA-20401
SLA-30401

So when topiary tries to run netMHC -A, I think it receives back a list of MHC-I alleles that it then doesn't know what to do with, hence the warning: WARNING:root:Failed to run netMHC -A.
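A way to separate the two possibilities (the program could not be started at all vs. its output could not be handled) is to run the same command through subprocess and look at which exception comes back. The traceback above ends in OSError: [Errno 2] raised from inside Popen, which is what subprocess raises when it cannot find or start the program, before any output exists. The sketch below is only an illustration: run_with_flag is a hypothetical helper, not part of topiary or mhctools.

```python
import errno
import subprocess


def run_with_flag(program, flag):
    """Run `program flag` and return a (status, text) pair.

    Hypothetical diagnostic helper, not an mhctools function.
    """
    try:
        output = subprocess.check_output(
            [program, flag], stderr=subprocess.STDOUT)
    except OSError as err:
        # Errno 2 (ENOENT) here means the program could not be launched,
        # so no output was ever produced or parsed.
        if err.errno == errno.ENOENT:
            return "not-found", str(err)
        return "os-error", str(err)
    except subprocess.CalledProcessError as err:
        # The program started but exited with a non-zero status.
        return "nonzero-exit", err.output.decode("utf-8", "replace")
    return "ok", output.decode("utf-8", "replace")


if __name__ == "__main__":
    status, text = run_with_flag("netMHC", "-A")
    print(status)
```

If this reports "not-found" from the same Python interpreter that runs topiary, the allele list is never reached and the place to look is how that interpreter resolves the command.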

rschenck commented 8 years ago

Did you find a workaround for this or figure anything out? The version of netMHC I'm using (4.0) uses -a, but it is case-sensitive, so it does not work. In netMHC, the -a option passes the allele of interest.
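For the case-sensitivity problem, one possible workaround is to map whatever the user typed onto the exact spelling the predictor reports. The sketch below assumes the listing looks like the netMHC -A output quoted earlier in this thread (one allele per line, header lines starting with #). Both function names are hypothetical, and this is not how topiary or mhctools normalizes allele names.

```python
def parse_supported_alleles(listing):
    """Extract allele names from an allele listing, one name per line.

    Assumes header lines start with '#', as in the output quoted above.
    """
    alleles = []
    for line in listing.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        alleles.append(line)
    return alleles


def match_allele(name, supported):
    """Return the supported spelling of `name`, ignoring case, or None."""
    lookup = {allele.lower(): allele for allele in supported}
    return lookup.get(name.strip().lower())


if __name__ == "__main__":
    sample = "### Alleles with ANN predictors:\nBoLA-D18.4\nBoLA-HD6\nSLA-10401\n"
    supported = parse_supported_alleles(sample)
    print(match_allele("bola-hd6", supported))
```

The returned spelling is the one that would then be passed to -a; a None result means the allele is not in the listing under any capitalization.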

tavinathanson commented 8 years ago

@JPFinnigan Are you still running into the issue you posted?

@rschenck We just added support for NetMHC 4.0 here, and have a NetMHC wrapper that infers whether the executable is 3.x or 4.0. Topiary then uses that wrapper; does it work for you now?
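As a rough idea of what inferring the version from the executable can look like, here is a minimal sketch that inspects help text for a distinguishing option name. The --Alleles marker for 3.x comes from the netMHC 3.4 help output quoted later in this thread; the -listMHC marker for 4.0 is an assumption, and guess_netmhc_version is a hypothetical name rather than the actual mhctools code.

```python
def guess_netmhc_version(help_text):
    """Guess the NetMHC major version from its help text.

    Returns "4.x", "3.x", or None when no known marker is present.
    The "-listMHC" marker for 4.0 is an assumption made for this sketch.
    """
    markers = (
        ("-listMHC", "4.x"),   # assumed 4.0 option name
        ("--Alleles", "3.x"),  # appears in the 3.4 help output in this thread
    )
    for substring, version in markers:
        if substring in help_text:
            return version
    return None


if __name__ == "__main__":
    line = "  -A, --Alleles         Show available alleles and exit"
    print(guess_netmhc_version(line))
```

A None result corresponds to the case where the command does not look like any known NetMHC program, which a caller would have to report as an error.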

JPFinnigan commented 8 years ago

@tavinathanson. I believe so, but then again I may be doing something wrong. Here's what I'm seeing:

JPF-MBP:~ johnfinnigan$ python /Library/Frameworks/Python.framework/Versions/2.7/bin/topiary \
> --vcf ~/Desktop/TEMP/Results/WES/Tumor_B16_F10_0810/ISMMS/VCF/MuTect/Tumor_B16_F10_0810.mutect.targets.pass.vcf \
> --mhc-predictor netmhc \
> --mhc-alleles H2-Kb,H2-Db \
> --mhc-epitope-lengths 8,9,10,11 \
> --ic50-cutoff 500 \
> --rna-transcript-fpkm-gtf-file ~/Desktop/TEMP/Results/RNA/Tumor_B16.F10/ISMMS/Tumor_B16.F10_0810.131B/GTF/StringTie/HISAT2/Tumor_B16.F10_0810.131B.HISAT2.sorted.gtf \
> --rna-min-transcript-expression 0.1 \
> --output-csv ~/Desktop/test.csv

Results in:

Topiary commandline arguments:
Namespace(ic50_cutoff=500.0, json_variant_files=[], maf=[], mhc_alleles='H2-Kb,H2-Db', mhc_alleles_file=None, mhc_epitope_lengths=[8, 9, 10, 11], mhc_predictor='netmhc', only_novel_epitopes=False, output_csv='/Users/johnfinnigan/Desktop/test.csv', output_html=None, padding_around_mutation=None, percentile_cutoff=None, reference_name=None, rna_gene_fpkm_tracking_file=None, rna_min_gene_expression=0.0, rna_min_transcript_expression=0.1, rna_transcript_fpkm_gtf_file='/Users/johnfinnigan/Desktop/TEMP/Results/RNA/Tumor_B16.F10/ISMMS/Tumor_B16.F10_0810.131B/GTF/StringTie/HISAT2/Tumor_B16.F10_0810.131B.HISAT2.sorted.gtf', rna_transcript_fpkm_tracking_file=None, skip_variant_errors=False, variant=[], vcf=['/Users/johnfinnigan/Desktop/TEMP/Results/WES/Tumor_B16_F10_0810/ISMMS/VCF/MuTect/Tumor_B16_F10_0810.mutect.targets.pass.vcf'], wildtype_ligandome_directory=None)
INFO:root:Building MHC binding prediction function for alleles ['H-2-Kb', 'H-2-Db'] and epitope lengths [8, 9, 10, 11]
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/topiary", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/Users/johnfinnigan/Desktop/Utilities/Topiary/topiary/scripts/topiary", line 64, in <module>
    main()
  File "/Users/johnfinnigan/Desktop/Utilities/Topiary/topiary/scripts/topiary", line 46, in main
    epitopes = predict_epitopes_from_args(args)
  File "/Users/johnfinnigan/Desktop/Utilities/Topiary/topiary/topiary/predict_epitopes.py", line 275, in predict_epitopes_from_args
    mhc_model = mhc_binding_predictor_from_args(args)
  File "/Users/johnfinnigan/Desktop/Utilities/Topiary/topiary/topiary/commandline_args.py", line 228, in mhc_binding_predictor_from_args
    epitope_lengths=epitope_lengths)
  File "/Users/johnfinnigan/Desktop/Utilities/Topiary/mhctools/mhctools/netmhc.py", line 47, in NetMHC
    % program_name)
SystemError: Command netMHC is not a valid way of calling any NetMHC software.

I can, however, call NetMHC as 'netmhc', 'netMHC', 'NetMHC', or 'Netmhc'. Representative example:

JPF-MBP:~ johnfinnigan$ netMHC -h
Usage: netMHC.py [options] file

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -a STR, --mhc=STR     Allele names ( ',' -separated)
  -l NUM, --peplen=NUM  Length of subpeptides to predict
  -x STR, --xls=STR     Name of tab separated output file
  -s, --sort            Sort output on descending affinity
  -p, --peptide         infile is in peptide format
  -n, --nodirect        Do not use direct prediction (use 9mer aproximation)
  -b, --noblacklist     Do not use blacklist
  -A, --Alleles         Show available alleles and exit

tavinathanson commented 8 years ago

@JPFinnigan What does netMHC --version give you? And can you call netMHC from anywhere (i.e. is it on your PATH)?

JPFinnigan commented 8 years ago

@tavinathanson. I've added all of the predictors topiary uses to my PATH. I'm also working w/ NetMHC 3.4 (afaik v4.0 is not available as a standalone predictor).

JPF-MBP:~ johnfinnigan$ netMHC --version
3.4

HTH

tavinathanson commented 8 years ago

@JPFinnigan I'm having trouble reproducing that error; what versions of mhctools and topiary are you using?

JPFinnigan commented 8 years ago

Using 0.2.1. Maybe I'm doing something incorrectly, but Topiary runs just fine w/ netmhcpan, netmhccons, and netmhciipan.

tavinathanson commented 8 years ago

@JPFinnigan What version of topiary are you using? What does your directory path to netMHC look like? And what's the output of stat -f "%OLp" <path_to_executable>/netMHC?
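The same two checks (where the executable sits on PATH and what its permission bits are) can also be made from inside Python, which is the environment where the failure occurs. This is a minimal sketch, assuming a POSIX system; find_on_path is a hypothetical helper, and the mode is printed in the same octal form as stat -f "%OLp".

```python
import os
import stat


def find_on_path(program, path_value=None):
    """List (full_path, octal_mode, is_executable) for each PATH match.

    Hypothetical helper for diagnosis; not part of topiary or mhctools.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    matches = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            # An empty PATH entry means the current directory on POSIX;
            # it is skipped here to keep the sketch simple.
            continue
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate):
            mode = stat.S_IMODE(os.stat(candidate).st_mode)
            matches.append(
                (candidate, "%o" % mode, os.access(candidate, os.X_OK)))
    return matches


if __name__ == "__main__":
    for entry in find_on_path("netMHC"):
        print(entry)
```

An empty list from the interpreter that runs topiary, while the shell finds the command, would point to the two environments seeing different PATH values.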

JPFinnigan commented 8 years ago

1) Topiary v. 0.2.1.

2)

jpf-mbp:~ johnfinnigan$ echo $PATH
:/Users/johnfinnigan/Desktop/Utilities/NetMHC/netMHC-3.4:

3)

jpf-mbp:~ johnfinnigan$ stat -f "%OLp" /Users/johnfinnigan/Desktop/Utilities/NetMHC/netMHC-3.4/netMHC
755

iskandr commented 8 years ago

@tavinathanson Did our version guessing with mhctools.NetMHC fix this?

tavinathanson commented 8 years ago

@iskandr No, this is an issue that @JPFinnigan had before I mucked with this round of changes and that he still has after the fixes. I haven't yet figured out what the issue is (i.e. why he gets Failed to run netMHC -A when that works for him on the command line).

iskandr commented 7 years ago

@tavinathanson I'm guessing this is fixed now with the newer version sniffing in MHCtools. What do you think?

tavinathanson commented 7 years ago

@iskandr not sure, but this is old enough that we should probably close as not reproducible until it pops up again.