griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

Crash when all HLA alleles are incompatible with prediction algorithms #356

Closed haraldgrove closed 5 years ago

haraldgrove commented 5 years ago

Describe the bug After running class I and class II predictions, the program crashes while trying to filter the combined result file. The main error message is "Expected 39 fields in line 4502, saw 41". The class I output file has 39 columns and 4501 lines; the class II file has 41 columns. In this run, all class I HLA alleles were incompatible with four of the six algorithms we specified, and the columns for those four algorithms were therefore absent from the class I results file.
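For context, pandas' C parser fixes the expected field count from the header row, so a combined TSV whose later rows carry extra columns fails in exactly this way. A minimal sketch of the failure and of a column-aligned merge that tolerates the mismatch (the column names here are hypothetical, not pVACtools' actual headers):

```python
import io
import pandas as pd

# Class I report is missing two algorithm columns; class II has them.
class_i = "chr\tpeptide\tNetMHC\n1\tSIINFEKL\t25\n"               # 3 columns
class_ii = "chr\tpeptide\tSMM\tSMMPMBEC\n1\tAAAGAELL\t50\t60\n"   # 4 columns

# Naively appending class II data rows under the class I header
# reproduces the "Expected N fields ... saw M" crash.
combined = class_i + class_ii.split("\n", 1)[1]
err = None
try:
    pd.read_csv(io.StringIO(combined), delimiter="\t")
except pd.errors.ParserError as e:
    err = e
print(err)  # ... Expected 3 fields in line 3, saw 4

# Concatenating on column names instead tolerates the width mismatch:
# algorithms absent from one class simply become NaN.
df = pd.concat(
    [pd.read_csv(io.StringIO(class_i), delimiter="\t"),
     pd.read_csv(io.StringIO(class_ii), delimiter="\t")],
    sort=False,
)
print(df.shape)  # (2, 5)
```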

To Reproduce Run a class I & class II prediction on any VCF file where all class I HLA alleles are invalid for all of MHCflurry, NetMHC, SMM, and SMMPMBEC.

docker run --rm -v $PWD:/output_dir griffithlab/pvactools \
        pvacseq run /output_dir/sample1_m2_vep_filtered.vcf \
        --iedb-install-directory /opt/iedb \
        -t 4 \
        sample1 \
        HLA-A*11:50Q,HLA-A*33:03,HLA-B*55:02,HLA-B*56:09,HLA-C*01:02,HLA-C*03:02,DQA1*05:01,DQA1*06:01,DQB1*02:01,DQB1*03:01,DRB1*03:01,DRB1*12:02 \
        MHCflurry MHCnuggetsI MHCnuggetsII NNalign NetMHC PickPocket SMM SMMPMBEC SMMalign \
        /output_dir/ -e 8,9,10

Runtime log:

Done: Pipeline finished successfully. File /output_dir/MHC_Class_II/sample1.filtered.condensed.ranked.tsv contains list of filtered putative neoantigens.

/opt/conda/lib/python3.6/site-packages/lib/output_parser.py:485: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please 
read https://msg.pyyaml.org/load for full details.
  protein_identifiers_from_label = yaml.load(key_file_reader)
Creating combined reports
Running Binding Filters
Traceback (most recent call last):
  File "/opt/conda/bin/pvacseq", line 11, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.6/site-packages/tools/pvacseq/main.py", line 99, in main
    args[0].func.main(args[1])
  File "/opt/conda/lib/python3.6/site-packages/tools/pvacseq/run.py", line 208, in main
    create_combined_reports(base_output_dir, args, additional_input_files)
  File "/opt/conda/lib/python3.6/site-packages/tools/pvacseq/run.py", line 64, in create_combined_reports
    PostProcessor(**post_processing_params).execute()
  File "/opt/conda/lib/python3.6/site-packages/lib/post_processor.py", line 26, in execute
    self.execute_binding_filter()
  File "/opt/conda/lib/python3.6/site-packages/lib/post_processor.py", line 47, in execute_binding_filter
    self.allele_specific_binding_thresholds,
  File "/opt/conda/lib/python3.6/site-packages/lib/binding_filter.py", line 37, in execute
    Filter(self.input_file, self.output_file, filter_criteria, self.exclude_nas).execute()
  File "/opt/conda/lib/python3.6/site-packages/lib/filter.py", line 14, in execute
    data = pd.read_csv(self.input_file, delimiter='\t', float_precision='high', low_memory=False)
  File "/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py", line 435, in _read
    data = parser.read(nrows)
  File "/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py", line 1139, in read
    ret = self._engine.read(nrows)
  File "/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py", line 1995, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 902, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 983, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 2172, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 39 fields in line 4502, saw 41
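As an aside, the YAMLLoadWarning near the top of the log is a PyYAML deprecation notice unrelated to the crash; passing an explicit safe loader is the standard fix. A minimal sketch (the key-file content below is invented for illustration, not the real pVACtools key-file format):

```python
import io
import yaml

# Simulated key file; real pVACtools key files map FASTA labels to
# protein identifiers, but this content is made up for the example.
key_file_reader = io.StringIO("gene1: [ENST00000123456]")

# Loader=yaml.SafeLoader (or yaml.safe_load) avoids the deprecated,
# unsafe default loader and silences the warning.
protein_identifiers_from_label = yaml.load(key_file_reader, Loader=yaml.SafeLoader)
print(protein_identifiers_from_label)  # {'gene1': ['ENST00000123456']}
```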

Expected behavior The run should complete and produce the two filtered output files, sample1.filtered.tsv and sample1.condensed.filtered.tsv.

susannasiebert commented 5 years ago

This bug was fixed in the latest version (1.3.7). I'm closing this issue for now but please reopen if you still encounter this problem using the latest release.