Describe the bug
After running class I and class II prediction, the program crashes while filtering the combined result file. The main error message is "Expected 39 fields in line 4502, saw 41". The class I output file has 39 columns and 4501 lines; the class II file has 41 columns. This suggests the class II rows are appended under the 39-column class I header. In the same run, none of the class I HLA alleles were compatible with four of the six algorithms we specified, and the columns for those four algorithms were absent from the class I results file.
To Reproduce
Run a combined class I and class II prediction on any VCF file where none of the class I HLA alleles are valid for MHCflurry, NetMHC, SMM, or SMMPMBEC.
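The parsing failure itself can probably be reproduced without running the pipeline. The sketch below is not pVACseq code: the column names and values are made up for illustration, and it assumes the combined file is built by appending class II rows under the class I header. It reads the result with the same kind of `pd.read_csv` call that appears in the traceback.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two result files (not real pVACseq output):
# the "class I" table has 3 columns, the "class II" table has 5.
class_i = "Mutation\tAllele\tNetMHCpan Score\nA1B\tallele_1\t0.5\nC2D\tallele_1\t0.7\n"
class_ii = (
    "Mutation\tAllele\tNetMHCpan Score\tNNalign Score\tSMMalign Score\n"
    "E3F\tallele_2\t0.1\t0.2\t0.3\n"
)

# Assumed naive combination: keep the class I header and append the
# class II data rows unchanged, so the last row has two extra fields.
class_ii_rows = class_ii.split("\n", 1)[1]
combined_text = class_i + class_ii_rows

try:
    pd.read_csv(io.StringIO(combined_text), delimiter="\t")
    error_message = None
except pd.errors.ParserError as exc:
    # pandas refuses rows that have more fields than the header.
    error_message = str(exc)

print(error_message)
```

With these toy tables the parser stops at the first row that is wider than the header, which mirrors the "Expected 39 fields in line 4502, saw 41" message from the real run.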
Runtime log:
Done: Pipeline finished successfully. File /output_dir/MHC_Class_II/sample1.filtered.condensed.ranked.tsv contains list of filtered putative neoantigens.
/opt/conda/lib/python3.6/site-packages/lib/output_parser.py:485: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please
read https://msg.pyyaml.org/load for full details.
protein_identifiers_from_label = yaml.load(key_file_reader)
Creating combined reports
Running Binding Filters
Traceback (most recent call last):
File "/opt/conda/bin/pvacseq", line 11, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.6/site-packages/tools/pvacseq/main.py", line 99, in main
args[0].func.main(args[1])
File "/opt/conda/lib/python3.6/site-packages/tools/pvacseq/run.py", line 208, in main
create_combined_reports(base_output_dir, args, additional_input_files)
File "/opt/conda/lib/python3.6/site-packages/tools/pvacseq/run.py", line 64, in create_combined_reports
PostProcessor(**post_processing_params).execute()
File "/opt/conda/lib/python3.6/site-packages/lib/post_processor.py", line 26, in execute
self.execute_binding_filter()
File "/opt/conda/lib/python3.6/site-packages/lib/post_processor.py", line 47, in execute_binding_filter
self.allele_specific_binding_thresholds,
File "/opt/conda/lib/python3.6/site-packages/lib/binding_filter.py", line 37, in execute
Filter(self.input_file, self.output_file, filter_criteria, self.exclude_nas).execute()
File "/opt/conda/lib/python3.6/site-packages/lib/filter.py", line 14, in execute
data = pd.read_csv(self.input_file, delimiter='\t', float_precision='high', low_memory=False)
File "/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py", line 435, in _read
data = parser.read(nrows)
File "/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py", line 1139, in read
ret = self._engine.read(nrows)
File "/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py", line 1995, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 902, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 983, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 2172, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 39 fields in line 4502, saw 41
Expected behavior
The run should complete and produce the two output files, sample1.filtered.tsv and sample1.condensed.filtered.tsv.
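One way a combined file could stay readable when the two tables have different columns is to merge by column name rather than by position. The following is only a sketch under that assumption. It is not taken from pVACseq and may differ from how any release handles this. The column names and values are again hypothetical.

```python
import io

import pandas as pd

# Hypothetical tables with different column sets (not real pVACseq output).
class_i = "Mutation\tAllele\tNetMHCpan Score\nA1B\tallele_1\t0.5\n"
class_ii = (
    "Mutation\tAllele\tNetMHCpan Score\tNNalign Score\n"
    "C2D\tallele_2\t0.1\t0.2\n"
)

# Parse each table on its own so every row matches its own header.
frames = [
    pd.read_csv(io.StringIO(text), delimiter="\t")
    for text in (class_i, class_ii)
]

# pd.concat aligns on column names; columns missing from one table
# are filled with NaN instead of shifting fields out of place.
combined = pd.concat(frames, ignore_index=True, sort=False)

# Write the merged table as TSV, marking missing values as "NA",
# then read it back to confirm that it parses cleanly.
buffer = io.StringIO()
combined.to_csv(buffer, sep="\t", na_rep="NA", index=False)
reread = pd.read_csv(io.StringIO(buffer.getvalue()), delimiter="\t")

print(reread)
```

Every row in the merged file then has the same number of fields, so a later `pd.read_csv` call in a filtering step would not hit the tokenizing error. Whether missing scores should be written as NA or handled some other way is a design choice this sketch does not settle.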
This bug was fixed in the latest version (1.3.7). I'm closing this issue for now but please reopen if you still encounter this problem using the latest release.