Would you be able to share the output directory for this run with us? This contains the temporary file that is causing this error so it would help us to figure out what the contents of the file are that it's trying to parse.
@susannasiebert Thanks for your reply. Please find attached both the VCF and the output directory of a sample that gave the same error (I am not sure this corresponds to the exact run that produced the output above, but the files are definitely from a run that failed with the same error). mysample_pvacseq.zip
It looks like for some of the tmp output files for MHCflurry, the percentile column doesn't contain any values. I haven't encountered this before. I'm not sure what might be causing it. Could you execute the following command and post what it returns for you: mhcflurry-predict --alleles HLA-C*01:02 --peptides GFGPRDAD
@susannasiebert I executed the command as you suggested from within the pvactools singularity container and this was the output:
Forcing tensorflow backend.
2021-08-18 20:07:13.764487: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-08-18 20:07:13.800209: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2600000000 Hz
2021-08-18 20:07:13.801002: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fec50000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-08-18 20:07:13.801035: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-08-18 20:07:13.807433: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-08-18 20:07:13.807466: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2021-08-18 20:07:13.807492: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (int000): /proc/driver/nvidia/version does not exist
WARNING:root:No flanking information provided. Specify --no-flanking to silence this warning
Predicting processing.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:02<00:00, 2.98s/it]
Predicting affinities.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:06<00:00, 6.51s/it]
/usr/local/lib/python3.7/site-packages/mhcflurry/class1_affinity_predictor.py:1021: UserWarning: Allele HLA-C*01:02 has no percentile rank information
warnings.warn(msg)
allele,peptide,mhcflurry_affinity,mhcflurry_affinity_percentile,mhcflurry_processing_score,mhcflurry_presentation_score,mhcflurry_presentation_percentile
HLA-C*01:02,GFGPRDAD,27978.666327944356,,0.0005653205935232108,0.003611682294810309,99.28660326086957
Looks like this is the problem: /usr/local/lib/python3.7/site-packages/mhcflurry/class1_affinity_predictor.py:1021: UserWarning: Allele HLA-C*01:02 has no percentile rank information
I confirmed that this problem also occurs in the docker container. I think something might've gone wrong when I originally created it. I re-created the image from scratch, confirmed that the above command now works, and updated both 2.0.3 and latest to use the new image (new sha256:b2e70954e73cfab5a8e428c87e61861436d785548cfc193b820322afd74080f0). Please recreate your singularity image and test the above command again. If it now returns a percentile rank, I believe you can rerun your pVACseq runs; you will need to run them from scratch again.
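For reference, re-creating the Singularity image from the updated Docker Hub tags should look roughly like this (the .sif filenames are just examples):
# Pull a fresh image so the updated sha256 is picked up; --force overwrites an existing .sif.
singularity pull --force pvactools_2.0.3.sif docker://griffithlab/pvactools:2.0.3
# or, for the latest tag:
singularity pull --force pvactools_latest.sif docker://griffithlab/pvactools:latest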
Hi @susannasiebert, I re-pulled the new image as you suggested and I didn't get the error (the sample completed successfully). However, I am now getting this error for other samples:
An exception occured in thread 14: (<class 'Exception'>, An error occurred while calling MHCflurry:
2021-08-19 19:19:55.662375: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-08-19 19:19:55.691008: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2599945000 Hz
2021-08-19 19:19:55.693342: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7eff98000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-08-19 19:19:55.693374: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-08-19 19:19:55.707345: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-08-19 19:19:55.707372: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2021-08-19 19:19:55.707395: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ca091): /proc/driver/nvidia/version does not exist
WARNING:root:No flanking information provided. Specify --no-flanking to silence this warning
0%| | 0/1 [00:00<?, ?it/s]
100%|██████████| 1/1 [00:30<00:00, 30.86s/it]
100%|██████████| 1/1 [00:30<00:00, 30.86s/it]
0%| | 0/1 [00:00<?, ?it/s]).
Traceback (most recent call last):
File "/usr/local/bin/pvacseq", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.7/site-packages/tools/pvacseq/main.py", line 95, in main
args[0].func.main(args[1])
File "/usr/local/lib/python3.7/site-packages/tools/pvacseq/run.py", line 122, in main
pipeline.execute()
File "/usr/local/lib/python3.7/site-packages/lib/pipeline.py", line 475, in execute
self.call_iedb(chunks)
File "/usr/local/lib/python3.7/site-packages/lib/pipeline.py", line 374, in call_iedb
p.print("Making binding predictions on Allele %s and Epitope Length %s with Method %s - File %s - Completed" % (a, epl, method, filename))
File "/usr/local/lib/python3.7/site-packages/pymp/__init__.py", line 148, in __exit__
raise exc_t(exc_val)
Exception: An error occurred while calling MHCflurry:
2021-08-19 19:19:55.662375: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-08-19 19:19:55.691008: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2599945000 Hz
2021-08-19 19:19:55.693342: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7eff98000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-08-19 19:19:55.693374: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-08-19 19:19:55.707345: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-08-19 19:19:55.707372: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (303)
2021-08-19 19:19:55.707395: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ca091): /proc/driver/nvidia/version does not exist
WARNING:root:No flanking information provided. Specify --no-flanking to silence this warning
0%| | 0/1 [00:00<?, ?it/s]
100%|██████████| 1/1 [00:30<00:00, 30.86s/it]
100%|██████████| 1/1 [00:30<00:00, 30.86s/it]
0%| | 0/1 [00:00<?, ?it/s]
slurmstepd: error: Detected 83 oom-kill event(s) in step 27430023.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
Any idea what this might be due to?
I think the actual error here is this: slurmstepd: error: Detected 83 oom-kill event(s) in step 27430023.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler
which sounds like you ran out of memory. Can you try running on a machine with more memory?
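For what it's worth, the amount of memory a Slurm job can use is set with the --mem directive in the job script; a minimal skeleton might look like this (all values below are placeholders to adjust for your cluster and data):
#!/bin/bash
#SBATCH --job-name=pvacseq_mysample   # placeholder job name
#SBATCH --cpus-per-task=4             # match the --n-threads value passed to pvacseq
#SBATCH --mem=64G                     # raise this if the cgroup oom-killer still fires

# The existing singularity exec ... pvacseq run command from this thread goes here.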
@susannasiebert Thanks for the suggestion! I will try this out and let you know what I get. Quick question regarding the variant types considered by pVACseq: are stop-gain and/or stop-loss mutations (either SNVs or indels) excluded?
Stop-gain and stop-loss mutations will be processed by pVACseq as long as they are also annotated as an SNV, frameshift indel, or inframe indel.
@susannasiebert Regarding the MHCflurry error above (slurmstepd: error: Detected 83 oom-kill event(s)), it was indeed due to insufficient memory. Thanks for the suggestion!
Regarding the pvacseq parameters, is there a way to disable the binding affinity (IC50) filtering? It would be equivalent to setting it to a very high value (infinity), but I am not sure if there is a proper way to disable it, for example if I want to filter only on the percentile.
Also, is there a way to run NetMHCstabpan for all epitopes and not only the filtered ones?
Many thanks for your help!
Unfortunately, there is no outright way to disable binding affinity filtering while still filtering on the percentile. As you suggested, setting the binding threshold to a very large number would be the best course of action. Something like 1,000,000 should be large enough.
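As a rough sketch (the input VCF, sample name, allele, and output directory below are placeholders, and please double-check the option names against pvacseq run --help):
# A very high binding threshold effectively disables the IC50 filter;
# --percentile-threshold can then be used to filter on percentile rank instead.
pvacseq run \
    input.vcf SAMPLE HLA-A*02:01 NetMHCpan output_dir \
    --binding-threshold 1000000 \
    --percentile-threshold 2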
Right now there is no way to run NetMHCstabpan on the all epitopes file. However, we will be adding a standalone command in the next major version release to run just this step, which you can then use on your all epitopes file outside of the prediction runs. If you are interested in testing this out, I can create an alpha release docker container for you that has this command.
Hi @susannasiebert, I tried setting the binding affinity threshold to a very high value but I am now getting an empty .filtered.tsv file for all my samples. This is the command I ran for all my samples:
singularity exec -B /local/Reda/test_mysample:/home/test_mysample /local/Reda/pVACtools_installation_no_MHCflurry_bug/pvactools_latest.sif \
pvacseq run \
--iedb-install-directory /opt/iedb \
--keep-tmp-files \
--n-threads 4 \
--fasta-size 100000 \
--class-i-epitope-length 8,9,10 \
--binding-threshold 10000000 \
--net-chop-method cterm \
--netmhc-stab \
--run-reference-proteome-similarity \
/home/test_mysample/mysample.genotyped.vep.vcf \
mysample \
$(cat /local/Reda/haplotypes_reformatted_for_pvacseq/mysample.txt) \
NetMHCpan \
/home/test_mysample
Can you share your input VCF with me for further debugging?
@susannasiebert here's one example vcf. Thanks! mysample.genotyped.vep.vcf.zip
@susannasiebert I dug into this a bit and I think the issue of the empty .filtered.tsv files has to do with the inclusion of --net-chop-method cterm, --netmhc-stab, or --run-reference-proteome-similarity rather than with --binding-threshold 10000000. I am not sure what is wrong with these three parameters.
Are you sure you're running version 2.0.3? A problem with the same symptom was resolved in version 2.0.1 (https://pvactools.readthedocs.io/en/latest/releases/2_0.html#version-2-0-1). I'm running with all of these parameters on the 2.0.3 docker container and it's taking quite a bit of time to run these three steps, which wouldn't be the case if any of them returned an empty file.
From inside of your singularity container, what does pip show pvactools return?
Hi @susannasiebert, I ran pip show pvactools and this is what I got:
Name: pvactools
Version: 2.0.3
Summary: A cancer immunotherapy tools suite
Home-page: https://github.com/griffithlab/pVACtools
Author: Jasreet Hundal, Susanna Kiwala, Joshua McMichael, Yang-Yang Feng, Christopher A. Miller, Aaron Graubert, Amber Wollam, Connor Liu, Jonas Neichin, Megan Neveau, Jason Walker, Elaine R. Mardis, Obi L. Griffith, Malachi Griffith
Author-email: help@pvactools.org
License: BSD-3-Clause-Clear
Location: /usr/local/lib/python3.7/site-packages
Requires: mhcflurry, networkx, biopython, swagger-spec-validator, simanneal, PyVCF, pandas, tensorflow, Pillow, pysam, PyYAML, flask-cors, jsonschema, pymp-pypi, tornado, requests, connexion, bokeh, wget, vaxrank, mhcflurry, watchdog, mhcnuggets, mhcnuggets, py-postgresql, mock
Required-by:
I am indeed running the 2.0.3 version that contains the MHCflurry bug fix (the one you created after I reported the original error in this thread).
Did you manage to get results from the VCF I sent? I also noticed that when I run the command interactively (i.e. not as a submitted slurm job) it takes quite a bit of time. However, when the command is run as a submitted job, I always get an empty .filtered.tsv file. Do NetMHCstab, NetChop, and/or the reference proteome BLAST need an internet connection to run?
@susannasiebert I confirmed that the issue (i.e. getting an empty .filtered.tsv file containing only the header) occurs when I submit multiple singularity pvacseq jobs in parallel (one job per sample). I honestly struggle to understand why this happens. Could this be due to some NetMHCstabpan server restriction on the number of requests? Also, I had a look at the netmhc_stab.py script (https://github.com/griffithlab/pVACtools/blob/master/lib/netmhc_stab.py) and noticed that the config file referenced in line 67 does not exist within my pvactools singularity container; in fact, /var/www does not even exist. Is this normal?
I'm not sure, to be honest. Can you provide some more information about how you kick off your parallel jobs so I can try to reproduce this on my end?
RE the config file. This is submitted as a parameter to the NetMHCstabpan API and is a file on their server, not in the docker container.
@susannasiebert I use a bash loop to submit the same pvacseq script for each sample, so the jobs are submitted one after the other but effectively run in parallel. I have been trying to figure this out all weekend but I still struggle to understand why I get an empty .filtered.tsv whenever I launch my jobs this way. When I test individual samples, or even two, the returned .filtered.tsv is not empty and contains the NetMHCstabpan predictions.
And you execute your bash loop from inside of your docker container (as opposed to launching parallel docker run jobs)?
I execute the bash loop outside the singularity container, i.e. the bash loop launches the singularity exec command
@susannasiebert is there a way to run netMHCstabpan (and netChop) without making use of the web APIs? i.e. run them as local installations from within pvactools
Unfortunately, pVACseq only supports running these tools via their API.
And may I know what the exact command is to make a call to, say, NetMHCpan within pvactools?
The logic to call an IEDB algorithm can be found here: https://github.com/griffithlab/pVACtools/blob/master/lib/prediction_class.py#L53.
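If I'm reading that code correctly, with a local IEDB installation this essentially boils down to invoking the standalone predict_binding.py script from the install directory. Roughly (the mhc_i path, the netmhcpan_ba method label, and epitopes.fasta are assumptions about a typical /opt/iedb layout, so verify them inside your container):
# Sketch of a direct call to the IEDB standalone class I predictor.
# Method labels differ between IEDB releases (e.g. netmhcpan vs. netmhcpan_ba).
singularity exec pvactools_latest.sif \
    python /opt/iedb/mhc_i/src/predict_binding.py \
    netmhcpan_ba HLA-A*02:01 9 epitopes.fasta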
OK, I tried to replicate the issue you're describing as follows:
#test single run without NetMHCstabpan
docker run -v /Users/ssiebert/Documents/Work/pVACtools:/data griffithlab/pvactools:2.0.3 pvacseq run /data/pvacseq_example_data/input.vcf Test HLA-A*02:01 MHCflurry /data/test_out_10 -e1 9
#test single run with NetMHCstabpan
docker run -v /Users/ssiebert/Documents/Work/pVACtools:/data griffithlab/pvactools:2.0.3 pvacseq run /data/pvacseq_example_data/input.vcf Test HLA-A*02:01 MHCflurry /data/test_out_11 -e1 9 --netmhc-stab
#test multiple runs
for i in 12 13 14 15 16; do docker run -v /Users/ssiebert/Documents/Work/pVACtools:/data griffithlab/pvactools:2.0.3 pvacseq run /data/pvacseq_example_data/input.vcf Test HLA-A*02:01 MHCflurry /data/test_out_$i -e1 9 --netmhc-stab; done
#check outputs
wc -l test_out_1*/MHC_Class_I/Test.filtered.tsv
4 test_out_10/MHC_Class_I/Test.filtered.tsv
4 test_out_11/MHC_Class_I/Test.filtered.tsv
4 test_out_12/MHC_Class_I/Test.filtered.tsv
4 test_out_13/MHC_Class_I/Test.filtered.tsv
4 test_out_14/MHC_Class_I/Test.filtered.tsv
4 test_out_15/MHC_Class_I/Test.filtered.tsv
4 test_out_16/MHC_Class_I/Test.filtered.tsv
So doing that I'm unable to replicate the issue you're seeing. Can you share with me the script you use to launch your runs?
@susannasiebert Thanks very much for trying. Please find attached the scripts I use to launch my pvacseq runs. The launching command is the following:
bash launching_loop.sh -i /cluster/working/Reda/pvacseq_runs/vcf_paths.txt \
-o /cluster/working/Reda/pvacseq_runs/out \
-f /cluster/working/Reda/ref/hg19.fasta \
-t NetMHCpan \
-s false \
-c /cluster/working/Reda/pVACtools_installation_no_MHCflurry_bug/pvactools_latest.sif
where launching_loop.sh is the bash loop script that reads the input VCFs from vcf_paths.txt (each line in this file corresponds to one input VCF, i.e. one sample). Each loop iteration launches a slurm job whose script is main_pvacseq_script.sh (this script first annotates the VCF in the right format if necessary and then calls pvacseq).
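In simplified form, the loop does something like this (the argument handling here is illustrative, not my exact scripts):
# Illustrative only: one slurm job per VCF listed in vcf_paths.txt.
while read -r vcf; do
    sample=$(basename "$vcf" .genotyped.vep.vcf)
    sbatch --job-name="pvacseq_${sample}" \
        main_pvacseq_script.sh "$vcf" "/cluster/working/Reda/pvacseq_runs/out/${sample}"
done < /cluster/working/Reda/pvacseq_runs/vcf_paths.txt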
Thanks for your help.
I was able to replicate your issue, and as you suspected, NetMHCstabpan and NetChop limit the number of jobs you can run concurrently on their server. I get the following message:
You have reached the limit of queued and active jobs for this service from a single site.
In other words, you already have jobs submitted to our service.
Please wait until your current job(s) are finished and resubmit again.
I will try and work on a way to retry requests when this message is encountered.
Thanks! Great that this is confirmed. What I don't get is 1) why I do not get this message and 2) why pvacseq finishes its run without any error but with an empty .filtered.tsv file. It would be great to find a way around this. Probably the best long-term solution would be to call NetMHCstabpan and NetChop as local installations, without any CGI/API call whatsoever.
@susannasiebert On a side note, is there a way to know the maximum number of jobs/requests that the NetMHCstabpan/NetChop CGI accepts at any one time?
@susannasiebert Also, a way to mitigate this might be to increase the chunk size so that fewer requests are posted for the same (current) sample? https://github.com/griffithlab/pVACtools/blob/71cd12351cafc73057be87e8c717b1be59801ca1/lib/netmhc_stab.py#L43
You don't get this message because the API still returns a 200, so we assume the request succeeded and try to parse the output (which contains the above message instead of predictions). This inadvertently results in the empty output file as well. I will also address that part of the problem.
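For illustration, the kind of retry wrapper I have in mind would look roughly like this (a sketch only, not the actual fix; everything except the quoted limit message is hypothetical):
# Re-run a submission command while the server's job-limit message comes back
# in an otherwise successful (HTTP 200) response.
submit_with_retry () {
    local limit_msg="You have reached the limit of queued and active jobs"
    local response attempt
    for attempt in 1 2 3 4 5; do
        # "$@" is the full submission command (e.g. a curl call) to run.
        response=$("$@")
        if printf '%s' "$response" | grep -q "$limit_msg"; then
            # Server is at its concurrency limit; back off and try again.
            sleep $((60 * attempt))
        else
            printf '%s\n' "$response"
            return 0
        fi
    done
    return 1
}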
It looks like their API only allows one concurrent job.
I just made a new release (2.0.4) that should resolve the issue you're seeing with running multiple jobs in parallel while using NetMHCstabpan/NetChop. I'm closing this issue but please feel free to reopen should you still encounter problems.
@susannasiebert @malachig Could you please help me with the below issue? I get it while using MHCflurry.
I am getting the following error for some samples:
This is the command I ran (using singularity):
Full output:
For some other samples, the pipeline finished completely and successfully. Any idea as to why I am getting the error above?