griffithlab / pVACtools

http://www.pvactools.org
BSD 3-Clause Clear License

KeyError: 46 #970

Closed brycemash closed 1 year ago

brycemash commented 1 year ago

Describe the bug

Parsing binding predictions for Allele H-2-Kb and Epitope Length 10 - Entries 12601-12800
Parsed Output File for Allele H-2-Kb and Epitope Length 10 (Entries 12601-12800) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 8 - Entries 12801-13000
Parsed Output File for Allele H-2-Db and Epitope Length 8 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 9 - Entries 12801-13000
Parsed Output File for Allele H-2-Db and Epitope Length 9 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 10 - Entries 12801-13000
Parsed Output File for Allele H-2-Db and Epitope Length 10 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Kb and Epitope Length 8 - Entries 12801-13000
Parsed Output File for Allele H-2-Kb and Epitope Length 8 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Kb and Epitope Length 9 - Entries 12801-13000
Parsed Output File for Allele H-2-Kb and Epitope Length 9 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Kb and Epitope Length 10 - Entries 12801-13000
Parsed Output File for Allele H-2-Kb and Epitope Length 10 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 8 - Entries 13001-13200
Parsing prediction file for Allele H-2-Db and Epitope Length 8 - Entries 13001-13200
Parsing prediction file for Allele H-2-Db and Epitope Length 8 - Entries 13001-13200 - Completed
Parsing binding predictions for Allele H-2-Db and Epitope Length 9 - Entries 13001-13200
Parsing prediction file for Allele H-2-Db and Epitope Length 9 - Entries 13001-13200
Traceback (most recent call last):
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/bin/pvacseq", line 8, in <module>
    sys.exit(main())
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/tools/pvacseq/main.py", line 116, in main
    args[0].func.main(args[1])
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/tools/pvacseq/run.py", line 133, in main
    pipeline.execute()
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/pipeline.py", line 434, in execute
    split_parsed_output_files = self.parse_outputs(chunks)
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/pipeline.py", line 395, in parse_outputs
    parser.execute()
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/output_parser.py", line 438, in execute
    iedb_results = self.process_input_iedb_file(tsv_entries)
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/output_parser.py", line 362, in process_input_iedb_file
    iedb_results = self.parse_iedb_file(tsv_entries)
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/output_parser.py", line 584, in parse_iedb_file
    if protein_identifiers_from_label[protein_label] is not None:
KeyError: 46

To Reproduce

vep \
  --input_file broad/dunnlab/BLM/pvac/Mutation/SNP/Annotation/SR_CT2.GATK.snp.mm10_multianno.gt.vcf \
  --output_file broad/dunnlab/BLM/pvac/Mutation/SNP/Annotation/SR_CT2.GATK.snp.mm10_multianno.gt.vep.vcf \
  --format vcf --vcf --symbol --terms SO --tsl \
  --hgvs --fasta broad/dunnlab/BLM/pvac/Mus_musculus.GRCm38.dna.primary_assembly.fa \
  --offline --cache \
  --dir_cache broad/dunnlab/BLM/pvac \
  --plugin Frameshift --plugin Wildtype \
  --dir_plugins /broad/dunnlab/BLM/pvac/VEP_plugins \
  --species mus_musculus --cache_version 102

The error occurs when running this command:

pvacseq run \
  broad/dunnlab/BLM/pvac/Mutation/SNP/Annotation/SR_CT2.GATK.snp.mm10_multianno.gt.vep.vcf \
  CT2_snp \
  H-2-Kb,H-2-Db \
  MHCflurry MHCnuggetsI MHCnuggetsII NNalign NetMHC PickPocket SMM SMMPMBEC SMMalign \
  broad/dunnlab/BLM/pvac/output/CT2_snp_2 \
  -e1 8,9,10 -e2 15

Log Output

Parsing binding predictions for Allele H-2-Db and Epitope Length 8 - Entries 12601-12800
Parsed Output File for Allele H-2-Db and Epitope Length 8 (Entries 12601-12800) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 9 - Entries 12601-12800
Parsed Output File for Allele H-2-Db and Epitope Length 9 (Entries 12601-12800) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 10 - Entries 12601-12800
Parsed Output File for Allele H-2-Db and Epitope Length 10 (Entries 12601-12800) already exists. Skipping
Parsing binding predictions for Allele H-2-Kb and Epitope Length 8 - Entries 12601-12800
Parsed Output File for Allele H-2-Kb and Epitope Length 8 (Entries 12601-12800) already exists. Skipping
Parsing binding predictions for Allele H-2-Kb and Epitope Length 9 - Entries 12601-12800
Parsed Output File for Allele H-2-Kb and Epitope Length 9 (Entries 12601-12800) already exists. Skipping
Parsing binding predictions for Allele H-2-Kb and Epitope Length 10 - Entries 12601-12800
Parsed Output File for Allele H-2-Kb and Epitope Length 10 (Entries 12601-12800) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 8 - Entries 12801-13000
Parsed Output File for Allele H-2-Db and Epitope Length 8 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 9 - Entries 12801-13000
Parsed Output File for Allele H-2-Db and Epitope Length 9 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 10 - Entries 12801-13000
Parsed Output File for Allele H-2-Db and Epitope Length 10 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Kb and Epitope Length 8 - Entries 12801-13000
Parsed Output File for Allele H-2-Kb and Epitope Length 8 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Kb and Epitope Length 9 - Entries 12801-13000
Parsed Output File for Allele H-2-Kb and Epitope Length 9 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Kb and Epitope Length 10 - Entries 12801-13000
Parsed Output File for Allele H-2-Kb and Epitope Length 10 (Entries 12801-13000) already exists. Skipping
Parsing binding predictions for Allele H-2-Db and Epitope Length 8 - Entries 13001-13200
Parsing prediction file for Allele H-2-Db and Epitope Length 8 - Entries 13001-13200
Parsing prediction file for Allele H-2-Db and Epitope Length 8 - Entries 13001-13200 - Completed
Parsing binding predictions for Allele H-2-Db and Epitope Length 9 - Entries 13001-13200
Parsing prediction file for Allele H-2-Db and Epitope Length 9 - Entries 13001-13200
Traceback (most recent call last):
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/bin/pvacseq", line 8, in <module>
    sys.exit(main())
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/tools/pvacseq/main.py", line 116, in main
    args[0].func.main(args[1])
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/tools/pvacseq/run.py", line 133, in main
    pipeline.execute()
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/pipeline.py", line 434, in execute
    split_parsed_output_files = self.parse_outputs(chunks)
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/pipeline.py", line 395, in parse_outputs
    parser.execute()
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/output_parser.py", line 438, in execute
    iedb_results = self.process_input_iedb_file(tsv_entries)
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/output_parser.py", line 362, in process_input_iedb_file
    iedb_results = self.parse_iedb_file(tsv_entries)
  File "/broad/dunnlab/BLM/conda_libraries/pvactools_conda/lib/python3.6/site-packages/pvactools/lib/output_parser.py", line 584, in parse_iedb_file
    if protein_identifiers_from_label[protein_label] is not None:
KeyError: 46

Expected behavior

Expect file CT2_snp_2/MHC_Class_I/tmp/CT2_snp.H-2-Db.9.parsed.tsv_13001-13200 to be created.

susannasiebert commented 1 year ago

Thank you for the bug report, @brycemash, and I apologize that you're running into trouble running pVACtools. It looks like your run may have been a retry. Do you know what happened on the initial run? This error usually indicates a mismatch between the predictor output file and the input data. I suspect the file is empty or was concatenated. If you can find the relevant prediction files in the tmp directory (matching the allele, epitope length, and entry chunk), you can try deleting them and rerunning. If the problem persists, I would need you to send me the original input VCF and/or the output directory for further debugging.
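
To illustrate the kind of mismatch described in this comment: the traceback fails on a dictionary lookup keyed by a protein label. The snippet below is a minimal sketch and not pVACtools code. Only the variable name protein_identifiers_from_label comes from the traceback; the dictionary contents, the label values, and the helper find_unknown_labels are hypothetical and exist purely to show how a label that is present in a prediction file but absent from the key map leads to a KeyError.

```python
# Hypothetical illustration of the failing lookup; NOT the pVACtools implementation.

def find_unknown_labels(protein_identifiers_from_label, prediction_labels):
    """Return labels seen in a prediction file that are missing from the key map."""
    return sorted(set(prediction_labels) - set(protein_identifiers_from_label))


# Assumed example data: the key map for this chunk only knows labels 1 to 3 ...
protein_identifiers_from_label = {1: "MT.GENE_A", 2: "MT.GENE_B", 3: None}
# ... but a stale or concatenated prediction file also mentions label 46.
prediction_labels = [1, 2, 3, 46]

unknown = find_unknown_labels(protein_identifiers_from_label, prediction_labels)
print(unknown)

try:
    # Same style of lookup as the line shown in the traceback.
    protein_identifiers_from_label[46]
except KeyError as err:
    print("KeyError:", err)
```

A check like this, run before the lookup, would point at the offending label rather than stopping with a bare KeyError, which is one reason to suspect the prediction file itself when this error appears.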

brycemash commented 1 year ago

I deleted outputs and reran, which fixed the problem. Thanks!
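
For anyone hitting the same error, a narrower cleanup than deleting all outputs can be sketched as follows. This is a hedged example, not an official pVACtools utility. It assumes that files in the tmp directory carry the allele, epitope length, and entry chunk in their names in the same way as the parsed file named in this issue (CT2_snp.H-2-Db.9.parsed.tsv_13001-13200). The function names find_chunk_files and remove_chunk_files are hypothetical, and the script only lists files unless dry_run is set to False.

```python
# Sketch only: the file-name pattern is an assumption inferred from the parsed
# file named in this issue; verify it against your own tmp directory first.
from pathlib import Path


def find_chunk_files(tmp_dir, allele, length, chunk):
    """List files in tmp_dir whose names mention the allele, epitope length, and entry chunk."""
    marker = f".{allele}.{length}."
    suffix = f"_{chunk}"
    return sorted(
        p for p in Path(tmp_dir).iterdir()
        if p.is_file() and marker in p.name and p.name.endswith(suffix)
    )


def remove_chunk_files(tmp_dir, allele, length, chunk, dry_run=True):
    """Report matching files with their sizes; delete them only when dry_run is False."""
    matches = find_chunk_files(tmp_dir, allele, length, chunk)
    for path in matches:
        size = path.stat().st_size
        note = " (empty)" if size == 0 else ""
        print(f"{path.name}: {size} bytes{note}")
        if not dry_run:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Directory taken from the issue text; adjust to your own output directory.
    tmp = Path("CT2_snp_2/MHC_Class_I/tmp")
    if tmp.is_dir():
        remove_chunk_files(tmp, "H-2-Db", 9, "13001-13200", dry_run=True)
```

Running it in dry-run mode first shows which files match and flags any that are empty, which is one of the causes suspected above. After deleting the matching files, rerunning the same pvacseq command should regenerate that chunk.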