hammerlab / cohorts

Utilities for analyzing mutations and neoepitopes in patient cohorts
Apache License 2.0

KeyError when predicting neoantigens #247

Open jburos opened 7 years ago

jburos commented 7 years ago

Encountered an error when using expressed_neoantigen_count in the course of an analysis.

The error did not occur for all patients, only for approximately 3 of them.

Here is a sample of the relevant output:

expressed_neoantigen_count:  18%|████████████████████████████████████▋                                                                                                                                                                          | 14/79 [01:04<12:12, 11.27s/it]
Traceback (most recent call last):
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 536, in <module>
    main()
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 528, in main
    fulldict[allele] = printresults(NNpred, seqs, inorder, preds, allele, methodlist[allele], peplen, wwwrun)
KeyError: 'HLA-B07:02'
Traceback (most recent call last):
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 536, in <module>
    main()
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 528, in main
    fulldict[allele] = printresults(NNpred, seqs, inorder, preds, allele, methodlist[allele], peplen, wwwrun)
KeyError: 'HLA-B07:02'
Traceback (most recent call last):
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 536, in <module>
    main()
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 528, in main
    fulldict[allele] = printresults(NNpred, seqs, inorder, preds, allele, methodlist[allele], peplen, wwwrun)
KeyError: 'HLA-B07:02'
Traceback (most recent call last):
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 536, in <module>
    main()
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 528, in main
    fulldict[allele] = printresults(NNpred, seqs, inorder, preds, allele, methodlist[allele], peplen, wwwrun)
KeyError: 'HLA-B07:02'
Traceback (most recent call last):
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 536, in <module>
    main()
  File "/demeter/users/burosj01/projects/netmhc-bundle/netMHC-3.4/netMHC.py", line 528, in main
    fulldict[allele] = printresults(NNpred, seqs, inorder, preds, allele, methodlist[allele], peplen, wwwrun)
KeyError: 'HLA-C07:02'

... etc.

The HLA alleles causing this error include 'HLA-B07:02', 'HLA-C07:02', 'HLA-A03:01', 'HLA-B44:02', and 'HLA-A68:01'.

On further inspection, this occurs for patients with no expressed variants. Whether having no expressed variants is itself an error in the context of that analysis is a separate issue. For now, I want to catch this situation and handle it appropriately.
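One way to catch this case is to short-circuit before the predictor is ever invoked. The sketch below is only an illustration: `count_expressed_neoantigens`, `expressed_variants`, and `predict_epitopes` are hypothetical names, not part of the cohorts API, and it assumes that a patient with no expressed variants should count as zero neoantigens rather than as a failure.

```python
from typing import Callable, Sequence


def count_expressed_neoantigens(
    expressed_variants: Sequence,
    predict_epitopes: Callable[[Sequence], Sequence],
) -> int:
    """Count predicted neoantigens, skipping prediction for empty input.

    Both arguments are hypothetical stand-ins: `expressed_variants` is the
    patient's list of expressed variants, and `predict_epitopes` wraps
    whatever binding predictor is in use.
    """
    if len(expressed_variants) == 0:
        # Assumption: no expressed variants means no neoantigens, so the
        # predictor (which fails on empty input) is never called.
        return 0
    return len(predict_epitopes(expressed_variants))
```

If zero is not the right answer for the analysis, the same guard could return `None` or raise a descriptive exception instead, so that the empty case is at least distinguishable from a predictor failure.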

As noted elsewhere, one limitation of our current caching strategy is that potentially informative output like this is lost once the results have been cached. I don't yet have a good solution, but I will open a separate issue about capturing logger output when caching items.
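As a starting point for that discussion, here is a minimal sketch of storing log output next to a cached result, using only the standard library. `cached_with_logs`, the pickle layout, and the cache path are all hypothetical and do not reflect how cohorts caches results today. It also only captures messages emitted through Python's `logging` module; output written to stderr by an external process, such as the netMHC tracebacks above, would need to be captured separately.

```python
import io
import logging
import pickle
from pathlib import Path


def cached_with_logs(cache_path, compute, logger_name=None):
    """Return (result, log_text), computing and caching both on a miss.

    Hypothetical helper: the log text captured during `compute()` is
    pickled together with the result so it survives later cache hits.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: return the stored result and the log text saved with it.
        with cache_path.open("rb") as f:
            payload = pickle.load(f)
        return payload["result"], payload["log"]

    # Cache miss: temporarily attach a handler that writes to a buffer.
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setLevel(logging.DEBUG)
    logger = logging.getLogger(logger_name)
    logger.addHandler(handler)
    try:
        result = compute()
    finally:
        logger.removeHandler(handler)

    payload = {"result": result, "log": buffer.getvalue()}
    with cache_path.open("wb") as f:
        pickle.dump(payload, f)
    return result, payload["log"]
```

Which messages end up in the buffer still depends on the logger's effective level, so anything below that level is dropped before it reaches the handler.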