openvax / mhcflurry

Peptide-MHC I binding affinity prediction
http://openvax.github.io/mhcflurry/
Apache License 2.0

KeyError when predicting affinities #170

Closed jfnavarro closed 4 years ago

jfnavarro commented 4 years ago

Hi,

I keep getting this error in two of my samples.

[4244 rows x 2 columns]
Predicting processing.
Predicting affinities.
Traceback (most recent call last):
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/bin/mhcflurry-predict-scan", line 10, in <module>
    sys.exit(run())
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/predict_scan_command.py", line 324, in run
    throw=not args.no_throw)
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/class1_presentation_predictor.py", line 715, in predict_sequences
    throw=throw)
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/class1_presentation_predictor.py", line 514, in predict
    throw=throw)
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/class1_presentation_predictor.py", line 181, in predict_affinity
    throw=throw)
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/class1_affinity_predictor.py", line 982, in predict
    model_kwargs=model_kwargs
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/class1_affinity_predictor.py", line 1169, in predict_to_dataframe
    **model_kwargs)
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/class1_neural_network.py", line 1062, in predict
    'peptide': self.peptides_to_network_input(peptides)
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/class1_neural_network.py", line 409, in peptides_to_network_input
    **self.hyperparameters['peptide_encoding'])
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/encodable_sequences.py", line 186, in variable_length_to_fixed_length_vector_encoding
    allow_unsupported_amino_acids=allow_unsupported_amino_acids))
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/encodable_sequences.py", line 411, in sequences_to_fixed_length_index_encoded_array
    lambda s: numpy.array([
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/pandas/core/series.py", line 3630, in map
    new_values = super()._map_values(arg, na_action=na_action)
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/pandas/core/base.py", line 1145, in _map_values
    new_values = map_f(values, mapper)
  File "pandas/_libs/lib.pyx", line 2329, in pandas._libs.lib.map_infer
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/encodable_sequences.py", line 412, in <lambda>
    get_amino_acid_index(char) for char in s
  File "/home/jose.fernandez.navarro/anaconda3/envs/jared/lib/python3.6/site-packages/mhcflurry/encodable_sequences.py", line 412, in <listcomp>
    get_amino_acid_index(char) for char in s
KeyError: 'c'

Do you have any idea what could be causing it? Perhaps a non-amino-acid character in the protein sequences?

Thanks in advance!

jfnavarro commented 4 years ago

The --no-throw option did not help.

jfnavarro commented 4 years ago

Okay, I figured it out. There were non-amino-acid characters in one of the protein sequences.
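
For anyone hitting the same KeyError, a quick pre-check of the input sequences can locate the offending characters before running the predictor. The snippet below is only a minimal sketch and does not use any mhcflurry API. The function name `find_invalid_characters` and the example sequences are hypothetical, and the alphabet is an assumption (the 20 standard one-letter codes, upper case only), so adjust it to whatever the tool actually accepts. Under that assumption a lower-case 'c' is flagged, which matches the KeyError: 'c' above.

```python
# Assumption: only the 20 standard one-letter amino acid codes, upper case,
# are treated as valid. Extend this set if your tool accepts more symbols.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def find_invalid_characters(sequences, alphabet=STANDARD_AMINO_ACIDS):
    """Map each sequence name to a list of (position, character) pairs
    for every character that is not in `alphabet`.

    Sequences with no invalid characters are left out of the result.
    (Hypothetical helper for illustration, not part of mhcflurry.)
    """
    problems = {}
    for name, sequence in sequences.items():
        bad = [(i, ch) for i, ch in enumerate(sequence) if ch not in alphabet]
        if bad:
            problems[name] = bad
    return problems


if __name__ == "__main__":
    # Made-up example sequences; the second one contains 'c' and '*'.
    example = {
        "protein_ok": "MKTAYIAKQR",
        "protein_bad": "MKTcYIAKQR*",
    }
    for name, bad in find_invalid_characters(example).items():
        print(name, bad)
```

Positions are zero-based. Once the bad characters are located, the affected sequences can be corrected or dropped before prediction.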