Closed julia326 closed 6 years ago
I think the issue was that mhcnames 0.1.0 (the version used to generate the training data a while ago) had this erroneous behavior:
In [2]: mhcnames.normalize_allele_name("SLA01")
Out[2]: 'HLA-SLA*01:01'
When this output is run through the current version of mhcnames we get the error you see.
It's not clear this error was really fixed though. The current master version of mhcnames gives an error instead of parsing it:
In [4]: mhcnames.normalize_allele_name("SLA-01")
---------------------------------------------------------------------------
AlleleParseError Traceback (most recent call last)
<ipython-input-4-39557fbce5ea> in <module>()
----> 1 mhcnames.normalize_allele_name("SLA-01")
/Users/tim/sinai/git/mhcnames/mhcnames/normalization.py in normalize_allele_name(raw_allele, omit_dra1)
70 if raw_allele in _normalized_allele_cache[omit_dra1]:
71 return _normalized_allele_cache[omit_dra1][raw_allele]
---> 72 parsed_alleles = parse_classi_or_classii_allele_name(raw_allele)
73 species = parsed_alleles[0].species
74 normalized_list = [species]
/Users/tim/sinai/git/mhcnames/mhcnames/class2.py in parse_classi_or_classii_allele_name(name)
49 "Allele has too many parts: %s" % name)
50 if len(parts) == 1:
---> 51 parsed = parse_allele_name(name, species)
52 if parsed.species == "HLA" and parsed.gene.startswith("DRB"):
53 alpha = AlleleName(
/Users/tim/sinai/git/mhcnames/mhcnames/allele_name.py in parse_allele_name(name, species_prefix)
118 raise AlleleParseError("No MHC gene name given in %s" % original)
119 if len(name) == 0:
--> 120 raise AlleleParseError("Malformed MHC type %s" % original)
121
122 gene = gene.upper()
AlleleParseError: Malformed MHC type 01
@iskandr is that desired behavior?
If not, we should probably open an mhcnames issue.
Closing this. SLA-01 is apparently a locus (e.g. HLA-A), not really an allele, so I think it's not obviously incorrect for mhcnames to raise an error in this case.
This should be more error-tolerant, or have a more informative error message:
(In this case, the offending allele in the input turned out to be "HLA-SLA01"):
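One way the "more informative error message" request could be approached is sketched below. This is only a rough illustration, not how the project actually handles it: `normalize_alleles_tolerantly` and `format_failures` are hypothetical helper names, and the normalizer is passed in as a parameter (for example `mhcnames.normalize_allele_name`) so the sketch does not assume any particular mhcnames version or exception class.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def normalize_alleles_tolerantly(
    raw_alleles: Iterable[str],
    normalize: Callable[[str], str],
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Normalize each allele name, collecting failures instead of
    stopping at the first one.

    `normalize` is whatever normalizer is in use, e.g.
    mhcnames.normalize_allele_name (an assumption; any callable that
    takes a string and returns a string works here).
    """
    normalized = {}
    failures = []
    for raw in raw_alleles:
        try:
            normalized[raw] = normalize(raw)
        except Exception as exc:
            # Deliberately broad: the traceback shows AlleleParseError,
            # but this sketch avoids importing it. Record the raw input
            # alongside the parser's own message.
            failures.append((raw, "%s: %s" % (type(exc).__name__, exc)))
    return normalized, failures


def format_failures(failures: List[Tuple[str, str]]) -> str:
    """Build a message that names every offending input verbatim."""
    lines = ["Could not parse %d allele name(s):" % len(failures)]
    for raw, reason in failures:
        lines.append("  input %r -> %s" % (raw, reason))
    return "\n".join(lines)
```

With something like this, the caller sees the original string ("HLA-SLA01") next to the parser's message ("Malformed MHC type 01"), rather than only the fragment left over after the prefix was stripped. Whether to skip the bad alleles or raise once with the combined message is a design choice left to the caller.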