nmdp-bioinformatics / py-ard

HLA ARD Reduction in Python
https://py-ard.org/
GNU Lesser General Public License v3.0

`A*68:AJEBX` doesn't contain invalid allele and is not reversible #313

Closed pbashyal-nmdp closed 8 months ago

pbashyal-nmdp commented 8 months ago

I just discovered another example that's making me wonder if there's more to it.

"Round-trip behaviour" as described above by @mmaiers-nmdp works for A*68:AJEBX on https://hml.nmdp.org/MacUI/: that system is perfectly happy to first decode and than encode the result, and does not complain of invalid alleles.

But it fails for py-ard:

ard.lookup_mac(ard.expand_mac("A*68:AJEBX"))
File "~/code/py_ard/venv/lib/python3.11/site-packages/pyard/ard.py", line 867, in lookup_mac
    raise InvalidMACError(f"{allelelist_gl} does not have a MAC.")
pyard.exceptions.InvalidMACError: Invalid MAC Code: A*68:01/A*68:07/A*68:08/A*68:11N/A*68:17/A*68:19/A*68:21/A*68:24/A*68:32/A*68:33/A*68:37/A*68:38/A*68:47/A*68:52/A*68:55/A*68:66/A*68:69/A*68:70/A*68:76/A*68:91/A*68:93/A*68:95/A*68:96/A*68:98/A*68:100/A*68:101/A*68:102/A*68:103/A*68:107/A*68:111/A*68:112/A*68:114/A*68:116/A*68:118/A*68:119/A*68:120N/A*68:121/A*68:123/A*68:132 does not have a MAC.

The decoding result is the same for py-ard and https://hml.nmdp.org/MacUI/ (unlike A*01:AABJE, the example I gave previously), so the difference seems to be in the lookup/encoding. I'm using IPD-IMGT/HLA db 3.54.0 on both py-ard and the website.
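
For anyone who wants to check other codes, here is a minimal sketch of the round-trip test, assuming only that decoding and encoding are available as two callables. The helper `mac_round_trip`, the toy table `_TOY_CODES`, and the functions `toy_expand` and `toy_lookup` are hypothetical names made up for illustration; they are not part of py-ard, and the toy code "AB" is not a real MAC.

```python
from typing import Callable, Optional, Tuple


def mac_round_trip(
    mac: str,
    expand: Callable[[str], str],
    lookup: Callable[[str], str],
) -> Tuple[bool, str, Optional[Exception]]:
    """Decode `mac` into an allele list, then try to encode that list again.

    Returns (round_trip_ok, expanded_allele_list, error_or_None).
    """
    expanded = expand(mac)
    try:
        encoded = lookup(expanded)
    except Exception as exc:  # broad on purpose: the caller decides what to do
        return False, expanded, exc
    return encoded == mac, expanded, None


# Toy stand-in for a MAC table. The code "AB" and its members are invented.
_TOY_CODES = {"AB": ["01", "02"]}


def toy_expand(mac: str) -> str:
    """Expand e.g. 'A*01:AB' into 'A*01:01/A*01:02' using the toy table."""
    prefix, code = mac.rsplit(":", 1)
    return "/".join(f"{prefix}:{field}" for field in _TOY_CODES[code])


def toy_lookup(allele_list: str) -> str:
    """Find the toy code whose members match the allele list exactly."""
    alleles = allele_list.split("/")
    prefix = alleles[0].rsplit(":", 1)[0]
    fields = [allele.rsplit(":", 1)[1] for allele in alleles]
    for code, members in _TOY_CODES.items():
        if members == fields:
            return f"{prefix}:{code}"
    raise LookupError(f"{allele_list} does not have a MAC.")


ok, expanded, error = mac_round_trip("A*01:AB", toy_expand, toy_lookup)
print(ok, expanded, error)
```

With py-ard, the same helper could presumably be called as `mac_round_trip("A*68:AJEBX", ard.expand_mac, ard.lookup_mac)`, where `ard` is the object used in the snippet above; in that case the third element of the result would hold the `InvalidMACError` instead of the traceback being raised.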

Originally posted by @lcreteig in https://github.com/nmdp-bioinformatics/py-ard/issues/249#issuecomment-1981311816