lgragert closed 9 months ago
Top priority - we should make a version of the output files that excludes the two-field allele mismatch columns and provides only ARD-level allele mismatch, so as not to confuse Keith and Ryan.
This won't be as complicated as allele-level TRS, because we're choosing one pair per multiple imputation replicate.
Created columns that begin with ARD_ to distinguish them from the two-field-level typing columns (where * = locus):
Typing columns: ARD_REC_*_1, ARD_REC_*_2, ARD_DON_*_1, ARD_DON_*_2
Allele MM columns: ARD_*_ALLELE_MM
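For reference, the naming scheme can be written out per locus. This is only a sketch: the helper name `ard_column_names` is hypothetical, and it assumes that "_1,2" means the suffixes `_1` and `_2`.

```python
def ard_column_names(locus):
    """Return the ARD-level column names for one locus, e.g. "A" or "DRB1".

    Hypothetical helper; follows the ARD_REC_*_1,2 / ARD_DON_*_1,2 /
    ARD_*_ALLELE_MM pattern described in this issue.
    """
    typing = [
        f"ARD_{role}_{locus}_{i}"
        for role in ("REC", "DON")  # recipient first, then donor
        for i in (1, 2)             # the two typings per locus
    ]
    return typing + [f"ARD_{locus}_ALLELE_MM"]


print(ard_column_names("DRB1"))
```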
The full allele-level mismatch variables have limited utility because of limitations of the NMDP typing data and haplotype frequencies.
Modify srtr_hla_antigen_mm.py to also compute ARD-level allele mismatch. Roll up alleles to the ARD level using pyARD (lgx redux type), then compare the strings. https://github.com/nmdp-bioinformatics/py-ard
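A minimal sketch of the roll-up-and-compare step, not the actual code in srtr_hla_antigen_mm.py. The function names are hypothetical. The counting rule (distinct donor ARD alleles absent from the recipient) is an assumption and may need to change to match the existing mismatch convention. The py-ard calls `pyard.init()` and `ard.redux(allele, "lgx")` follow the current py-ard README; older releases may differ.

```python
from typing import Callable, Iterable


def make_pyard_reducer() -> Callable[[str], str]:
    """Build an lgx reducer backed by py-ard (needs `pip install py-ard`)."""
    import pyard  # imported lazily so the comparison logic runs without it

    ard = pyard.init()  # loads (or builds on first use) the local reference DB
    return lambda allele: ard.redux(allele, "lgx")


def ard_allele_mm(
    donor: Iterable[str],
    recipient: Iterable[str],
    reduce: Callable[[str], str],
) -> int:
    """Count distinct donor alleles, rolled up by `reduce`, that the recipient lacks.

    Assumption: mismatches are counted in the donor-to-recipient direction and
    a homozygous donor contributes at most one mismatch.
    """
    recipient_ard = {reduce(a) for a in recipient}
    donor_ard = {reduce(a) for a in donor}
    return len(donor_ard - recipient_ard)  # plain string comparison after roll-up


# Stand-in reducer for illustration only: truncating to two fields is NOT the
# real lgx reduction. Use make_pyard_reducer() for actual ARD roll-up.
def two_field_stub(allele: str) -> str:
    return ":".join(allele.split(":")[:2])


mm = ard_allele_mm(
    donor=["A*01:01:01:01", "A*02:01:01"],
    recipient=["A*01:01:02", "A*03:01"],
    reduce=two_field_stub,
)
print(mm)  # 1: only the donor's A*02:01 is missing from the recipient
```

Passing the reducer in as a parameter keeps the comparison logic testable without downloading the py-ard reference data. In the real script, `reduce=make_pyard_reducer()` would be created once and reused for every donor-recipient pair.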