Open hinerm opened 5 years ago
labAssignmentSection
The XML hierarchy of the haplotype allele storage for a single locus is:

testResult (e.g. HLA-A)
  hlaCallList (result combinations)
    hlaTestCall (a specific strand 1 + strand 2 allele combination)
      hlaTestCallGroup (results for a single strand in the combination)

The hlaCallList can contain multiple hlaTestCalls.
When there are multiple entries like this, we should look for a "this was MANUALLY chosen" flag in the footnote list. The hlaTestCall carrying that flag is the one we should pull allele information from.
If an hlaCallList contains multiple hlaTestCalls and none have that footnote, just skip any check that includes that locus (e.g. if there are multiple C calls and none were manually selected, we wouldn't compute any B-C haplotype probabilities).
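The selection rule above could be sketched in Java roughly as follows. This is only an illustration using the standard `org.w3c.dom` API. How footnotes are attached to a call is not spelled out here, so the `footnote` element name and the idea that it sits underneath the `hlaTestCall` are assumptions, as is the `HlaCallSelector` class itself.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; not part of the existing code base.
class HlaCallSelector {
    // ASSUMPTION: footnote text is a <footnote> descendant of each hlaTestCall.
    static final String FOOTNOTE_TAG = "footnote";
    static final String MANUAL_MARKER = "MANUALLY chosen";

    /**
     * Returns the hlaTestCall to read alleles from, or empty when the
     * locus is ambiguous and checks involving it should be skipped.
     */
    static Optional<Element> selectCall(Element hlaCallList) {
        List<Element> calls = elements(hlaCallList.getElementsByTagName("hlaTestCall"));
        if (calls.size() == 1) {
            return Optional.of(calls.get(0)); // only one call: nothing to disambiguate
        }
        for (Element call : calls) {
            for (Element note : elements(call.getElementsByTagName(FOOTNOTE_TAG))) {
                if (note.getTextContent().contains(MANUAL_MARKER)) {
                    return Optional.of(call); // the manually chosen combination
                }
            }
        }
        return Optional.empty(); // zero calls, or several with no manual flag
    }

    /** Copies a NodeList of elements into a List for easier iteration. */
    static List<Element> elements(NodeList nodes) {
        List<Element> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add((Element) nodes.item(i)); // getElementsByTagName only yields Elements
        }
        return out;
    }

    /** Parses an XML string, wrapping checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

A caller would treat an empty result as "skip every haplotype check that involves this locus" rather than as an error.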
Referenced here.
XML Sections to parse
It looks like the XML has a labAssignmentSection block at the top which includes the assigned typings. These can be read and passed to a ValidationModelBuilder to populate the core typing data.

For haplotype information, there is a testResultsSection that has all the possible alleles for each locus, which will be used to populate haplotype information.

Implementation Notes
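As a rough sketch of locating the two sections and extracting their text, the following uses the standard Java DOM API. The `SectionReader` class and its method names are hypothetical, and the layout of child elements inside each section is assumed rather than known. Handing the extracted values to the ValidationModelBuilder is left out because its methods are not described here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper; not part of the existing code base.
class SectionReader {
    /** Parses an XML string into a DOM Document, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Returns the first element with the given tag name, or null if absent. */
    static Element firstByTag(Document doc, String tag) {
        NodeList found = doc.getElementsByTagName(tag);
        return found.getLength() == 0 ? null : (Element) found.item(0);
    }

    /**
     * Collects the trimmed text of each direct child element of a section.
     * ASSUMPTION: each child element holds one value of interest as text.
     */
    static List<String> childTexts(Element section) {
        List<String> texts = new ArrayList<>();
        if (section == null) {
            return texts; // missing section: nothing to report
        }
        NodeList children = section.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) { // skip whitespace text nodes
                texts.add(child.getTextContent().trim());
            }
        }
        return texts;
    }
}
```

For example, `SectionReader.childTexts(SectionReader.firstByTag(doc, "labAssignmentSection"))` would give the raw typing strings to pass on to the builder, and the same call with `"testResultsSection"` would be the starting point for the per-locus haplotype data.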
Document, so building the parser is all about finding the appropriate XML Elements, getting their text contents, and passing them appropriately to the builder.

Goals