nmdp-bioinformatics / dash

Data Standards Hackathon for NGS based typing.
GNU Lesser General Public License v3.0

HML Spec: If two sequences show up in a consensus-sequence but have different lengths (due to deletions, for instance), is this okay? Remember that there is only a single targeted-region, whose start/end define a specific length that will no longer necessarily match either DNA sequence length. Is it okay if the two sequence lengths differ and do not match the targeted-region? #10

Closed: dvaliga closed this issue 9 years ago

dvaliga commented 9 years ago

In multiple conversations, the consensus was that mismatched sequence lengths are okay, and that targeted-region is the INTENDED region, not necessarily exactly what was reported. The change here might just be to update the documentation to indicate that the region is the INTENDED region. Other options are to add a targeted-region for each sequence, or to use a special character to indicate deletions so that the sizes stay constant.
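To make the "intended region" reading concrete, here is a minimal Python sketch of a length check that reports mismatches instead of rejecting them. Everything in it is an assumption for illustration: the `TargetedRegion` class, the `length_report` helper, the half-open coordinate convention, and the example sequences are hypothetical and not taken from the HML schema or any HML tooling.

```python
from dataclasses import dataclass


@dataclass
class TargetedRegion:
    """Hypothetical stand-in for a targeted-region; not the HML schema."""
    start: int  # assumed 0-based, inclusive
    end: int    # assumed exclusive

    @property
    def length(self) -> int:
        return self.end - self.start


def length_report(region, sequences):
    """Return (sequence length, difference from the intended region length)
    for each sequence. A non-zero difference is reported, not treated as an
    error, because the region is only the INTENDED target."""
    return [(len(seq), len(seq) - region.length) for seq in sequences]


# Two sequences in one consensus-sequence; the second has a 1-base deletion.
region = TargetedRegion(start=100, end=110)
sequences = ["ACGTACGTAC", "ACGTACTAC"]
report = length_report(region, sequences)

for seq_len, diff in report:
    print(f"length={seq_len} difference_from_intended={diff}")
```

Under this reading, a validator would at most warn on a non-zero difference. The alternative options mentioned above (one targeted-region per sequence, or a gap character for deletions) would instead make every difference zero by construction.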

ghost commented 9 years ago

Fixed in HML version 1.0 with cardinality 0..* for consensus-sequence-block elements, see https://github.com/nmdp-bioinformatics/hml/issues/26