Open bnbowman opened 10 years ago
There can be an arbitrary number of "consensus-sequence" elements (1..*) in an HML message. As part of the process of incorporating the MIRING attributes into HML I think we could also include alternates.
What we (really) need is the ability to represent an assembly graph. FASTG is one proposed solution to this; the Global Alliance (GA4GH) is working on another. A goal for HML would be to represent anything these formats can.
Work-in-progress GA4GH variation reference schema: variationReference.avdl
See also http://arxiv.org/abs/1404.5010
E.g. a 3 kb consensus sequence that could contain either "AGGGGGGGGA" or "AGGGGGGGA" at one position but would otherwise be identical.
The current spec does not seem sufficient, since it appears to allow only 1-2 sequences. The simplest solution, supplying all possible sequences, is not possible if (A) the gene is also diploid or (B) there are more than 2 possible sequences (e.g. 2 ambiguous positions).
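To make point (B) concrete, here is a rough sketch of how quickly "supply every possible sequence" outgrows a 1-2 sequence limit. The segment layout, the flanking bases, the second ambiguous run, and the `expand` helper are all made up for illustration; none of this is HML syntax or part of any spec.

```python
from itertools import product


def expand(segments):
    """Enumerate every concrete sequence implied by a list of segments.

    Each segment is a list of alternative strings; an unambiguous
    region is simply a list with one entry. (Hypothetical helper.)
    """
    return ["".join(choice) for choice in product(*segments)]


# Hypothetical consensus with two ambiguous homopolymer runs.
segments = [
    ["ACGT"],                     # fixed flank (made up)
    ["AGGGGGGGGA", "AGGGGGGGA"],  # ambiguous run from the example above
    ["TTCA"],                     # fixed region (made up)
    ["CAAAAT", "CAAAT"],          # second ambiguous run (made up)
    ["GG"],                       # fixed flank (made up)
]

sequences = expand(segments)
print(len(sequences))  # 2 alternatives x 2 alternatives = 4 sequences
```

With two ambiguous positions there are already 4 full-length sequences for a single allele, whereas the segment form stores each alternative only once. That is essentially a very small assembly graph with two "bubbles".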