Closed ahwagner closed 7 months ago
If a choice is arbitrary, then policy and/or guidance would certainly help.
Ideally, the representations should be bidirectionally transformable. If only unidirectional transformation is possible, then the common representation could serve as the canonical, normalized form and the other could be a convenience.
Transformations are bidirectional for this. It is a longstanding issue in the field, addressing the question of "when does a variant become an SV?"
In most contexts, it is just an arbitrary length cutoff: <n is a small variant (Allele), >=n is a structural variant (Adjacencies). 50 is a typical value for n.
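To make the cutoff concrete, here is a minimal sketch of that rule. The function name `classify_by_length` is hypothetical and not part of any VRS library, and taking the variant "length" as the larger of the reference span and the alternate sequence length is an assumption; the thread does not say how length is measured.

```python
def classify_by_length(ref_len: int, alt_len: int, cutoff: int = 50) -> str:
    """Pick a representation from an arbitrary length cutoff.

    Hypothetical helper, not a VRS API. Assumption: the variant size is
    the larger of the reference span and the alternate sequence length.
    """
    if ref_len < 0 or alt_len < 0:
        raise ValueError("lengths must be non-negative")
    size = max(ref_len, alt_len)
    # < cutoff: small variant (Allele); >= cutoff: structural variant (Adjacency)
    return "Allele" if size < cutoff else "Adjacency"


print(classify_by_length(1, 1))    # an SNV
print(classify_by_length(0, 300))  # a 300-residue insertion
```

The cutoff is a parameter rather than a constant because the value is a convention, not something the data model dictates.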
You can imagine that an SNV in VRS can be represented as an Allele located at start: x, end: y with state s. But it could just as easily be an Adjacency ending at x, beginning at y with a linker sequence of s. The first form is much more compact, but either is valid. So we need guidance on when one SHOULD use one form or the other.
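As a rough illustration of the two forms and the transformation between them, here is a sketch using simplified stand-in classes. `SimpleAllele` and `SimpleAdjacency` are hypothetical and deliberately much smaller than the real VRS schema; the field names are assumptions chosen to mirror the description above (start x, end y, state or linker s).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleAllele:
    """Simplified stand-in for an Allele (not the VRS schema)."""
    sequence_id: str
    start: int   # x
    end: int     # y
    state: str   # s


@dataclass(frozen=True)
class SimpleAdjacency:
    """Simplified stand-in for an Adjacency (not the VRS schema)."""
    left_sequence_id: str
    left_end: int      # left segment ends at x
    right_sequence_id: str
    right_start: int   # right segment begins at y
    linker: str        # s


def allele_to_adjacency(allele: SimpleAllele) -> SimpleAdjacency:
    # The replaced span becomes the gap between the two segments,
    # and the allele state becomes the linker sequence.
    return SimpleAdjacency(
        allele.sequence_id, allele.start,
        allele.sequence_id, allele.end,
        allele.state,
    )


def adjacency_to_allele(adj: SimpleAdjacency) -> SimpleAllele:
    # Only adjacencies on a single sequence, with the left segment ending
    # at or before the right segment begins, have an Allele form here.
    if adj.left_sequence_id != adj.right_sequence_id:
        raise ValueError("segments lie on different sequences")
    if adj.left_end > adj.right_start:
        raise ValueError("segments are not in forward order")
    return SimpleAllele(
        adj.left_sequence_id, adj.left_end, adj.right_start, adj.linker
    )


# An SNV replacing one residue with "T" (placeholder sequence id).
snv = SimpleAllele("seq-example", 100, 101, "T")
assert adjacency_to_allele(allele_to_adjacency(snv)) == snv
```

In this sketch every Allele maps to an Adjacency, while the reverse holds only for the same-sequence, forward-ordered case, which is the case the SNV example falls into.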
While I think we can and should define the issue clearly and state what is commonly done, I don't think it is our job to define a standard policy. We can define the rule we use in our own implementations, such as the vrs-Python APIs. We could possibly go as far as recommending how it should be done, but ACMG or similar groups can work to define a credible policy that the global community might adopt. I don't want us to spend too much time defining a policy that is really only a recommendation until a credited agency can weigh in with more formal guidance.
In VRS, we use Allele objects for small variant representations, and Adjacencies for "structural" variation. But the distinction between the two has always been arbitrary. We should have a policy, or at least guidance, on when to use one structure or the other.

For assayed variants, it could be as simple as a short statement, e.g.: if the variant starts and ends on the same reference sequence and is fully spanned by the assay technology, use Alleles. If not, use Adjacencies.
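The proposed statement for assayed variants could be sketched as a two-condition check. The function `choose_form` and its parameter names are hypothetical, and how a caller decides that a variant is "fully spanned" by the assay is left as an input, since that is not defined here.

```python
def choose_form(
    start_sequence_id: str,
    end_sequence_id: str,
    fully_spanned_by_assay: bool,
) -> str:
    """Apply the proposed rule for assayed variants (hypothetical helper)."""
    same_sequence = start_sequence_id == end_sequence_id
    # Both conditions must hold to use an Allele; otherwise use an Adjacency.
    if same_sequence and fully_spanned_by_assay:
        return "Allele"
    return "Adjacency"


print(choose_form("seq-a", "seq-a", True))   # same sequence, fully spanned
print(choose_form("seq-a", "seq-b", True))   # breakpoints on two sequences
```

Unlike a length cutoff, this rule depends on the assay rather than on the variant size, so the same event could be reported in either form depending on the technology used.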
We might also consider additional exposition to cover other cases.