The-Sequence-Ontology / SO-Ontologies

Collection of SO Ontologies
Creative Commons Attribution 4.0 International

LAMHDI SO review ticket #4: ‘sequence variant’ #319

Open srynobio opened 9 years ago

srynobio commented 9 years ago

The current definition of ‘sequence variant’ is “a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration”.

A. As suggested for ‘sequence alterations’, SO should clarify the ‘extent of sequence’ covered by a variant, and its relation to a ‘sequence alteration’. Our reading of the sequence variant definition and documentation suggests that variants extend beyond the altered sequence to cover some larger sequence region such as a gene (e.g. a gene with a point mutation is a sequence variant of the gene, and the sequence alteration is part_of the variant). Amending the definition to read as follows may help here: “a (non-exact copy of a) sequence feature or genome, which exhibits one or more sequence alterations that distinguish it from other variants of that feature or genome with which it is grouped in a variant collection.”

B. Another point of confusion is how a variant differs from the more fundamental type of feature that it is a variant of (i.e. how is an instance of a ‘gene structure variant’ different from a ‘gene’?). One view is that the distinction results from a variant being considered in the context of some larger collection of variants of a particular sequence feature (i.e. a ‘variant collection’). If this is the case, adding an explanatory comment may offer further clarification: “Sequence variants represent sequence features or collections (typically genes or genomes, respectively) that are considered in the context of some collection of variants of a particular feature or genome. A sequence variant exhibits one or more sequence alterations that distinguish it from other variants in this collection. The extent of a sequence variant is some definable sequence feature or collection (e.g. a gene, transcript, promoter, or genome) that contains a sequence alteration. For example, a ‘transcript stability variant’ refers to the entire transcript sequence that contains some alteration affecting the stability of the transcript. Thus, the extent of a variant is greater than that of the sequence alteration that results in the existence of the variant, and that is a part_of the variant.”

C. The addition of superclass axioms may further clarify the distinctions above in a precise and computable manner:

has_part some ‘sequence alteration’

member_of some ‘variant collection’

Also, specific variant classes could be given axioms indicating them to be a ‘variant_of’ the appropriate more fundamental class (e.g. ‘gene structure variant’ would contain an axiom that it is a ‘variant_of’ some ‘gene’).

D. Finally, if it is the case that all instances of a particular class of ‘sequence variant’ are also instances of the more fundamental class whose instances they are variants of (e.g. that all instances of ‘gene structure variant’ are also instances of the class ‘gene’), SO could implement this multiple parentage. For example, ‘sequence variant’ could be made a defined class (def = any sequence that is a “variant_of some sequence feature”). Specific ‘sequence variant’ classes could then be asserted as their more fundamental type, and inferred to be ‘sequence variants’ (e.g. ‘gene structure variant’ asserted as a type of ‘gene’, and inferred as a type of ‘sequence variant’). Of course, it may be that ‘sequence variants’ are not sequences in the same sense as ‘sequence regions’, in which case this multiple parentage would be inappropriate.
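The superclass axioms proposed in point C can be sketched in plain Python to show how a specific variant class would inherit them. This is only an illustration under stated assumptions: the `Restriction` and `OntologyClass` types are hypothetical helpers invented here, not part of SO or of any ontology library, and the relation and class names are taken from the proposal text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Restriction:
    """Hypothetical helper: an existential restriction, `relation some filler`."""
    relation: str
    filler: str

    def __str__(self):
        return f"{self.relation} some '{self.filler}'"


@dataclass
class OntologyClass:
    """Hypothetical helper: a named class with superclass axioms and one parent."""
    name: str
    superclass_axioms: list = field(default_factory=list)
    parent: "OntologyClass | None" = None

    def all_axioms(self):
        """Return inherited axioms first, then those asserted on this class."""
        inherited = self.parent.all_axioms() if self.parent else []
        return inherited + self.superclass_axioms


# Axioms proposed in point C for the top-level class.
sequence_variant = OntologyClass(
    "sequence variant",
    [
        Restriction("has_part", "sequence alteration"),
        Restriction("member_of", "variant collection"),
    ],
)

# A specific variant class adds its own 'variant_of' axiom.
gene_structure_variant = OntologyClass(
    "gene structure variant",
    [Restriction("variant_of", "gene")],
    parent=sequence_variant,
)

for axiom in gene_structure_variant.all_axioms():
    print(axiom)
```

Under this reading, ‘gene structure variant’ carries all three restrictions: the two inherited from ‘sequence variant’ plus its own ‘variant_of’ axiom.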

srynobio commented 9 years ago

Response from Matthew Brush

With respect to point D above: Upon further consideration this is not advised, as some alterations that create variants will result in a sequence that is no longer functional and therefore no longer represents an instance of the type it is 'variant_of'. For example, consider an alteration that creates a null allele of a gene (i.e. an inactive version): the resulting sequence does not qualify as a 'gene', as it does not produce a functional transcript. This idea may also have relevance for point B above, in that specifying variants to be 'considered as part of some variant collection' may not be necessary to explain why variants are different from the sequences they are 'variant_of'.
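The objection can be made concrete with a toy check. The sketch below assumes, following the comment above, that producing a functional transcript is the criterion for being a 'gene'; the `SequenceInstance` type, the two example instances, and the flag values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SequenceInstance:
    """Hypothetical record for a sequence that is a variant of some class."""
    label: str
    variant_of: str  # the more fundamental class, e.g. "gene"
    produces_functional_transcript: bool


def qualifies_as_gene(seq):
    # Assumed criterion from the comment: a 'gene' produces a functional transcript.
    return seq.produces_functional_transcript


def is_sequence_variant(seq):
    # Point D's defined class: anything that is variant_of some sequence feature.
    return bool(seq.variant_of)


# Illustrative instances; the flag values are assumptions, not data.
instances = [
    SequenceInstance("gene with a point mutation", "gene", True),
    SequenceInstance("null allele", "gene", False),
]

# Variants of a gene that would NOT themselves be instances of 'gene'.
counterexamples = [
    s.label
    for s in instances
    if is_sequence_variant(s) and s.variant_of == "gene" and not qualifies_as_gene(s)
]
print(counterexamples)
```

Any entry in `counterexamples` is a variant of a gene that fails the assumed 'gene' criterion, which is why asserting 'gene structure variant' as a subclass of 'gene' would be unsound under this reading.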

srynobio commented 9 years ago

Response from @keilbeck

Hi Matt, since we got the travel grant to work with you on these issues, I am making all the LAMHDI requests 'pending'. --K