Open kshefchek opened 5 years ago
The notion of a genotype as defined in GENO describes variation in the state of the genome of a particular cell. When a genotype is linked to an organism, there is an implicit assumption that all cells in the organism share the same genome/genotype. This assumption overlooks the possibility that somatic mutation, mosaicism, or chimerism can produce organisms composed of distinct populations of cells with different genomes/genotypes. The mosaic ClinVar record @kshefchek references here is a real use case that challenges this simplifying assumption. So the question is how to handle this ClinVar scenario in our modeling, where we want to provide some 'variation entity' to annotate as pathogenic for Ehlers-Danlos syndrome (EDS).
For your purposes @kshefchek, would you be happy if GENO provided a class that represented the combined genotype information for genetically different populations of cells within a single organism (e.g. two lines of cells comprising a mosaic or chimeric organism)? Instances of this class would then be what you would annotate to a disease in cases like the ClinVar record above.
Not sure yet what we would call this class, or how we would classify it (as a special case of a genotype, or a set of genotypes as found in different populations of cells in a single organism) - but we can deal with this once we settle on the concept that needs to be represented.
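To make the concept a bit more concrete, here is a rough Python sketch of what such a combined entity might hold. Everything in it is hypothetical: the class names, the identifiers, and the optional per-population cell fraction are placeholders for discussion, not existing GENO terms, and the example values are illustrative rather than taken from the ClinVar record.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class CellPopulationGenotype:
    """Genotype of one genetically distinct population of cells (hypothetical name)."""
    label: str                             # free-text description of the population
    genotype_id: str                       # placeholder identifier, not a real GENO/ClinVar ID
    cell_fraction: Optional[float] = None  # share of cells (0..1), if known


@dataclass
class OrganismGenotypeSet:
    """Combined genotype information for a single organism (hypothetical name)."""
    organism_id: str
    populations: List[CellPopulationGenotype] = field(default_factory=list)

    def is_mosaic_or_chimeric(self) -> bool:
        # True when the organism's cell populations carry more than one genotype.
        return len({p.genotype_id for p in self.populations}) > 1

    def fractions_are_consistent(self, tol: float = 1e-6) -> bool:
        # Only checkable when every population has a known fraction;
        # otherwise there is nothing to contradict, so return True.
        fractions = [p.cell_fraction for p in self.populations]
        if any(f is None for f in fractions):
            return True
        return abs(sum(fractions) - 1.0) <= tol


# Illustrative instance: two cell lines in one organism (made-up values).
example = OrganismGenotypeSet(
    organism_id="organism-1",
    populations=[
        CellPopulationGenotype("reference cells", "genotype:ref", 0.7),
        CellPopulationGenotype("variant-carrying cells", "genotype:var", 0.3),
    ],
)
```

An instance like `example` would be the thing annotated to a disease. Whether it ends up modeled as a special kind of genotype or as a set of genotypes is exactly the open classification question, and the sketch does not settle it.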
I'm interested in modelling this ClinVar record: https://www.ncbi.nlm.nih.gov/clinvar/RCV000087646/
I assume mosaicism is more accurately described as a mode of inheritance rather than a genotype, but having some object and/or datatype properties specific to mosaic inheritance could be useful, for example the percentage of cells carrying one genotype versus another.