UCSC-MedBook / patient-care

Clinician-facing portal showing pathways, signatures, and genes of interest

Need dynamic strategy for multiple variants #92

Open tedgoldstein opened 8 years ago

tedgoldstein commented 8 years ago

Here is another bioinformatics issue for data sets: handling multiple co-resident variants. For example, there are (at least) six variants of the A1CF gene (APOBEC1 complementation factor). Most people have all of these variants.

| NCBI label | Hugo label |
| --- | --- |
| NM_138933 | A1CF |
| NM_014576 | A1CF |
| NM_138932 | A1CF |
| NM_001198820 | A1CF |
| NM_001198818 | A1CF |
| NM_001198819 | A1CF |

Sometimes multiple variants need to be treated as separate genes; sometimes they should be averaged and treated as one gene. There are probably other strategies.
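
A minimal sketch of what a "dynamic strategy" could look like, assuming per-variant numeric values keyed by NCBI label. The function name `collapse_variants`, the strategy names `"separate"` and `"mean"`, and the `GENE:LABEL` key format are all hypothetical and not part of any MedBook code; only the label-to-gene mapping comes from the table above.

```python
from collections import defaultdict
from statistics import mean

# NCBI label -> Hugo label, copied from the table in this issue.
VARIANT_TO_GENE = {
    "NM_138933": "A1CF",
    "NM_014576": "A1CF",
    "NM_138932": "A1CF",
    "NM_001198820": "A1CF",
    "NM_001198818": "A1CF",
    "NM_001198819": "A1CF",
}


def collapse_variants(values, variant_to_gene, strategy="mean"):
    """Combine per-variant values according to a named strategy (hypothetical API).

    "separate": keep every variant as its own entry, keyed "GENE:LABEL".
    "mean":     average all variants of a gene into a single entry.
    """
    if strategy == "separate":
        return {f"{variant_to_gene[label]}:{label}": value
                for label, value in values.items()}
    if strategy == "mean":
        grouped = defaultdict(list)
        for label, value in values.items():
            grouped[variant_to_gene[label]].append(value)
        return {gene: mean(vals) for gene, vals in grouped.items()}
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Passing the strategy as a parameter leaves room for the "other strategies" mentioned above (for example, a maximum or a sum) without changing callers.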

Rob and Holly should comment.

rbaertsch commented 8 years ago

We should store the number of variants per gene and also store the individual variants. Protein-coding variants are commonly stored as an offset, and each isoform could have a different offset. For non-coding variants, store the offset as well. It is also important to store the truncating variants.
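
One way this storage suggestion might be modeled, as a rough sketch only. The class and field names (`Variant`, `GeneVariants`, `offset`, `coding`, `truncating`) are hypothetical, the meaning of "offset" is assumed to be an integer position relative to the isoform, and the offset values in the example are placeholders rather than real coordinates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variant:
    transcript_id: str        # isoform the offset refers to, e.g. "NM_138933"
    offset: int               # assumed: integer position relative to that isoform
    coding: bool              # True for protein-coding, False for non-coding
    truncating: bool = False  # flagged so truncating variants are easy to find


@dataclass
class GeneVariants:
    gene: str
    variants: List[Variant] = field(default_factory=list)

    @property
    def variant_count(self) -> int:
        # Number of variants per gene, derived from the stored list.
        return len(self.variants)

    def truncating(self) -> List[Variant]:
        # Only the variants marked as truncating.
        return [v for v in self.variants if v.truncating]


# Placeholder offsets, for illustration only.
a1cf = GeneVariants("A1CF", [
    Variant("NM_138933", offset=100, coding=True),
    Variant("NM_014576", offset=100, coding=True, truncating=True),
])
```

Deriving `variant_count` from the list keeps the count and the individual records from drifting apart; a stored counter would be an alternative if the records live in separate collections.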

hbeale commented 8 years ago

I'm not sure I understand the scope of the question, but everything Robert says sounds good to me. The variant nomenclature scheme at http://varnomen.hgvs.org/ is clunky but pretty good.

mokolodi1 commented 7 years ago

@tedgoldstein are the things in the first column of that table transcript labels? It doesn't seem to me that this has a compelling use case in our current roadmap.