exomiser / Exomiser

A Tool to Annotate and Prioritize Exome Variants
https://exomiser.readthedocs.io
GNU Affero General Public License v3.0

Possible changes to Genomiser filtering and viz #219

Open damiansm opened 7 years ago

damiansm commented 7 years ago

Current non-coding (nc) variant behaviour is:

  1. Any variant that falls within a known regulatory region (the regulatory_regions db table, built from FANTOM and the Ensembl regulatory build) AND has an effect of INTERGENIC_VARIANT or UPSTREAM_GENE_VARIANT gets its effect changed to REGULATORY_REGION_VARIANT. This stops it being removed in step 3. (Does Jannovar ever assign this effect itself? I don't think so.)
  2. Any variant with an effect of REGULATORY_REGION_VARIANT gets reassigned to the gene in the TAD with the best phenotype score.
  3. The regulatoryFeature filter removes any variant that has an effect of INTERGENIC_VARIANT or UPSTREAM_GENE_VARIANT AND is >= 20 kb away from a gene.
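To make the three steps concrete, here is a minimal Python sketch of the behaviour described above. It is not Exomiser's actual (Java) implementation: the `Variant` class, its fields, and the function names are hypothetical, and the TAD gene scores are passed in as a plain dict.

```python
from dataclasses import dataclass, replace

# Effects that steps 1 and 3 act on, as named in the list above.
RELABELLED_EFFECTS = {"INTERGENIC_VARIANT", "UPSTREAM_GENE_VARIANT"}
MAX_DISTANCE_BP = 20_000  # the 20 kb cut-off from step 3


@dataclass(frozen=True)
class Variant:
    # Hypothetical, simplified stand-in for a variant record.
    effect: str
    gene: str
    distance_to_gene: int        # distance in bp to the assigned gene
    in_regulatory_region: bool   # True if found in the regulatory_regions table


def relabel_regulatory(v: Variant) -> Variant:
    """Step 1: relabel intergenic/upstream variants that hit a known regulatory region."""
    if v.in_regulatory_region and v.effect in RELABELLED_EFFECTS:
        return replace(v, effect="REGULATORY_REGION_VARIANT")
    return v


def reassign_to_best_tad_gene(v: Variant, tad_gene_scores: dict) -> Variant:
    """Step 2: move regulatory variants to the TAD gene with the best phenotype score."""
    if v.effect == "REGULATORY_REGION_VARIANT" and tad_gene_scores:
        best_gene = max(tad_gene_scores, key=tad_gene_scores.get)
        return replace(v, gene=best_gene)
    return v


def passes_regulatory_feature_filter(v: Variant) -> bool:
    """Step 3: drop intergenic/upstream variants that are >= 20 kb from a gene."""
    return not (v.effect in RELABELLED_EFFECTS and v.distance_to_gene >= MAX_DISTANCE_BP)
```

Because step 1 changes the effect before step 3 runs, a distant intergenic variant inside a known regulatory region is no longer matched by the filter, which is the protection described in step 1.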

Max's preferred behaviour

  1. Reassign variants to the best-scoring gene in the TAD for most nc variant effects:
    • CODING_TRANSCRIPT_INTRON_VARIANT
    • CONSERVED_INTERGENIC_VARIANT
    • CONSERVED_INTRON_VARIANT
    • DOWNSTREAM_GENE_VARIANT
    • INTERGENIC_REGION
    • INTERGENIC_VARIANT
    • INTRAGENIC_VARIANT
    • INTRON_VARIANT
    • NON_CODING_TRANSCRIPT_INTRON_VARIANT
    • REGULATORY_REGION_VARIANT
    • TF_BINDING_SITE_VARIANT
    • UPSTREAM_GENE_VARIANT
  2. Don't filter variants for being >= 20 kb from a gene and outside any FANTOM/Ensembl regulatory build feature; instead, filter on the ReMM score (remove variants with ReMM < 0.5).
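The proposal above could be sketched roughly as follows. This is an illustrative Python outline, not Exomiser code: the function names are hypothetical, the effect names are copied from the list above, and it assumes that the ReMM cut-off applies only to these nc effects and that variants scoring below 0.5 are the ones removed.

```python
# Effects that would be reassigned to the best gene in the TAD (from the list above).
NON_CODING_EFFECTS = frozenset({
    "CODING_TRANSCRIPT_INTRON_VARIANT",
    "CONSERVED_INTERGENIC_VARIANT",
    "CONSERVED_INTRON_VARIANT",
    "DOWNSTREAM_GENE_VARIANT",
    "INTERGENIC_REGION",
    "INTERGENIC_VARIANT",
    "INTRAGENIC_VARIANT",
    "INTRON_VARIANT",
    "NON_CODING_TRANSCRIPT_INTRON_VARIANT",
    "REGULATORY_REGION_VARIANT",
    "TF_BINDING_SITE_VARIANT",
    "UPSTREAM_GENE_VARIANT",
})
REMM_THRESHOLD = 0.5  # cut-off suggested in item 2


def best_gene_in_tad(effect: str, current_gene: str, tad_gene_scores: dict) -> str:
    """Item 1: pick the TAD gene with the highest phenotype score for nc effects."""
    if effect in NON_CODING_EFFECTS and tad_gene_scores:
        return max(tad_gene_scores, key=tad_gene_scores.get)
    return current_gene


def passes_remm_filter(effect: str, remm_score: float) -> bool:
    """Item 2: keep nc variants only if ReMM >= 0.5 (assumed scope: nc effects only)."""
    if effect not in NON_CODING_EFFECTS:
        return True
    return remm_score >= REMM_THRESHOLD
```

Compared with the current behaviour, no distance or regulatory-feature lookup is needed here: the decision rests on the variant effect and its ReMM score alone.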

Peter's preferred display behaviour

  1. More detail on what REGULATORY_REGION_VARIANT means and/or a better name, as the other types are regulatory variants as well. Could we link to a suitable external resource such as the Ensembl regulatory build? E.g. http://grch37.ensembl.org/Homo_sapiens/Location/View?r=6%3A7261639-7261639, where the regulatory build is a default track, shows that this variant (marked as regulatory_region_variant for RREB1) is predicted to be in a Promoter Flanking Region. Linking to http://grch37.ensembl.org/Homo_sapiens/Regulation/Summary?db=core;fdb=funcgen;r=6:7261639-7261639;rf=ENSR00000260522 would give more relevant detail, but I'm not sure how to automate it: we would have to store the ENSR ids in the db table.
  2. Provide a more detailed breakdown and visualisation of the various UTR effects, such as upstream ORFs, Kozak sequences etc.
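As a rough illustration of the linking idea in item 1, the two Ensembl URLs quoted there can be rebuilt from a chromosome, a position and (for the second) an ENSR id. This is a Python sketch only: the function names are hypothetical, the URL patterns are copied from the two example links rather than from any documented Ensembl API, and the ENSR id is assumed to have been stored alongside the regulatory region.

```python
from urllib.parse import quote

# Base URL taken from the example links in item 1 (GRCh37 archive site).
ENSEMBL_GRCH37 = "http://grch37.ensembl.org/Homo_sapiens"


def location_view_url(chrom: str, pos: int) -> str:
    """Link to the Location view, where the regulatory build is a default track."""
    region = f"{chrom}:{pos}-{pos}"
    # The example link percent-encodes the colon in the region.
    return f"{ENSEMBL_GRCH37}/Location/View?r={quote(region, safe='')}"


def regulation_summary_url(chrom: str, pos: int, ensr_id: str) -> str:
    """Link to the Regulation summary; needs the ENSR id of the regulatory feature."""
    region = f"{chrom}:{pos}-{pos}"
    return (
        f"{ENSEMBL_GRCH37}/Regulation/Summary"
        f"?db=core;fdb=funcgen;r={region};rf={ensr_id}"
    )
```

The first link needs only the variant coordinates, so it could be generated for any variant; the second depends on having the ENSR id to hand, which is the extra db column mentioned in item 1.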

damiansm commented 7 years ago

@visze @julesjacobsen @pnrobinson I combined the 3 previous issues discussing this into one new issue, as they are all inter-related, to simplify the discussion!