statgen / locuszoom

A JavaScript/d3 embeddable plugin for interactively visualizing statistical genetic data from customizable sources.
https://statgen.github.io/locuszoom/
MIT License

Allow users to filter genes track by biotype #199

Closed (abought closed 4 years ago)

abought commented 4 years ago

Replaces #18

Purpose

Our plot has limited room to display genes in a standard figure, and some users have suggested intelligently choosing which elements are shown, since not all gene entries are of equally broad interest (see the GENCODE list of biotypes). For example, the PheWeb nearest-gene annotator uses a specific subset of them.

The new LZ filtering feature can be used to render only user-selected biotypes.

Requirements:

Rather than dragging and reordering genes, users would select which data to render, based on the intended meaning or most common use case.

abought commented 4 years ago

Goncalo suggests a list similar to PheWeb's.

The protein_coding category is a definite must. Excluding pseudogenes would be nice; we might want to include miRNAs. IG c/d/g are good to include. For RNA, keep Mt_tRNA and Mt_rRNA.

A long-term stretch goal would be a checkbox list for fine-grained control.

For the UI, the options would be labeled "all" or "genes" (the latter excluding micro, non-coding, etc.).

IG_*_gene, protein_coding, TR_*_gene, rRNA, Mt_?RNA
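
A minimal sketch of how the shorthand list above could drive a filter, written in plain JavaScript rather than against any LocusZoom API. It assumes each gene record stores its GENCODE biotype in a field named `gene_type`; that field name, along with `globToRegExp`, `GENES_PATTERNS`, and `filterGenes`, is hypothetical and used only for illustration. Here `*` matches any run of characters and `?` matches exactly one, and matching ignores case so the shorthand can be written loosely.

```javascript
// Hypothetical helper: turn a shell-style glob into an anchored, case-insensitive RegExp.
function globToRegExp(glob) {
    // Escape regex metacharacters, leaving the glob wildcards * and ? untouched.
    const escaped = glob.replace(/[.+^${}()|[\]\\]/g, '\\$&');
    const body = escaped.replace(/\*/g, '.*').replace(/\?/g, '.');
    return new RegExp(`^${body}$`, 'i');
}

// The "genes" option from the comment above, using GENCODE's spelling of each biotype.
const GENES_PATTERNS = ['IG_*_gene', 'protein_coding', 'TR_*_gene', 'rRNA', 'Mt_?RNA'];

// mode 'all' keeps every record; mode 'genes' keeps only records whose biotype matches a pattern.
function filterGenes(genes, mode = 'genes') {
    if (mode === 'all') {
        return genes.slice();
    }
    const matchers = GENES_PATTERNS.map(globToRegExp);
    return genes.filter((gene) => matchers.some((re) => re.test(gene.gene_type)));
}

// Usage with made-up records (names and biotypes are illustrative only).
const sampleGenes = [
    { gene_name: 'A', gene_type: 'protein_coding' },
    { gene_name: 'B', gene_type: 'IG_V_gene' },
    { gene_name: 'C', gene_type: 'IG_V_pseudogene' },
    { gene_name: 'D', gene_type: 'Mt_tRNA' },
    { gene_name: 'E', gene_type: 'rRNA_pseudogene' },
    { gene_name: 'F', gene_type: 'lncRNA' },
];
console.log(filterGenes(sampleGenes, 'genes').map((g) => g.gene_name));
```

Because the patterns are anchored at both ends, `rRNA` does not pick up `rRNA_pseudogene`, and `IG_*_gene` does not pick up `IG_V_pseudogene`.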


pjvandehaar commented 4 years ago

For reference, here are the gene counts per biotype on hg38 for GENCODE v34:

19959   protein_coding
144     IG_V_gene
18      IG_J_gene
14      IG_C_gene
37      IG_D_gene
106     TR_V_gene
79      TR_J_gene
6       TR_C_gene
4       TR_D_gene
22      Mt_tRNA
2       Mt_rRNA
52      rRNA
8       ribozyme
# Total: ~20,000

16899   lncRNA
2212    misc_RNA
1901    snRNA
1881    miRNA
1061    TEC
943     snoRNA
49      scaRNA
5       sRNA
1       vaultRNA
1       scRNA
# Total: ~25,000

10175   processed_pseudogene
2629    unprocessed_pseudogene
927     transcribed_unprocessed_pseudogene
501     rRNA_pseudogene
498     transcribed_processed_pseudogene
188     IG_V_pseudogene
137     transcribed_unitary_pseudogene
97      unitary_pseudogene
42      polymorphic_pseudogene
33      TR_V_pseudogene
18      pseudogene
9       IG_C_pseudogene
4       TR_J_pseudogene
3       IG_J_pseudogene
2       translated_processed_pseudogene
1       translated_unprocessed_pseudogene
1       IG_pseudogene
# Total: ~15,000
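
As a quick arithmetic check on the rounded totals, the three blocks can be summed directly. This is only a sketch: the counts are copied from the listing above, while the group labels `genes`, `noncodingRna`, and `pseudogenes` are hypothetical names chosen here, since the listing itself does not name the groups.

```javascript
// Counts copied from the listing above (GENCODE v34, hg38); group labels are illustrative.
const biotypeGroups = {
    genes: {
        protein_coding: 19959, IG_V_gene: 144, IG_J_gene: 18, IG_C_gene: 14, IG_D_gene: 37,
        TR_V_gene: 106, TR_J_gene: 79, TR_C_gene: 6, TR_D_gene: 4,
        Mt_tRNA: 22, Mt_rRNA: 2, rRNA: 52, ribozyme: 8,
    },
    noncodingRna: {
        lncRNA: 16899, misc_RNA: 2212, snRNA: 1901, miRNA: 1881, TEC: 1061,
        snoRNA: 943, scaRNA: 49, sRNA: 5, vaultRNA: 1, scRNA: 1,
    },
    pseudogenes: {
        processed_pseudogene: 10175, unprocessed_pseudogene: 2629,
        transcribed_unprocessed_pseudogene: 927, rRNA_pseudogene: 501,
        transcribed_processed_pseudogene: 498, IG_V_pseudogene: 188,
        transcribed_unitary_pseudogene: 137, unitary_pseudogene: 97,
        polymorphic_pseudogene: 42, TR_V_pseudogene: 33, pseudogene: 18,
        IG_C_pseudogene: 9, TR_J_pseudogene: 4, IG_J_pseudogene: 3,
        translated_processed_pseudogene: 2, translated_unprocessed_pseudogene: 1,
        IG_pseudogene: 1,
    },
};

// Sum the numeric values of an object.
const sumValues = (obj) => Object.values(obj).reduce((acc, n) => acc + n, 0);

// Exact per-group totals, plus the grand total across all three groups.
const groupTotals = Object.fromEntries(
    Object.entries(biotypeGroups).map(([name, counts]) => [name, sumValues(counts)])
);
const grandTotal = sumValues(groupTotals);

console.log(groupTotals); // genes: 20451, noncodingRna: 24953, pseudogenes: 15265
console.log(grandTotal);  // 60669
```

The exact sums agree with the rounded totals, and the first group accounts for roughly a third of all entries.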

abought commented 4 years ago

Thanks, Peter. These counts could be useful if we decide to add more than two groups of filters (which I think might become a real user request after release).

I'll push a new release to my.locuszoom.org tomorrow AM with this feature, so we can get real user feedback!