dgidb / dgidb-v5

Providing interactions between drugs and genes sourced from a variety of publications and knowledgebases
https://dgidb.org
MIT License

Category structure/organization #132

Open cjosu opened 2 years ago

cjosu commented 2 years ago

At the 5/16 meeting, we spoke about ways to reorganize Druggable Gene Categories for v5. I've done a preliminary analysis, mostly for my own benefit, but hopefully it spurs ideas about how we could update the approach.

First, let me address a point I need clarification on. Across all the sources, there seem to be at least two distinct notions of what constitutes a category, plus a third group covering everything else.

1. Clinically Actionable + Drug Resistance

Between CarisMolecularIntelligence, FoundationOneGenes, Oncomine, and Tempus, there is only one category (CLINICALLY ACTIONABLE):

image

CIViC and Cosmic also contain CLINICALLY ACTIONABLE, but add DRUG RESISTANCE:

image

2. Druggable Genome

HingoraniCasas and Russ Lampel both have DRUGGABLE GENOME as their sole category:

image

(Note: Go also contains DRUGGABLE GENOME as a category, while introducing many more unique categories.)

3. All the rest

The remaining categories seem to be more comparable across the remaining sources (although the clear majority derive from two places: Go and HopkinsGroom).

image
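For reference, the grouping in sections 1 and 2 can be reproduced mechanically by bucketing sources that share an identical set of categories. This is only a minimal sketch: the dictionary is typed in by hand from the description above (it is not pulled from the DGIdb database), the function name `group_by_signature` is made up for illustration, and the sources with long category lists (Go, HopkinsGroom, etc.) are left out.

```python
from collections import defaultdict

# Hand-entered from the description above; NOT queried from DGIdb.
# Sources with many categories (e.g. Go, HopkinsGroom) are omitted.
SOURCE_CATEGORIES = {
    "CarisMolecularIntelligence": {"CLINICALLY ACTIONABLE"},
    "FoundationOneGenes": {"CLINICALLY ACTIONABLE"},
    "Oncomine": {"CLINICALLY ACTIONABLE"},
    "Tempus": {"CLINICALLY ACTIONABLE"},
    "CIViC": {"CLINICALLY ACTIONABLE", "DRUG RESISTANCE"},
    "Cosmic": {"CLINICALLY ACTIONABLE", "DRUG RESISTANCE"},
    "HingoraniCasas": {"DRUGGABLE GENOME"},
    "Russ Lampel": {"DRUGGABLE GENOME"},
}


def group_by_signature(source_categories):
    """Group sources whose category sets are exactly the same."""
    groups = defaultdict(list)
    for source, categories in source_categories.items():
        # frozenset is hashable, so the whole category set can be a dict key.
        groups[frozenset(categories)].append(source)
    return {signature: sorted(sources) for signature, sources in groups.items()}


groups = group_by_signature(SOURCE_CATEGORIES)
for signature, sources in groups.items():
    print(sorted(signature), "->", sources)
```

With the entries above this yields three buckets, matching the split described in sections 1 and 2 (CLINICALLY ACTIONABLE only, CLINICALLY ACTIONABLE plus DRUG RESISTANCE, and DRUGGABLE GENOME only).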

Question:

Are we correct in displaying categories like CLINICALLY ACTIONABLE, DRUG RESISTANCE, or DRUGGABLE GENOME as distinct categories from all the rest? Or do they potentially encompass many of the other categories?

UNIQUE CATEGORIES:

I'm also including a list of categories that are present in just one source. We might start with a couple of these when consolidating items into more general groupings, although some of them (ENZYME?) might represent broader categories in themselves.

Go: DNA REPAIR, GROWTH FACTOR, HISTONE MODIFICATION, HORMONE ACTIVITY, RNA DIRECTED DNA POLYMERASE, SERINE THREONINE KINASE, TRANSCRIPTION FACTOR BINDING, TRANSCRIPTION FACTOR COMPLEX, TYROSINE KINASE

HopkinsGroom: B30_2 SPRY DOMAIN, CYTOCHROME, DNA DIRECTED RNA POLYMERASE, EXCHANGER, FIBRINOGEN, LIPASE, THIOREDOXIN

HumanProteinAtlas: ENZYME
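In case it helps anyone repeat this, a list like the one above can be derived by counting how many sources each category appears in and keeping the ones with a count of one. The snippet below is a rough sketch under stated assumptions: `categories_unique_to_one_source` is a hypothetical helper, and the input is a small hand-picked subset of the categories named in this issue, not the full DGIdb source-to-category mapping.

```python
from collections import Counter

# Small hand-picked subset of categories mentioned in this issue;
# the real per-source lists are much longer.
EXAMPLE_SOURCE_CATEGORIES = {
    "Go": {"DRUGGABLE GENOME", "DNA REPAIR", "TYROSINE KINASE"},
    "HopkinsGroom": {"CYTOCHROME", "LIPASE"},
    "HumanProteinAtlas": {"ENZYME"},
    "HingoraniCasas": {"DRUGGABLE GENOME"},
}


def categories_unique_to_one_source(source_categories):
    """Return {source: sorted categories that appear in no other source}."""
    # Count the number of sources each category appears in.
    counts = Counter(
        category
        for categories in source_categories.values()
        for category in set(categories)
    )
    unique = {}
    for source, categories in source_categories.items():
        only_here = sorted(c for c in categories if counts[c] == 1)
        if only_here:  # skip sources with no single-source categories
            unique[source] = only_here
    return unique


unique = categories_unique_to_one_source(EXAMPLE_SOURCE_CATEGORIES)
for source, categories in unique.items():
    print(f"{source}: {', '.join(categories)}")
```

In this subset DRUGGABLE GENOME is shared by Go and HingoraniCasas, so it is dropped and HingoraniCasas does not appear in the result at all.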

I'll continue to update this issue as I mock up the page this week.

malachig commented 2 years ago

It might be helpful to include some discussion/commentary on the druggable genome page that tries to place these various sources and sub-categories in context. My recollection of the history of developing this is that we started with the Hopkins/Groom list. We just wanted to make that idea more accessible, update linkage to gene identifiers, etc. Then we integrated it with updates that had the same premise (Russ/Lampel). The idea is that these are genes with biochemical or cell biology features that make them inherently promising as drug targets, irrespective of whether any drug has actually been identified to target them yet.

Then we added GO as a way of extending this idea. Druggable genes according to biochem/cell biology properties are enriched for certain categories of genes, and GO provides a source of gene categories that is constantly being updated. We manually curated the categories in GO that seemed to line up with the philosophy of Hopkins/Groom and Russ/Lampel.

Then we had the thought that other sources were developing lists of genes with documented clinical actionability (knowledgebases) or presumed clinical actionability (sequencing panels that include them). These could probably be separated into two piles: genes with established clinical actionability vs. those with possible clinical actionability. Defining those would be a bit of a project.

Some additional quick thoughts: