Closed by AviMaayan 4 years ago
Avi, I've done this, but some of these terms don't match existing tags, so I'm trying to clean up the tags. Let me know if anything is missing or wrong.
I ran a text-similarity clustering on the tags to get:
Cluster centroid: Cluster members
Harmonizome ETL Script: Drugmonizome ETL Script; Harmonizome ETL Script
Mouse Genome Informatics: Mouse Genome Informatics
Venn diagrams: Gene Singatures; Protein kinases; SuperVenn diagrams; Venn diagrams; gene predictions
Fisher's Exact Test: Fisher's Exact Test
Drugmonizome: Drug Repurposing Hub; Drug binding; DrugBank; DrugCentral; Drugmonizome; Harmonizome; drug screen; drug set libraries
RNA-seq: ATC Codes; BioPlex; RNA-Seq; RNA-seq; RNAseq; Reactome; TAS Vectors; miRNA targets; mimickers; reversers; scRNA-seq
enrichment analysis: Enrichment Analysis; Kinase Enrichment Analysis; enrichment analysis; gene network analysis
scatter plot: UpSet plots; manhattan plot; scatter plot; scatterplot; small molecules
canvas: Achilles; Allen Brain Map; BioGPS; BrainSpan Atlas; ClinVar; Databases; GeneRIF; Geneshot; Jensen Lab; L1000FWD; Orphanet; PharmGKB; TargetScanHuman; Visualization; WikiPathways; bar chart; bioinformatics; canvas; drugs; example; hu.MAP; miRTarBase
Gene Expression, Age, Mouse,Human: Gene Expression, Age, Mouse,Human
COSMIC (Mutations): COSMIC (Mutations)
Comparative Toxicogenomics Database (Chemical): Comparative Toxicogenomics Database (Chemical); Comparative Toxicogenomics Database (Disease)
Gene Ontology: GWAS Catalog; Gene Knock-Down; Gene Ontology; Guide to Pharmacology; Human Phenotype Ontology
Cancer Cell Line Encyclopedia: Cancer Cell Line Encyclopedia
Roadmap Epigenomics: Roadmap Epigenomics
Human Metabolome Database: Human Metabolome Database
SIDER: ARCHS4; Bgee; CLIP-SEQ; CORUM; CREEDS; DSigDB; Enrichr; GTEx; KEA3; KINOMEscan; L1000; OFFSIDES; PHGKB; RDKIT; SIDER; STITCH; TCGA; bokeh; ncRNA
The Human Protein Atlas (Immunihistochemistry): The Human Protein Atlas (Immunihistochemistry)
COSMIC (Copy Number Variants): COSMIC (Copy Number Variants)
The Human Protein Atlas (RNA-Seq): The Human Protein Atlas (RNA-Seq)
gene correlation network: gene correlation network
machine learning: Machine Learning; Pathway Commons; machine learning
The Cancer Genome Atlas: The Cancer Genome Atlas
Aging, Dementia, and Traumatic Brain Injury Study: Aging, Dementia, and Traumatic Brain Injury Study
Genetic Association Database: Genetic Association Database
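The clustering method used for the listing above isn't stated. As a rough illustration only, here is a minimal sketch of one way to group near-duplicate tags, assuming a character-level similarity from the standard library's `difflib` and a greedy pass with a fixed threshold. The function names and the `0.8` threshold are hypothetical choices, and this will not reproduce the clusters listed above.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_tags(tags, threshold=0.8):
    """Greedily put each tag into the first cluster whose representative
    (its first member) is at least `threshold` similar; otherwise start
    a new cluster. The threshold is an assumed value, not a tuned one."""
    clusters = []  # list of lists; clusters[i][0] is the representative
    for tag in sorted(tags, key=str.lower):
        for members in clusters:
            if similarity(tag, members[0]) >= threshold:
                members.append(tag)
                break
        else:
            clusters.append([tag])
    return clusters


example = ["RNA-seq", "RNA-Seq", "RNAseq", "scatter plot", "scatterplot", "TCGA"]
for members in cluster_tags(example):
    # First member acts as the centroid, the rest are the cluster members.
    print(f"{members[0]}: {'; '.join(members)}")
```

A pass like this is mostly useful for catching spelling and capitalization variants (RNA-seq / RNAseq); semantically related tags with different spellings still need manual review.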
Below are the substitutions I plan to apply to the tags (left is the existing tag, right is what I'm replacing it with):
s_g = {
'Aging, Dementia, and Traumatic Brain Injury Study': {'Aging'},
'bar chart': set(),
'bioinformatics': set(),
'bokeh': set(),
'canvas': set(),
'Databases': set(),
'Drug binding': {'Pharmacology'},
'drug screen': {'Pharmacology'},
'drug set libraries': {'Pharmacology'},
'Drugmonizome ETL Script': {'Drugmonizome', 'ETL Script'},
'drugs': {'Pharmacology'},
'enrichment analysis': {'Enrichment Analysis'},
'gene correlation network': set(),
'Gene Expression, Age, Mouse,Human': {'Aging'},
'gene network analysis': set(),
'gene predictions': {'Predictions'},
'Gene Singatures': set(),
'Harmonizome ETL Script': {'Harmonizome', 'ETL Script'},
'KINOMEscan': {'KINOMEscan', 'Kinome'},
'L1000FWD': {'L1000', 'L1000FWD'},
'machine learning': {'Machine Learning'},
'manhattan plot': set(),
'mimickers': {'Pharmacology'},
'miRNA targets': {'microRNAs'},
'Protein kinases': {'Kinome'},
'reversers': {'Pharmacology'},
'RNA-Seq': {'RNA-seq'},
'RNAseq': {'RNA-seq'},
'scatter plot': set(),
'scatterplot': set(),
'small molecules': set(),
'SuperVenn diagrams': set(),
'The Cancer Genome Atlas': {'TCGA'},
'The Human Protein Atlas (RNA-Seq)': {'The Human Protein Atlas', 'RNA-seq'},
'UpSet plots': set(),
'Venn diagrams': set(),
'Visualization': set(),
"Fisher's Exact Test": set(),
}
The empty sets mean I would be removing those tags.
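To make the intended effect concrete, here is a minimal sketch of how a mapping like this could be applied to one item's tags. The helper name `apply_substitutions` and the sample input are hypothetical; the actual cleanup script isn't shown in this thread. Tags that are not keys in the mapping are assumed to be kept unchanged.

```python
def apply_substitutions(tags, substitutions):
    """Return the cleaned tag set for one item.

    A tag that appears as a key is replaced by its mapped tags (an empty
    value drops it); any other tag is kept as-is (an assumption).
    """
    cleaned = set()
    for tag in tags:
        cleaned.update(substitutions.get(tag, {tag}))
    return cleaned


# A small excerpt of the mapping above, for illustration only.
excerpt = {
    'RNAseq': {'RNA-seq'},
    'bar chart': set(),
    'Harmonizome ETL Script': {'Harmonizome', 'ETL Script'},
}

item_tags = ['RNAseq', 'bar chart', 'Enrichr', 'Harmonizome ETL Script']
print(sorted(apply_substitutions(item_tags, excerpt)))
```

Collecting into a set means that two old tags mapping to the same new tag (for example 'RNA-Seq' and 'RNAseq') collapse into a single 'RNA-seq' entry.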
These are the tags that should be listed below the search bar: RNA-seq, scRNA-seq, Enrichr, Machine Learning, TCGA, Harmonizome, Drugmonizome, L1000, Compare Sets, microRNAs, Kinome, Aging
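Since the original problem was that some requested terms don't match existing tags, a quick sanity check after the cleanup might look like the sketch below. The function name `missing_featured` and the sample items are made up for illustration, and each item is assumed to carry a collection of tag strings.

```python
def missing_featured(featured_tags, tags_per_item):
    """Return the featured tags that no item carries, in featured order."""
    in_use = set()
    for tags in tags_per_item:
        in_use.update(tags)
    return [tag for tag in featured_tags if tag not in in_use]


# Hypothetical cleaned tag sets for three items.
items = [
    {'RNA-seq', 'Enrichr'},
    {'TCGA', 'Machine Learning'},
    {'Harmonizome', 'ETL Script'},
]
featured = ['RNA-seq', 'Enrichr', 'TCGA', 'Compare Sets', 'Kinome']

print(missing_featured(featured, items))
```

Any tag this reports would show up below the search bar but return no results, so it either needs a substitution that produces it or should be dropped from the featured list.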