epigen / unsupervised_analysis

A general purpose Snakemake workflow to perform unsupervised analyses (dimensionality reduction & cluster analysis) and visualizations of high-dimensional data.
MIT License

clustification: Benchmark clf-based clustering approach #12

Open sreichl opened 11 months ago

sreichl commented 11 months ago

Look for clustering benchmark datasets (from various domains) to test the approach, and put the results into the documentation. → Clustering benchmark papers

sreichl commented 10 months ago

Check whether the scRNA-seq data from the SCCAF (Teichmann) paper works: https://www.nature.com/articles/s41592-020-0825-9
Specifically, their benchmarking data: https://github.com/SCCAF/sccaf_example

sreichl commented 10 months ago

cellxgene: https://cellxgene.cziscience.com/
HCA Data Portal: https://data.humancellatlas.org/

sreichl commented 9 months ago

Start "easy" with a small and a large PBMC data set, i.e., data with a very clearly defined "ground truth".
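One way to score a clustering result against such a "ground truth" is the Adjusted Rand Index (ARI). The following is only a minimal sketch, not part of the workflow: the function name `adjusted_rand_index` and the toy cell-type labels are hypothetical, and the metric is a suggestion rather than something this issue has settled on. It uses only the Python standard library.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same samples.

    Hypothetical helper for illustration; label values can be any hashable
    objects, and the score is invariant to how clusters are named.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: trivially identical

    # Pairs of samples that share a cluster in both, true, and predicted labelings.
    contingency = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in contingency.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # both partitions are trivial (e.g. a single cluster each)
    return (sum_cells - expected) / (max_index - expected)


# Toy example with made-up labels (assumption: annotated cell types as ground truth).
truth = ["T", "T", "B", "B", "NK", "NK"]
clusters = [1, 1, 0, 0, 2, 2]  # same partition, different cluster names
score = adjusted_rand_index(truth, clusters)
```

In this toy case the two partitions are identical up to renaming, so `score` is 1.0. Values near 0 indicate agreement no better than chance.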

sreichl commented 5 months ago

Compare to their scRNA-seq-specific clustering approach (quite similar, i.e., iterative RFs): https://www.biorxiv.org/content/10.1101/2024.01.18.576317v1.full