Open wbwakeman opened 3 years ago
Where this fits in the big picture (TimD meeting 1/29/2021)
Massive matrix Service (IDK), which holds every cell we know about, partitioned into datasets
is an input to:
Transcriptomics clustering (this project)
is an input to:
Taxonomy Cluster service (created recently by the Platform team), which organizes clusters into taxonomies
The transcriptomics clustering functionality is currently bundled into a single R package called scrattch.hicat. To improve performance, reliability, maintainability, and extensibility, we will translate the critical parts of this code into a series of Python modules that run as a pipeline. The MVP will be an implementation of the functionality contained within cluster.R, especially iter_clust.
The pipeline steps are:
Regression/normalization
Select High Variance Genes
Dimension Reduction
Filter known modes
Clustering
Merging
Hierarchical Sorting (UPGMA)
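As a rough illustration of how the steps above could be chained as separate Python modules, here is a minimal skeleton. It is only a sketch: the names `PipelineState`, `make_stage`, `run_pipeline`, and the stage identifiers are hypothetical and do not come from scrattch.hicat. Splitting the list into seven stages is one reading of the step list, and each stage is a placeholder that records that it ran rather than doing any real computation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class PipelineState:
    """Hypothetical container handed from one stage to the next."""
    data: Dict[str, object] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


Stage = Callable[[PipelineState], PipelineState]

# Stage names mirror the step list; the identifiers themselves are assumptions.
STAGE_NAMES = [
    "regression_normalization",
    "select_high_variance_genes",
    "dimension_reduction",
    "filter_known_modes",
    "clustering",
    "merging",
    "hierarchical_sorting_upgma",
]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records its own name."""
    def stage(state: PipelineState) -> PipelineState:
        state.history.append(name)  # a real stage would transform state.data
        return state
    stage.__name__ = name
    return stage


def run_pipeline(stages: List[Stage],
                 state: Optional[PipelineState] = None) -> PipelineState:
    """Run the stages in order, passing the state through each one."""
    if state is None:
        state = PipelineState()
    for stage in stages:
        state = stage(state)
    return state


PIPELINE = [make_stage(name) for name in STAGE_NAMES]

if __name__ == "__main__":
    result = run_pipeline(PIPELINE)
    print(" -> ".join(result.history))
```

Keeping each stage behind the same call signature would let individual steps be replaced or tested in isolation, which is one way to get the maintainability and extensibility goals mentioned above.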
Two notes: