AlexsLemonade / training-modules

A collection of modules that are combined into 1-5 day workshops on computational topics for the childhood cancer research community.

Add cell clustering to scRNA-seq #455

Closed jashapiro closed 3 years ago

jashapiro commented 3 years ago

We do not currently do anything directly related to identifying clusters (cell types) in single-cell data. While actually assigning cell types is a challenging and potentially experiment-specific question, some basic clustering and associated marker gene identification (we already do this with "known" cell types) would likely be useful to participants.

We can add some basic clustering (with caveats!) using as inspiration the methods presented in https://bioconductor.org/books/release/OSCA/clustering.html

There are many methods! I am currently agnostic as to which we choose, but if anyone has strong opinions, please add them here!

jashapiro commented 3 years ago

After playing around some, I think my plan is mostly the following:

I am going to reduce the amount of time we spend on dimensionality reduction, eliminating the UMAP experimentation to get through all of that a bit faster. We will move to storing the results of dimensionality reduction in the SCE object, so we can plot with them more easily later (the plotReducedDim() function is handy and has pretty good defaults, despite oddly requiring us to use the spelling "colour").

The discussion of experimentation and of why parameters matter will move to a clustering notebook, where I plan to briefly discuss k-means clustering and graph-based clustering: the former because it is relatively easy to understand, I hope, and the latter because it seems to be very commonly used. The differences between parameter choices can be pretty dramatic, for better and worse.
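To make the point about parameters concrete, here is a minimal sketch in plain Python. It is not the bluster API and not the notebook code planned in this issue. It runs a small hand-written k-means on simulated 2-D points and reports the cluster sizes for several values of k. The function names (kmeans, sq_dist), the simulated data, and the chosen values of k are all assumptions made for illustration.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Basic Lloyd-style k-means; returns (labels, centers).

    Illustrative only: random initialisation from the data, no restarts,
    so the result can be a local optimum.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct data points as starting centers
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return labels, centers


# Simulated data (hypothetical): three groups of 30 points each.
data_rng = random.Random(42)
true_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
points = [
    (data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0))
    for cx, cy in true_centers
    for _ in range(30)
]

# The same data, clustered with different choices of k.
labels_by_k = {k: kmeans(points, k)[0] for k in (2, 3, 6)}
for k, labels in labels_by_k.items():
    print(k, sorted(Counter(labels).values(), reverse=True))
```

The same points are split into 2, 3, or 6 groups depending only on k, and nothing in the output says which choice is "right". That is the kind of caveat a clustering notebook would need to make explicit.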

I will implement all of this with the bluster package, as it seems quite flexible and easy to use.