Closed jashapiro closed 3 years ago
After playing around some, I think my plan is mostly the following:
I am going to reduce the amount of time we spend on dimensionality reduction, eliminating the UMAP experimentation to get through all of that a bit faster. We will move to storing the results of dimensionality reduction in the SCE object, so we can plot with them more easily later (the plotReducedDim() function is handy and has pretty good defaults, despite oddly requiring us to use colour).
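For reference, a minimal sketch of what storing the results in the SCE object and plotting from it could look like. This is not the notebook code: it assumes the scater and scuttle packages, uses simulated data from mockSCE() in place of our real data, and the Treatment column is just one of the mock metadata columns.

```r
library(SingleCellExperiment)
library(scuttle)  # mockSCE(), logNormCounts()
library(scater)   # runPCA(), plotReducedDim()

set.seed(2021)

# Simulated stand-in for a real dataset (assumption: the notebook would
# start from an SCE that has already been filtered).
sce <- mockSCE(ncells = 200, ngenes = 1000)
sce <- logNormCounts(sce)

# runPCA() returns the SCE with the PCA matrix stored in reducedDims,
# rather than handing back a separate matrix.
sce <- runPCA(sce, ncomponents = 10)
reducedDimNames(sce)

# Plot directly from the stored results; note the spelling of colour_by.
pca_plot <- plotReducedDim(sce, dimred = "PCA", colour_by = "Treatment")
```

Because the PCA lives in the object, any later notebook can call reducedDim(sce, "PCA") or plotReducedDim() without recomputing anything.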
The discussion of experimentation and why parameters matter will move to a clustering notebook, where I plan to briefly cover k-means clustering and graph-based clustering: the former because it is relatively easy to understand (I hope), and the latter because it seems to be very commonly used. The differences between parameter choices can be pretty dramatic, for better and worse.
I will implement all of this with the bluster package, as it seems quite flexible and easy to use.
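Roughly, the bluster interface looks like the sketch below. The matrix here is simulated and stands in for something like reducedDim(sce, "PCA"); the numbers of centers and neighbors are arbitrary illustration values, not recommendations.

```r
library(bluster)

set.seed(2021)

# Simulated stand-in for a cells-by-PCs matrix (assumption: in the
# notebook this would come from reducedDim(sce, "PCA")).
pcs <- matrix(rnorm(300 * 10), nrow = 300, ncol = 10)
pcs[1:100, 1] <- pcs[1:100, 1] + 6      # shift one group along PC1
pcs[101:200, 2] <- pcs[101:200, 2] + 6  # shift another along PC2

# k-means: the number of clusters is chosen up front.
kmeans_clusters <- clusterRows(pcs, KmeansParam(centers = 3))

# Graph-based: build a nearest-neighbor graph, then detect communities.
# Changing k can change the number and size of clusters.
graph_k10 <- clusterRows(pcs, NNGraphParam(k = 10))
graph_k50 <- clusterRows(pcs, NNGraphParam(k = 50))

table(kmeans_clusters)
table(k10 = graph_k10, k50 = graph_k50)
```

The same clusterRows() call handles both methods, with only the parameter object changing, which is what makes it convenient for showing how parameter choices affect the result. The returned factor could then be stored in the SCE's colData and used for colouring plots.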
We do not currently do anything directly related to identifying clusters (cell types) in single-cell data. While actually assigning cell types is a challenging and potentially experiment-specific question, some basic clustering and associated marker gene identification (we already do this with "known" cell types) would likely be useful to participants.
We can add some basic clustering (with caveats!) using as inspiration the methods presented in https://bioconductor.org/books/release/OSCA/clustering.html
There are many methods! I am currently agnostic as to which we choose, but if anyone has strong opinions, please add them here!