After #3, we'll continue with tSNE / UMAP and clustering. So this will involve creating code/09_snRNA-seq_re-processed/03_clustering.R.
Briefly, Erik computed tSNE / UMAP and used the PCs he got from the Poisson Pearson residuals, then ran 4 graph-based clustering options with k nearest neighbors set to 5, 10, 20 and 50. He then plotted a few marker genes and chose k = 20 to continue.
Lines 286 to 337 are the ones for plotting the marker genes.
Later, Erik defines other sets of genes in lines 379 to 383 and visualizes them in lines 385 to 403.
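
A minimal sketch of graph-based clustering at several k values, in case it helps when writing 03_clustering.R. It assumes a SingleCellExperiment whose PCs are stored in a reducedDim called "PCA", and it assumes walktrap community detection on a shared nearest-neighbor graph; Erik's exact clustering call may differ. The helper name cluster_by_k and the reducedDim name are hypothetical.

```r
library(SingleCellExperiment)
library(scran)
library(igraph)

## Hypothetical helper: one clustering per k, returned as a named list of factors.
## `dimred` is an assumed reducedDim name; change it to match the real object.
cluster_by_k <- function(sce, k_values = c(5, 10, 20, 50), dimred = "PCA") {
    clusters <- lapply(k_values, function(k) {
        ## Shared nearest-neighbor graph built on the stored PCs
        snn_graph <- buildSNNGraph(sce, k = k, use.dimred = dimred)
        ## Walktrap community detection (an assumption, see above)
        factor(cluster_walktrap(snn_graph)$membership)
    })
    names(clusters) <- paste0("k_", k_values)
    clusters
}
```

Calling cluster_by_k(sce, k_values = 20) would give only the k = 20 clustering.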
I think that we should use the strategy Matt and Louise used and go with k = 20 from the beginning. Let's see what the marker gene plots look like for the set of:
- Erik's markers for Habenula
- Erik's sets of nicotine, opioid, alcohol and cocaine genes
- We could also use the DLPFC marker genes Matt & Louise used, just out of curiosity
We could also compare the resulting prelimCluster and collapsedCluster labels with the clusters Erik had created. This can be done with addmargins(table()) for example or with a heatmap.
We'll examine these plots with everyone involved.
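
The comparison against Erik's clusters could look roughly like the sketch below. The labels are toy vectors made up for illustration, not real results; in the actual script they would come from the clustered object and from Erik's object, matched by nucleus.

```r
## Toy per-nucleus labels standing in for the real cluster assignments
erik_cluster <- factor(c("A", "A", "A", "B", "B", "B", "C", "C", "C", "C"))
collapsedCluster <- factor(c(1, 1, 2, 2, 2, 2, 3, 3, 3, 1))

## Cross-tabulate the two labelings and add row / column totals
cross_tab <- table(erik = erik_cluster, collapsedCluster = collapsedCluster)
with_margins <- addmargins(cross_tab)
print(with_margins)

## Same counts as a heatmap; no reordering or scaling so it matches the table
heatmap(unclass(cross_tab),
    Rowv = NA, Colv = NA, scale = "none",
    xlab = "collapsedCluster", ylab = "Erik's cluster"
)
```

The same table() call works for prelimCluster by swapping in that column.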
Our resulting object from this script should have the prelimCluster and collapsedCluster labels. We'll then make a new object that adds the cellType and cellType.broad columns, based on what we decide with everyone on how to label each collapsedCluster (or similar spelling: use the column names Matt & Louise have in the final published objects, not the intermediate column names; Louise can tell you which ones they are).
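
Once the labels are agreed on, adding the two columns could be as simple as a lookup table keyed by collapsedCluster. This is only a sketch: the data.frame below stands in for colData() of the real object, and the cluster-to-label mapping uses placeholder names, not actual annotations.

```r
## Stand-in for the per-nucleus colData of the clustered object
col_data <- data.frame(collapsedCluster = factor(c(1, 2, 2, 3, 1, 3)))

## Hypothetical annotation table; the labels are placeholders only
annotation <- data.frame(
    collapsedCluster = c("1", "2", "3"),
    cellType = c("Neuron_A", "Neuron_B", "Glia_A"),
    cellType.broad = c("Neuron", "Neuron", "Glia"),
    stringsAsFactors = FALSE
)

## Look up each nucleus' cluster in the annotation table
m <- match(as.character(col_data$collapsedCluster), annotation$collapsedCluster)
stopifnot(!anyNA(m)) # every cluster must have a label

col_data$cellType <- factor(annotation$cellType[m])
col_data$cellType.broad <- factor(annotation$cellType.broad[m])
```

Keeping the mapping in one small table makes it easy to revise after the group discussion.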
Matt and Louise:
- They use k = 20 for building their shared-nearest neighbor graph https://github.com/LieberInstitute/10xPilot_snRNAseq-human/blob/master/10x_DLPFC-n3_step02_clust-annot_LAH.R#L139-L149 to generate their prelimClusters. They then check them https://github.com/LieberInstitute/10xPilot_snRNAseq-human/blob/master/10x_DLPFC-n3_step02_clust-annot_LAH.R#L151-L179.
- They then collapse the prelimClusters into collapsedClusters https://github.com/LieberInstitute/10xPilot_snRNAseq-human/blob/master/10x_DLPFC-n3_step02_clust-annot_LAH.R#L186-L284.
- With the collapsedClusters, they visualize a few marker genes in order to label them https://github.com/LieberInstitute/10xPilot_snRNAseq-human/blob/master/10x_DLPFC-n3_step02_clust-annot_LAH.R#L288-L300.