I'd like to open an issue to map out ideas about the general workflow for the analysis. Please edit or comment if you have suggestions. A general goal is to move away from Seurat, if possible, and replace the workflow used in barnyard_data with our own approach. To do this, we will need to implement the following:
1) Load the cell-associated 10x mRNA data and the cell-associated and background haircut data into a single R object.
[x] For now, use the `MultiAssayExperiment` class from Bioconductor (see `create_haircut`).
2) Generate QC plots to examine haircut signals across each hairpin from cell-associated and background barcodes.
[x] add QC plots (reuse plotting code from @mandylr)
3) Implement a filtering function to exclude signal from background droplets (see #2)
[ ] add filtering function
4) Implement functionality to generate 2D cell projections via UMAP (see uwot for an Rcpp implementation) or t-SNE, and clustering via simple k-means for now.
[x] Add normalization method for RNA (simple colSums approach should be fine)
[x] Add normalization method for haircut signal (not sure what's best; perhaps the centered log ratio method used in CITE-seq?)
[x] Add a scaling and PCA function
[x] Add a kmeans wrapper
[x] Add a UMAP/t-SNE wrapper
5) Plotting function to visualize cells in a UMAP/PCA/t-SNE plot. Needs to color cells by mRNA, haircut, and other categorical data (e.g. sample name).
[x] Add plotting function (can likely modify `plot_feature` from the lung scRNA project)
6) Add a statistical test for differences in haircut/mRNA signal between clusters. `wilcox.test` works pretty well.
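
To make the two normalization items in step 4 concrete, here is a minimal base-R sketch. The function names `normalize_rna` and `normalize_clr`, the scale factor of 1e4, and the pseudocount of 1 are assumptions for illustration, not functions or settings from this repository. The CLR here is centered per cell; centering per hairpin across cells is an equally plausible choice and is left open.

```r
# Hypothetical helpers for illustration; not part of this repository.

# Library-size normalization for RNA counts (genes x cells):
# divide each cell by its total count, rescale, then log-transform.
normalize_rna <- function(counts, scale_factor = 1e4) {
  lib_size <- colSums(counts)
  log1p(sweep(counts, 2, lib_size, "/") * scale_factor)
}

# Centered log ratio (CLR) for haircut counts (hairpins x cells):
# log of each count minus the mean log count within the same cell.
# The pseudocount (an assumption) avoids log(0).
normalize_clr <- function(counts, pseudocount = 1) {
  log_counts <- log(counts + pseudocount)
  sweep(log_counts, 2, colMeans(log_counts), "-")
}

# Simulated counts standing in for real data.
set.seed(1)
rna <- matrix(rpois(200, lambda = 5), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("cell", 1:10)))
haircut <- matrix(rpois(50, lambda = 20), nrow = 5,
                  dimnames = list(paste0("hairpin", 1:5), paste0("cell", 1:10)))

rna_norm <- normalize_rna(rna)
haircut_clr <- normalize_clr(haircut)
```

After CLR, each cell's values sum to zero, so haircut signals are expressed relative to the other hairpins in the same cell rather than as absolute counts.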
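
The scaling/PCA, k-means, and testing steps (4 and 6) could be chained roughly as follows. This is a sketch on simulated data using only base R (`prcomp`, `kmeans`, `wilcox.test`, `p.adjust`); the matrix layout (features x cells), the number of PCs, and the two-cluster setup are assumptions. UMAP or t-SNE would need an external package and is omitted here.

```r
set.seed(42)

# Simulated normalized signal (features x cells) with two groups of cells.
n_cells <- 60
group <- rep(c("a", "b"), each = n_cells / 2)
signal <- matrix(rnorm(10 * n_cells), nrow = 10,
                 dimnames = list(paste0("feature", 1:10),
                                 paste0("cell", seq_len(n_cells))))
# Shift the first five features upward in the second group.
signal[1:5, group == "b"] <- signal[1:5, group == "b"] + 4

# Center and scale each feature, then run PCA with cells as rows.
pca <- prcomp(t(signal), center = TRUE, scale. = TRUE)

# k-means on the leading principal components (5 PCs is an arbitrary choice).
km <- kmeans(pca$x[, 1:5], centers = 2, nstart = 10)
clusters <- km$cluster

# Wilcoxon rank-sum test per feature between the two clusters,
# followed by Benjamini-Hochberg correction across features.
pvals <- apply(signal, 1, function(x) {
  wilcox.test(x[clusters == 1], x[clusters == 2])$p.value
})
padj <- p.adjust(pvals, method = "BH")
```

With more than two clusters, one option would be to test each cluster against all remaining cells; that choice is not settled in this issue.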