Closed lcolladotor closed 2 years ago
Note that lines 97 to 121 of 20210323_human_hb_neun.R run emptyDrops() across all the samples together. Matt has found that this might not work as well in some cases, which is why at https://github.com/LieberInstitute/10xPilot_snRNAseq-human/blob/master/10x_all-FACS-n10_2021rev_step01_processing-QC_MNT.R#L99 he runs it one sample at a time. Matt has convinced us to do this for the deconvolution snRNA-seq data with Louise. So we could potentially decide to change this here too, which would mean reading the data from processed-data/07_cellranger and starting the R objects from scratch instead of using 20210525_human_hb_processing.rda.
We should add to the QC script the calculation of the doubletScore using scDblFinder. That's from Matt's code at https://github.com/LieberInstitute/10xPilot_snRNAseq-human/blob/master/10x_all-FACS-n10_2021rev_step01_processing-QC_MNT.R#L354-L444. Matt computes this one sample at a time, which is what we should do here. Since we'll have a single large SCE object, we can use a loop and subset it to each sample. Something like:
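## Initialize with NA so every cell has a slot (use the NA type matching the results)
sce$new_variable <- NA_real_
for (i in unique(sce$sample_id)) {
    idx <- sce$sample_id == i
    sce_sub <- sce[, idx]
    ## compute something on sce_sub; some_function() is a placeholder that
    ## returns a vector with one value per cell in sce_sub
    results <- some_function(sce_sub)
    sce$new_variable[idx] <- results
}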
Something like the above would be useful for running emptyDrops() one sample at a time. Louise @lahuuki might need to do this also on the deconvolution snRNA-seq data (I haven't created those issues yet!).
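As a rough sketch of how the loop could be filled in for the two cases discussed here, the helpers below run emptyDrops() and a doublet score one sample at a time. This is not Matt's code: the function names, the output column names (emptydrops_fdr, doubletScore), and the defaults for lower and niters are assumptions for illustration, and scDblFinder::computeDoubletDensity() is used here with its default arguments rather than the settings in the linked script.

```r
library("SingleCellExperiment")
library("DropletUtils")
library("scDblFinder")

## Assumption: `sce` holds the raw (unfiltered) droplets from all samples and
## has a `sample_id` column. `lower` and `niters` are illustrative defaults.
run_emptydrops_by_sample <- function(sce, lower = 100, niters = 10000) {
    sce$emptydrops_fdr <- NA_real_
    for (i in unique(sce$sample_id)) {
        idx <- which(sce$sample_id == i)
        ## emptyDrops() only sees the droplets from the current sample
        e_out <- emptyDrops(counts(sce)[, idx], lower = lower, niters = niters)
        sce$emptydrops_fdr[idx] <- e_out$FDR
    }
    sce
}

## Assumption: `sce` has already been filtered to non-empty droplets.
add_doublet_score_by_sample <- function(sce) {
    sce$doubletScore <- NA_real_
    for (i in unique(sce$sample_id)) {
        idx <- which(sce$sample_id == i)
        ## one doublet density score per cell, computed within the sample
        sce$doubletScore[idx] <- computeDoubletDensity(sce[, idx])
    }
    sce
}
```

Both steps involve random sampling, so calling set.seed() before each one keeps the results reproducible. Droplets with a total count at or below lower get an NA FDR from emptyDrops(), so they stay NA in the new column.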
Make sure this initial SCE object includes the sample ids, sex, region (well, it's all Habenula), diagnosis, and age info. Kind of like https://github.com/LieberInstitute/10xPilot_snRNAseq-human/blob/master/10x_all-FACS-n10_2021rev_step01_processing-QC_MNT.R#L463-L472 or like https://github.com/LieberInstitute/Visium_IF_AD/blob/master/code/04_build_spe/build_basic_spe.R#L32-L45, which gets added at https://github.com/LieberInstitute/Visium_IF_AD/blob/master/code/04_build_spe/build_basic_spe.R#L129-L140.
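One way to attach the sample-level information might look like the sketch below. It assumes a hypothetical sample_info data.frame with one row per sample and a sample_id column matching sce$sample_id; the column names (sex, diagnosis, age) and the helper name are placeholders, not the ones used in the linked scripts.

```r
library("SingleCellExperiment")

## Assumption: `sample_info` is a data.frame with one row per sample and a
## `sample_id` column; all other columns are copied onto every cell.
add_sample_info <- function(sce, sample_info) {
    m <- match(sce$sample_id, sample_info$sample_id)
    ## fail early if a sample in the SCE is missing from the table
    stopifnot(!anyNA(m))
    for (col in setdiff(colnames(sample_info), "sample_id")) {
        colData(sce)[[col]] <- sample_info[[col]][m]
    }
    ## all samples in this project are from the same region
    sce$region <- "Habenula"
    sce
}
```

Using match() instead of merge() keeps the cells in their original order, which matters because colData rows have to line up with the columns of the count matrix.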
Based on our late 2021 meetings with Erik, we decided to re-process the snRNA-seq data. See https://jhu-genomics.slack.com/archives/G019V8X9CVC/p1639688649081100 for the messages from back then on Slack.
Add new QC code to GitHub
code/09_snRNA-seq_re-processed
code/09_snRNA-seq_re-processed/01_qc.R. Save the PDF in the corresponding plots/09_snRNA-seq_re-processed directory.
This uses 20210525_human_hb_processing.rda, which was created with lines 35 to 125 from 20210323_human_hb_neun.R.
At the end of this, we have in our SCE object the columns:
discard_auto: these are the cells we'll discard moving forward (the ones with TRUE).
discard_auto.
Continue processing
This involves adapting code Erik wrote at 20210323_human_hb_neun.R.
20210525_human_hb_processing.rda).
spatialLIBD since that will give us the same gene info that we have in our Visium SPE objects.
sce_post_qc.Rdata or something like that. Similar to his file at line 225.
You could run this interactively, or use sgejobs::job_single() to create a companion shell script to run the QC code.
This should mark the end of the QC steps.
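For the shell script option, a call along these lines might work when run from inside code/09_snRNA-seq_re-processed. The job name, memory request, and command are assumptions to adjust; only job_single() itself comes from the note above.

```r
library("sgejobs")

## Assumptions: the job name "01_qc" and the 20G memory request are
## placeholders; the command assumes the working directory contains 01_qc.R.
job_single(
    name = "01_qc",
    create_shell = TRUE,
    memory = "20G",
    command = "Rscript 01_qc.R"
)
```

With create_shell = TRUE this should write a 01_qc.sh file next to the R script, which can then be submitted to the cluster.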