Calculate counts-per-million from depth-corrected SCTransformed `v2` counts.

fjrossello commented 1 year ago

Hi Team,

I am trying to use a mixed-effects model (treatment as a fixed effect and donor as a random effect) with MAST to identify differentially expressed genes between 3 conditions. According to one of MAST's vignettes, "MAST performs best with log-transformed, scale-normalized data that has been thresholded, such as log2(transcripts per million+1)" (See here for details). My question is whether it is sound to use log2 CPMs calculated from counts (the counts slot of an SCT assay in a Seurat object) that have been depth-corrected using SCTransform v2 (recorrected using PrepSCTFindMarkers).

Thanks in advance.

Fernando

saketkc commented 1 year ago

I would recommend not scaling the corrected counts to counts per million: in my internal tests this seems to perform poorly (a higher number of false positives, at least with the Wilcoxon test). Running MAST on the data slot (log(corrected counts), where the corrected counts use the minimum of the median sequencing depths across the datasets) seems reasonable.
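
To make the two options concrete, here is a minimal Python sketch (not Seurat or MAST code) that contrasts them on a toy matrix. It assumes the data slot holds the natural log of corrected counts plus one, and that CPM means rescaling each cell to a total of one million. The matrix values and the function names `log1p_counts` and `log2_cpm` are hypothetical and exist only for illustration.

```python
import math

# Hypothetical toy matrix of depth-corrected counts:
# rows are genes, columns are cells.
corrected = [
    [4, 0, 2],
    [1, 3, 0],
    [5, 7, 8],
]

def log1p_counts(counts):
    """Log-transform corrected counts directly (assumed data-slot style)."""
    return [[math.log1p(x) for x in row] for row in counts]

def log2_cpm(counts):
    """Rescale every cell (column) to one million counts, then take log2(x + 1)."""
    n_cells = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_cells)]
    return [
        [
            # Guard against empty cells to avoid dividing by zero.
            math.log2(row[j] / totals[j] * 1e6 + 1) if totals[j] > 0 else 0.0
            for j in range(n_cells)
        ]
        for row in counts
    ]

direct = log1p_counts(corrected)
rescaled = log2_cpm(corrected)
```

In `direct`, a value depends only on the corrected count itself, so the common depth set by the correction is preserved. In `rescaled`, every cell is normalized a second time by its own total, which is the extra scaling step advised against above.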

fjrossello commented 1 year ago

Thanks for your prompt reply and advice. Cheers, Fernando

satijalab / sctransform, issue #142