yerkes-gencore / scRNAseq_template

Template for ENPRC Genomics Core single cell RNAseq projects

Gene omission on capture or study level #12

Open derrik-gratz opened 4 months ago

derrik-gratz commented 4 months ago

The workflow currently omits genes found in fewer than 50 cells, but this is done at the per-capture level. As a result, some genes are omitted from some captures and present in others, which may affect DGE across conditions. Granted, these genes probably don't have much data to work with, but it would still be better to avoid this just in case.
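To make the difference concrete, here is a minimal sketch in plain Python (not the template's actual code). The data layout (`capture` as a dict of cell barcode to per-gene counts) and the function names are hypothetical, and "study level" is interpreted here as pooling detection counts across all captures, which is only one possible reading.

```python
from collections import Counter

MIN_CELLS = 50  # threshold mentioned in this issue


def cells_per_gene(capture):
    """Count, per gene, the cells with a nonzero count.

    `capture` maps cell barcode -> {gene: count} (hypothetical layout).
    """
    detected = Counter()
    for gene_counts in capture.values():
        for gene, count in gene_counts.items():
            if count > 0:
                detected[gene] += 1
    return detected


def filter_per_capture(captures, min_cells=MIN_CELLS):
    """Per-capture filtering: each capture ends up with its own gene set."""
    return {
        name: {g for g, n in cells_per_gene(cap).items() if n >= min_cells}
        for name, cap in captures.items()
    }


def filter_study_level(captures, min_cells=MIN_CELLS):
    """Study-level filtering (assumed): pool detections, keep one shared gene set."""
    pooled = Counter()
    for cap in captures.values():
        pooled.update(cells_per_gene(cap))
    return {g for g, n in pooled.items() if n >= min_cells}
```

With per-capture filtering, a gene detected in 60 cells of one capture and 10 cells of another is kept in the first and dropped from the second; with the pooled version it is either kept everywhere or dropped everywhere.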

micahpf commented 4 months ago

I don't think this is a problem for our pseudobulk workflow though, since we're running edgeR::filterByExpr anyway, which flags genes with low or no expression across samples for removal before model fitting.
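For intuition, a rough Python sketch of the kind of rule filterByExpr applies is below. It is loosely modeled on the documented defaults (`min.count = 10`, `min.total.count = 15`) and is not edgeR's code: it omits the adjustment edgeR makes for large groups and its numerical tolerance, and the function name and input layout are hypothetical.

```python
from statistics import median


def filter_by_expr_sketch(counts, group, min_count=10, min_total_count=15):
    """Return {gene: keep?} for a pseudobulk count table.

    counts: {gene: [count per sample]} (hypothetical layout)
    group:  list with one group label per sample
    """
    n_samples = len(group)
    # Library size = total counts per sample
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    # A gene must pass in at least as many samples as the smallest group has
    min_group_size = min(group.count(label) for label in set(group))
    # CPM cutoff corresponding to min_count at the median library size
    cpm_cutoff = min_count / median(lib_sizes) * 1e6

    keep = {}
    for gene, row in counts.items():
        cpm = [c / lib * 1e6 if lib else 0.0 for c, lib in zip(row, lib_sizes)]
        n_pass = sum(value >= cpm_cutoff for value in cpm)
        keep[gene] = n_pass >= min_group_size and sum(row) >= min_total_count
    return keep
```

The point for this issue is that the decision is made once per gene across all samples, so a gene that is zero in some samples because it was dropped from those captures is either kept or removed for the whole model.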

derrik-gratz commented 4 months ago

I agree that it shouldn't affect pseudobulk, but I'm not sure that is our de facto workflow yet. I could see a niche scenario where it informs clustering in some way.

I just implemented it in the package (not pushed yet). Do you think there'd be a reason NOT to do it, e.g. information loss?

micahpf commented 4 months ago

I'm not sure. Maybe you could just open a draft pull request while we do some more digging?