HelenaLC / muscat

Multi-sample multi-group scRNA-seq analysis tools

unexpected results: mixed models for B cells #56

Open victorkleb opened 3 years ago

victorkleb commented 3 years ago

Following the vignette

http://www.bioconductor.org/packages/release/bioc/vignettes/muscat/inst/doc/analysis.html

gave consistent results through step 2.4 (data preparation) yielding a set with 7,118 rows and 26,820 columns.

The pseudobulk analysis with edgeR produced results for B cells consistent with section 3.3.

Mixed model methods 2 and 3 (vst and nbinom) gave unexpected results.
Results were exported to CSV files. The vst file has 6163 rows; the nbinom file has 6181 rows.

plger commented 3 years ago

Hi, could you share the data for a minimal reproducible example? And can you tell us which exact muscat version you're using?

victorkleb commented 3 years ago

I followed the vignette in this posting through section 2.3 “Preprocessing” to obtain a SCE with 7,118 rows and 26,820 columns: http://www.bioconductor.org/packages/release/bioc/vignettes/muscat/inst/doc/analysis.html

Version: 1.2.1

victorkleb commented 3 years ago

Sorry, I did not mean to close this issue.

plger commented 3 years ago

So the strange values using vst come from sctransform: it seems like there is a cap on the maximum values. I guess, however, that this shouldn't affect the results very much (none of these genes is likely to be anywhere close to significance). We're investigating the nbinom issue.

victorkleb commented 3 years ago

Thank you.

With vst, a user would be forced to rely on p-values (assuming that they are valid) rather than the statistics from which they are presumably derived. Also, as you can see, with the strange values the statistics are not (negatively) correlated with the p-values.

When comparing methods, the statistics may be more helpful than p-values.
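One way to check the relationship described above is a rank correlation between the absolute test statistic and the p-value for each gene. If the p-values really are derived from the statistics, the correlation should be strongly negative. The following is only a minimal sketch in plain Python: the helper names `ranks` and `spearman` are hypothetical, and the numbers are made-up toy values, not taken from the exported vst or nbinom files.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy values only (assumed for illustration, not real muscat output).
toy_statistics = [-4.1, 0.3, 2.5, -1.2, 3.3]
toy_p_values = [0.0001, 0.76, 0.02, 0.25, 0.003]

rho = spearman([abs(s) for s in toy_statistics], toy_p_values)
print(f"Spearman rho between |statistic| and p-value: {rho:.2f}")
```

In this toy case larger absolute statistics always go with smaller p-values, so the correlation is strongly negative. Applied to the two columns of an exported results file, a value near zero would point to the kind of mismatch reported here.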