Open victorkleb opened 3 years ago
Hi, could you share the data for a minimal reproducible example? And can you tell us which exact muscat version you're using?
I followed the vignette in this posting through section 2.3 “Preprocessing” to obtain a SCE with 7,118 rows and 26,820 columns: http://www.bioconductor.org/packages/release/bioc/vignettes/muscat/inst/doc/analysis.html
Version: 1.2.1
Sorry, I did not mean to close this issue.
So the strange values using vst come from sctransform: it seems like there is a cap on the maximum values. I guess, however, that this shouldn't affect the results very much (none of these genes is likely to be anywhere close to significance).
We're investigating the nbinom issue.
Thank you.
With vst a user would be forced to rely on p-values (assuming that they are valid), not the statistics that they are presumably derived from. Also, as you see, with the strange values, the statistics are not (negatively) correlated with p-values.
When comparing methods, the statistics may be more helpful than p-values.
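One way to make the "statistics should be negatively correlated with p-values" check concrete is a rank correlation between the absolute statistic and the p-value. The following is only a rough sketch in plain Python: the numbers are hypothetical and are not taken from the muscat output, and the helper names `average_ranks` and `spearman` are made up for illustration. In a well-behaved result the correlation should be close to -1; if some statistics are capped at a large value while their p-values are large, it can turn positive.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical values for five genes (NOT real muscat results).
p_val = [1e-6, 2e-3, 0.05, 0.48, 0.92]
abs_stat_ok = [5.1, 3.2, 2.0, 0.7, 0.1]        # larger statistic, smaller p-value
abs_stat_capped = [5.1, 3.2, 2.0, 10.0, 10.0]  # last two stuck at an assumed cap

rho_ok = spearman(abs_stat_ok, p_val)
rho_capped = spearman(abs_stat_capped, p_val)
print(rho_ok, rho_capped)
```

The sketch does not guard against constant input (which would divide by zero), so it is meant for a quick sanity check on exported results rather than as a general utility.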
Following the vignette (http://www.bioconductor.org/packages/release/bioc/vignettes/muscat/inst/doc/analysis.html) gave consistent results through step 2.4 (data preparation), yielding a set with 7,118 rows and 26,820 columns.
The pseudobulk analysis with edgeR produced results for B cells consistent with section 3.3.
Mixed model methods 2 and 3 (vst and nbinom) gave unexpected results.
Results were exported to CSV files. The vst file has 6,163 rows; the nbinom file has 6,181 rows.
Exploratory analysis of alternative methods for DE had discovered several gene count distributions that adversely affected results (specifically, counts concentrated in very few cells; for example, HBB in B cells), so counts were checked for the first two genes. Both are very sparse, with total B cell counts equal to 51 and 22 for BTN2A1 and IRAK1_ENSG00000184216, respectively.
Hence, the counts themselves do not appear to be responsible for the nbinom results.
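For anyone who wants to repeat the count check, here is a minimal sketch of what "sparse" versus "concentrated on very few cells" could look like numerically. It is plain Python with made-up per-cell counts; the function name `count_summary`, the `top_k` parameter, and both example vectors are assumptions for illustration and do not come from the data set above (only the total of 51 mirrors the figure quoted for BTN2A1).

```python
def count_summary(counts, top_k=5):
    """Summarise one gene's per-cell counts within a cluster.

    Returns the total count, the number of cells with a nonzero count,
    and the share of the total held by the top_k highest-count cells.
    """
    total = sum(counts)
    nonzero_cells = sum(1 for c in counts if c > 0)
    top = sum(sorted(counts, reverse=True)[:top_k])
    return {
        "total": total,
        "nonzero_cells": nonzero_cells,
        "top_k_share": top / total if total else 0.0,
    }


# Hypothetical per-cell counts for 1,000 cells (NOT the real data).
sparse_gene = [1] * 51 + [0] * 949                 # low total, spread thinly
concentrated_gene = [400, 350, 250] + [0] * 997    # nearly all counts in 3 cells

sparse = count_summary(sparse_gene)
concentrated = count_summary(concentrated_gene)
print(sparse)
print(concentrated)
```

A gene like the second vector, where a handful of cells hold almost all counts, is the pattern described for HBB; a gene like the first is merely sparse, which is the situation reported here for the two genes that were checked.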