Closed JohnReid closed 10 years ago
Hi John, could you be a bit more explicit about what you are doing? We can discuss off-list if you'd like.
zlm has some problems with single-cell RNA-seq data, due to sporadic outliers in the continuous component even when there is very little evidence for expression. This can lead to artificially small p-values and false positives. We are working on several approaches to improve the model for single-cell RNA-seq data. One thing that's implemented now that you could try is empirical Bayes shrinkage of the variance, by setting ebayes=TRUE in the zlm call.
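To make the idea of variance shrinkage concrete, here is a minimal Python sketch, not the package's code. It assumes the shrinkage takes the common moderated-variance form, where each gene's sample variance is averaged with a prior variance, weighted by degrees of freedom. Whether ebayes=TRUE does exactly this is an assumption, and the function name, the prior values and the example numbers are all hypothetical.

```python
def shrink_variances(sample_vars, df, prior_var, prior_df):
    """Pull each per-gene variance towards a shared prior variance.

    Assumed form: a degrees-of-freedom-weighted average of the prior
    variance and the gene's own sample variance. The prior is passed in
    here; in practice it would be estimated from all genes.
    """
    if df <= 0 or prior_df < 0:
        raise ValueError("df must be positive and prior_df non-negative")
    return [
        (prior_df * prior_var + df * s2) / (prior_df + df)
        for s2 in sample_vars
    ]


# Hypothetical per-gene variances: one very small, one typical, one outlier.
raw = [0.1, 1.0, 9.0]
shrunk = shrink_variances(raw, df=4, prior_var=1.0, prior_df=4)
for before, after in zip(raw, shrunk):
    print(f"{before:5.2f} -> {after:5.2f}")
```

Extreme variances move towards the prior, so a gene with an implausibly small variance no longer produces an extreme test statistic on its own.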
Can you tell me a bit more about the sense in which the p-values you get are not sensible, and maybe provide an example?
Cheers, Greg
On Thu, Aug 28, 2014 at 8:57 AM, JohnReid notifications@github.com wrote:
I've got some single-cell RNA-seq counts (normalised with DESeq2) which I'd like to model with your mixture of continuous and discrete components. I'm having some problems getting any sensible p-values from zlm.SingleCellAssay and I'm not finding the intro vignette or the package documentation particularly helpful. Is there any chance you could put together a small example, as you did for the Fluidigm assay?
Thanks, John.
— Reply to this email directly or view it on GitHub https://github.com/RGLab/SingleCellAssay/issues/39.
Hi, thanks for the quick response. I've managed to work out that I had a typo in my hypothesis argument to zlm.SingleCellAssay. I had inadvertently capitalised Time, which slipped by without a warning. So sorry for the noise.
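A mis-capitalised term like this can be caught before fitting with a simple guard. The sketch below is in Python and is purely illustrative: check_hypothesis_term is a hypothetical helper, not part of SingleCellAssay, and the column names are made up.

```python
def check_hypothesis_term(term, design_columns):
    """Return `term` if it exactly matches a design column, else raise.

    Matching is case-sensitive, so "Time" does not match "time".
    """
    if term in design_columns:
        return term
    # Look for columns that differ from the requested term only by case.
    near = [c for c in design_columns if c.lower() == term.lower()]
    hint = f" Did you mean {near[0]!r}?" if near else ""
    raise ValueError(
        f"Hypothesis term {term!r} is not a column of the design matrix.{hint}"
    )


columns = ["(Intercept)", "time"]           # hypothetical design columns
check_hypothesis_term("time", columns)      # passes silently
try:
    check_hypothesis_term("Time", columns)  # capitalised by mistake
except ValueError as err:
    print(err)
```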
After fixing that, I'm getting some p-values to explore and I'm having a look at them now. I'll definitely try the empirical Bayes option, so thanks for that suggestion. Also, I don't know if you have any experience with DESeq2, but if you have an opinion on whether normalising the counts using size factors makes sense before passing them to SingleCellAssay, I would be pleased to hear it.
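For readers unfamiliar with size factors, the following Python sketch shows the median-of-ratios idea that DESeq-style normalisation is based on. It is a simplified illustration, not DESeq2's implementation; the function name and the toy counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors; counts[g][c] is gene g in cell c."""
    n_cells = len(counts[0])
    log_ratios = [[] for _ in range(n_cells)]
    for gene in counts:
        # A gene with any zero count has a geometric mean of zero,
        # so it cannot serve as a reference and is skipped.
        if min(gene) <= 0:
            continue
        log_geo_mean = sum(math.log(x) for x in gene) / n_cells
        for c, x in enumerate(gene):
            log_ratios[c].append(math.log(x) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has a non-zero count in every cell")
    return [math.exp(median(r)) for r in log_ratios]


# Toy data: the second cell was sequenced twice as deeply as the first.
counts = [[10, 20], [100, 200], [5, 10]]
factors = size_factors(counts)
normalised = [[x / f for x, f in zip(gene, factors)] for gene in counts]
print(factors)
print(normalised)
```

Because genes containing a zero are skipped, only a small fraction of genes may contribute to the estimate when zeros are frequent, as they are in single-cell data. That is worth checking before relying on the resulting factors.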
Many thanks for providing the package, John.
Just a quick question or two about the arguments to the SingleCellAssay constructor.
Thanks!
No thresholding is done by default. You could threshold, but it's difficult to say where that threshold should be. It's something we are working on.
Differential expression is on the log transformed scale. We model the data as log-normal.
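The two points above can be sketched in Python as follows. This is an illustration under stated assumptions, not the package's behaviour: split_hurdle is a hypothetical helper, the threshold value is arbitrary (as noted, choosing one is difficult), and log base 2 is an assumption.

```python
import math


def split_hurdle(values, threshold=0.0):
    """Split one gene's values into a discrete and a continuous part.

    Values at or below `threshold` are treated as not expressed.
    Returns (expressed flags, log2 of the expressed values).
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    expressed = [v > threshold for v in values]
    log_positive = [math.log2(v) for v, e in zip(values, expressed) if e]
    return expressed, log_positive


values = [0.0, 0.4, 8.0, 32.0]  # hypothetical normalised counts
print(split_hurdle(values))                  # default: only exact zeros are "off"
print(split_hurdle(values, threshold=1.0))   # small counts also treated as zero
```

With the default threshold only exact zeros fall into the discrete component, which matches "no thresholding is done by default"; a positive threshold also moves small values there.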
You can construct the object with transformed or untransformed data. The concept of "layers" in the object lets you deal with different transformations of the data.
On Sep 1, 2014 2:13 AM, "JohnReid" notifications@github.com wrote:
Just a quick question or two about the arguments to the SingleCellAssay constructor.
- Am I right in assuming they should be on an untransformed scale (i.e. not log-transformed)?
- How is zero expression fit by the model? Does it fit small expression values as 0 or do they actually have to be 0? I ask because my normalised counts look like the following (on the log scale). I have some zero counts but I also want to model those small counts as 0.
[image: index] https://cloud.githubusercontent.com/assets/1790516/4106260/c6043af4-31b7-11e4-9c18-302bdfd76e49.png
Thanks!