Open thecatgonewithwid opened 2 years ago
Whether to balance your data is an interesting question. We never benchmarked it, but the premise behind modelling the counts is that imbalance of this kind should not be an issue. With regard to marker genes, in theory the same principles should hold, so you could use it for this. However, it might make your results difficult to compare with existing atlases of your tissue.
Hi,
Many thanks to you and your team for your great contributions to single-cell DE analysis and for making this wonderful package!
I have been using Libra to run DE analysis on my own scRNA-seq dataset. However, I have a few questions about how the data types described below influence the final statistical power to find real DE genes (pseudobulk methods).
Type one: Imbalanced cell numbers, where the number of cells of a certain cell type varies dramatically between biological replicates.
For example:
Data like this:

| Biological replicate | Cell number | Label |
| --- | --- | --- |
| SampleA | 2000 | control |
| SampleB | 3000 | control |
| SampleC | 4000 | control |
| SampleD | 1000 | case |
| SampleE | 2000 | case |
| SampleF | 1500 | case |
Question: Can I choose a cell number, for instance 1000 or even smaller, as the new cell number for every biological replicate, and then resample every biological replicate to make a balanced dataset for pseudobulk?
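The resampling described in this question can be sketched roughly as follows. This is only an illustration in plain Python, not Libra's implementation or API: the helper `pseudobulk`, its `n_cells` and `seed` parameters, and the random toy counts are all assumptions made for the example. Only the sample names and cell numbers come from the table above.

```python
import random
from collections import defaultdict


def pseudobulk(counts, replicate_ids, n_cells=None, seed=0):
    """Sum per-cell gene counts into one profile per biological replicate.

    Hypothetical helper for illustration only (not part of Libra).
    counts        : list of per-cell count vectors (one list of ints per cell)
    replicate_ids : replicate label for each cell, same length as counts
    n_cells       : if given, draw this many cells per replicate without
                    replacement before summing (the "balancing" step)
    """
    rng = random.Random(seed)
    cells_by_replicate = defaultdict(list)
    for cell, rep in zip(counts, replicate_ids):
        cells_by_replicate[rep].append(cell)

    profiles = {}
    for rep, cells in cells_by_replicate.items():
        if n_cells is not None:
            if n_cells > len(cells):
                raise ValueError(f"{rep} has only {len(cells)} cells")
            cells = rng.sample(cells, n_cells)
        # Column-wise sum: one total count per gene for this replicate.
        profiles[rep] = [sum(gene) for gene in zip(*cells)]
    return profiles


# Toy data: cell numbers from the table, counts are random placeholders.
cells_per_replicate = {
    "SampleA": 2000, "SampleB": 3000, "SampleC": 4000,
    "SampleD": 1000, "SampleE": 2000, "SampleF": 1500,
}
n_genes = 5
toy_rng = random.Random(1)
replicate_ids = [rep for rep, n in cells_per_replicate.items() for _ in range(n)]
counts = [[toy_rng.randint(0, 4) for _ in range(n_genes)] for _ in replicate_ids]

full = pseudobulk(counts, replicate_ids)                    # all cells
balanced = pseudobulk(counts, replicate_ids, n_cells=1000)  # 1000 cells each
```

Downsampling discards cells, so each balanced profile has per-gene totals no larger than the corresponding full profile. Whether that trade-off helps or hurts statistical power is exactly the open question here, and this sketch does not answer it.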
Type two: DE analysis between different cell types.
Question: As I understand it, pseudobulk methods are better than single-cell methods for DE analysis within a given cell type. Are they also a good choice for DE analysis between different cell types (to find important marker genes)?
Please forgive my poor English and the awkward formatting of my questions. I hope to hear from you!
Thanks, Yufeng