huangyh09 / brie

BRIE: Bayesian Regression for Isoform Estimate in Single Cells
https://brie.readthedocs.io
Apache License 2.0

split cells into groups and run brie2 #60

Open caochch opened 1 year ago

caochch commented 1 year ago

Dear Authors,

I have sequenced, let's say, 1000 cells. I ran brie2 in two ways: Strategy 1, run brie2-quant on all 1000 cells together; Strategy 2, run brie2-quant on cells 1-500 and cells 501-1000 separately.

Afterwards, I compared the PSI values of the same events between Strategy 1 and Strategy 2. In Strategy 1, the average PSI for cells 1-500 and cells 501-1000 was almost equal, but in Strategy 2 the averages for the two groups were clearly different. I guess the PSI distributions differ when brie2-quant is run separately for cells 1-500 and cells 501-1000, because brie2 jointly models all cells at once. Am I correct?

This kind of difference may drive the separation of cells 1-500 from cells 501-1000 if I use Seurat to cluster the cells. Is that acceptable? What would you suggest?

Looking forward to your reply and have a nice weekend. Changchang Cao, caochch@gmail.com

caochch commented 1 year ago

Also, how can I evaluate whether the single-cell sequencing depth is sufficient, or whether the calculated PSI is robust? Do you have any suggestions?

huangyh09 commented 1 year ago

Hi @caochch, thanks for the questions. Your understanding is correct.

In your Strategy 2, where the two subsets are run separately, a separate prior is learned for each run (centered primarily on the mean of that sub-population). This group-specific prior is critical for cells with low coverage, as they have few reads to support their own PSI estimates.

If you combine these two subgroups in Seurat, the separately learned priors will introduce an offset for each gene between the two groups. I would suggest using Strategy 1.
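To illustrate why a group-specific prior shifts low-coverage cells, here is a minimal sketch. It uses a simple Beta prior on PSI as a stand-in; this is an assumption for illustration only and is not BRIE2's actual model. The function name `posterior_mean_psi` and the `prior_strength` parameter are hypothetical.

```python
def posterior_mean_psi(inclusion_reads, exclusion_reads, prior_mean, prior_strength=10.0):
    """Posterior mean of PSI under a Beta prior.

    Simplified stand-in for illustration; NOT BRIE2's actual model.
    prior_strength acts like a number of pseudo-reads behind the prior.
    """
    alpha = prior_mean * prior_strength + inclusion_reads
    beta = (1.0 - prior_mean) * prior_strength + exclusion_reads
    return alpha / (alpha + beta)


# The same read counts, evaluated under two different group-level priors
# (hypothetical group means of 0.3 and 0.7).
low_a = posterior_mean_psi(1, 1, prior_mean=0.3)       # low coverage, group A prior
low_b = posterior_mean_psi(1, 1, prior_mean=0.7)       # low coverage, group B prior
high_a = posterior_mean_psi(100, 100, prior_mean=0.3)  # high coverage, group A prior
high_b = posterior_mean_psi(100, 100, prior_mean=0.7)  # high coverage, group B prior

print(f"low coverage:  {low_a:.3f} vs {low_b:.3f}")
print(f"high coverage: {high_a:.3f} vs {high_b:.3f}")
```

With only two reads, the estimate is pulled strongly toward whichever group mean the prior was learned from, so identical cells end up with different PSI values depending on the group they were run with. With 200 reads the two estimates nearly agree.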

For the sequencing depth, it is hard to comment in general, but you may use the confidence interval to assess each PSI estimate; usually, if the width of the 95% confidence interval is <0.3, the estimate can be treated as reliable. Similarly, you may filter on the number of unique reads.
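The filtering described above could be sketched as follows. The record layout (`ci_low`, `ci_high`, `n_unique_reads`) and the example values are made up for illustration and do not reflect BRIE2's output format; the 0.3 width threshold comes from the suggestion above, while the minimum read count of 10 is an arbitrary assumption.

```python
def is_reliable(ci_low, ci_high, n_unique_reads, max_ci_width=0.3, min_reads=10):
    """Keep a PSI estimate if its 95% CI is narrow and it has enough unique reads.

    max_ci_width=0.3 follows the rule of thumb above; min_reads=10 is an
    arbitrary illustrative choice, not a recommended value.
    """
    return (ci_high - ci_low) < max_ci_width and n_unique_reads >= min_reads


# Hypothetical per-cell estimates for a single splicing event.
estimates = [
    {"cell": "cell_1", "psi": 0.45, "ci_low": 0.35, "ci_high": 0.55, "n_unique_reads": 40},
    {"cell": "cell_2", "psi": 0.50, "ci_low": 0.10, "ci_high": 0.90, "n_unique_reads": 3},
    {"cell": "cell_3", "psi": 0.80, "ci_low": 0.70, "ci_high": 0.95, "n_unique_reads": 5},
]

reliable = [
    e["cell"]
    for e in estimates
    if is_reliable(e["ci_low"], e["ci_high"], e["n_unique_reads"])
]
print(reliable)
```

Here `cell_2` is dropped for its wide interval, and `cell_3` is dropped for having too few unique reads even though its interval is narrow.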

Hope it helps. Yuanhua

caochch commented 1 year ago

Thanks very much for your valuable answer.