TobiTekath / DTUrtle

Perform differential transcript usage (DTU) analysis of bulk or single-cell RNA-seq data. See documentation at:
https://tobitekath.github.io/DTUrtle
GNU General Public License v3.0

does # of cells matter when determining DTU between 2 clusters? #25

Open m-noonan opened 6 months ago

m-noonan commented 6 months ago

Hello, I was curious whether the number of cells used in a comparison matters when performing the DTU analysis. For example, I compared a cluster of ~1000 cells to a cluster of ~100 cells and got many more significant genes and transcripts (>300) than when I compared a cluster of ~1000 cells to a cluster of ~900 cells (<50 significant genes/transcripts). I can't tell whether it's the biology or the statistics that is causing this. Thanks for any insight you can provide!

TobiTekath commented 3 months ago

Hi @m-noonan, please excuse the very late reply.

Theoretically, a higher number of cells should give the analysis more statistical power and could therefore lead to more significant genes and transcripts. But of course the results also depend heavily on which groups of cells you are comparing, as well as on the parameters and thresholds of your analysis.
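To illustrate the power argument only, here is a minimal sketch using a plain two-proportion z statistic. This is not the statistical model DTUrtle uses; it treats every cell as one independent observation, and the function name `two_proportion_z` and the 60% vs. 50% transcript usage are made-up assumptions for illustration. The point is just that, for the same underlying difference, the test statistic grows when the smaller group gets larger.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference of two proportions (pooled standard error)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Same hypothetical effect (60% vs. 50% usage of one transcript),
# evaluated for the two group-size combinations from the question.
z_small = two_proportion_z(0.6, 1000, 0.5, 100)
z_large = two_proportion_z(0.6, 1000, 0.5, 900)

print(f"1000 vs 100 cells: z = {z_small:.2f}")
print(f"1000 vs 900 cells: z = {z_large:.2f}")
```

Under these simplified assumptions, the 1000 vs. 900 comparison yields the larger statistic, so group size alone would not explain more hits in the 1000 vs. 100 comparison.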

Just as a sanity check: if you randomly subsample your ~900-cell cluster to, e.g., only 200 cells and repeat the comparison, you should see fewer, not more, significant genes and transcripts.
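A minimal sketch of the subsampling step, assuming the cells of the cluster are available as a list of barcodes or cell IDs. The helper `subsample_cells` and the placeholder IDs are hypothetical and not part of DTUrtle. Drawing several subsets with different seeds helps separate a stable trend from the noise of a single random draw.

```python
import random

def subsample_cells(cell_ids, n, seed=0):
    """Return n cell IDs drawn at random without replacement (reproducible via seed)."""
    cell_ids = list(cell_ids)
    if n > len(cell_ids):
        raise ValueError("cannot draw more cells than the cluster contains")
    rng = random.Random(seed)
    return rng.sample(cell_ids, n)

# Placeholder IDs standing in for the ~900-cell cluster.
cluster_b = [f"cell_{i}" for i in range(900)]

# Five independent 200-cell subsets, one per seed.
subsets = [subsample_cells(cluster_b, 200, seed=s) for s in range(5)]
```

Each subset would then replace the full ~900-cell cluster in an otherwise unchanged DTU comparison against the ~1000-cell cluster, and the number of significant genes and transcripts can be compared across runs.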

m-noonan commented 3 months ago

Got it, thank you!