xinhe-lab / GSFA

R package that performs sparse factor analysis and differential gene expression discovery simultaneously on single-cell CRISPR screening data.
https://xinhe-lab.github.io/GSFA/
MIT License

Determining prior parameters #4

Open sofiedemeyer opened 1 year ago

sofiedemeyer commented 1 year ago

Dear,

I am trying your software on my own dataset. It has around 29,000 cells in which around 16,000 genes are detected in total, and 65 genes are perturbed. What is the best way to determine the prior parameters for fit0? In my last run I used prior_beta_s = 20 (with the others as specified in the vignette), but I still have a lot of genes with lfsr equal to 0, which makes it hard to identify the real impact.

Thank you in advance!

LifanLiang commented 11 months ago

This is a tricky issue. The Gaussian mixture prior in GSFA is quite robust to various settings. I think there are two issues you can check:

(1) Since your dataset has many more target genes than the one in the vignette, the first thing to try is increasing the number of factors, say k = 70. It is usually better to have too many factors than too few; too few would lead to many false positives;

(2) Significance based on lfsr may be inflated when the input data is not sufficiently Normal even after transformation. In that case you may need to retain the posterior samples of beta and calculate your own P(beta != 0).
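
As a rough illustration of point (2), here is a minimal sketch of how P(beta != 0) could be estimated from retained posterior samples. It does not use GSFA output: `beta_samples` is a simulated stand-in, and its layout (draws x perturbations x factors, with a draw being exactly 0 when the effect is excluded in that iteration) is an assumption. Check how your fitted object actually stores the samples before adapting it.

```r
# Simulated stand-in for retained posterior draws of beta (NOT GSFA output).
# Assumed layout: draws x perturbations x factors, with exact zeros whenever
# the effect is excluded in a given draw.
set.seed(1)
n_draws <- 1000
n_perturb <- 3
n_factors <- 2
beta_samples <- array(0, dim = c(n_draws, n_perturb, n_factors))

# Hypothetical example: perturbation 1 affects factor 1 in about 90% of draws.
included <- rbinom(n_draws, 1, 0.9) == 1
beta_samples[included, 1, 1] <- rnorm(sum(included), mean = 0.5, sd = 0.1)

# P(beta != 0): fraction of draws in which each effect is non-zero.
prob_nonzero <- apply(beta_samples != 0, c(2, 3), mean)

# Sign-based summary computed from the same draws:
# min(P(beta <= 0), P(beta >= 0)) = 1 - max(P(beta > 0), P(beta < 0)).
p_pos <- apply(beta_samples > 0, c(2, 3), mean)
p_neg <- apply(beta_samples < 0, c(2, 3), mean)
lfsr_emp <- 1 - pmax(p_pos, p_neg)

print(round(prob_nonzero, 3))
print(round(lfsr_emp, 3))
```

Both results are perturbation-by-factor matrices. Computing them directly from the draws lets you compare them with the reported lfsr and apply your own threshold.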