MangiolaLaboratory / sccomp

Testing differences in cell type proportions from single-cell data.
https://stemangiola.github.io/sccomp/
GNU General Public License v3.0

Use the numerical generative process to calibrate the model #14

Open CastielZhao opened 3 years ago

CastielZhao commented 3 years ago

Does the false positive rate we claim (e.g. 0.05) actually correspond to 5% false positives on data simulated with no association and no outliers?
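A minimal sketch of what such a calibration check could look like, written in plain Python for illustration rather than with sccomp's R interface. The helper `empirical_false_positive_rate` is hypothetical, and the uniform random draws are only a stand-in for whatever significance measure the method reports on null data; a well-calibrated method should flag roughly the nominal fraction of null tests.

```python
import random


def empirical_false_positive_rate(p_values, alpha=0.05):
    """Fraction of null tests that were (wrongly) called significant."""
    calls = [p < alpha for p in p_values]
    return sum(calls) / len(calls)


# Assumption: under the null, a calibrated method yields significance
# values that are uniform on [0, 1]. These draws stand in for real output.
rng = random.Random(0)
null_p_values = [rng.random() for _ in range(100_000)]

fpr = empirical_false_positive_rate(null_p_values, alpha=0.05)
print(f"nominal 0.05, empirical {fpr:.4f}")
```

In a real check, `null_p_values` would be replaced by the results obtained from the simulated no-association datasets; an empirical rate far from the nominal one would indicate miscalibration.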

Calibration:

stemangiola commented 3 years ago

Calibrate inference of associations

CastielZhao commented 3 years ago

"Setup coefficient to have same intercept (for simplicity), and zero slope" Are there any other constraints on the coefficients, e.g. integer values or a particular range? Also, I assume that "zero slope" means coeff = (beta0, beta0, ..., beta0; beta1, beta1, ..., beta1), i.e. the first column repeats 20 times.

stemangiola commented 3 years ago

> "Setup coefficient to have same intercept (for simplicity), and zero slope" Are there any other constraints on coefficient? i.e. integer ? Range ?

Run the code on the homepage of this repository and you will see which coefficients you get for a real dataset. You can take the range from those (except the intercept, which should be zero for this test).

stemangiola commented 3 years ago

Whether the coefficients are integers or not makes no difference. The matrix multiplication between the design matrix and the coefficient matrix works the same way in both cases.
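To illustrate the point, here is a small sketch in plain Python (not sccomp's R code). The design matrix, the coefficient value 0.5, the number of cell groups, and the softmax link used to turn the linear predictor into proportions are all assumptions made for the example. With the same intercept for every group and a zero slope, every sample ends up with identical proportions, whether the coefficients are integers or not.

```python
import math


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def softmax(row):
    """Map a row of real numbers to proportions that sum to 1."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Design matrix: one row per sample, columns = (intercept, covariate).
design = [[1, 0], [1, 0], [1, 1], [1, 1]]

# Coefficient matrix: row 0 = intercepts, row 1 = slopes, one column per group.
# Hypothetical values: same non-integer intercept everywhere, zero slope.
n_groups = 3
coefficients = [[0.5] * n_groups, [0.0] * n_groups]

linear_predictor = matmul(design, coefficients)
proportions = [softmax(row) for row in linear_predictor]
print(proportions)
```

Replacing 0.5 with an integer such as 1 changes nothing in how the product is computed, and with a shared intercept the resulting proportions stay the same.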

CastielZhao commented 3 years ago

Hi Stefano,

I have successfully created 100 data frames with my function. To detect the change, do I need to use the sccomp library, or should I find another way to do that?

stemangiola commented 3 years ago

> Hi Stefano,
>
> I have successfully created 100 data frames from my function. To detect the change, do I need to use sccomp library? Or I shall find out a way to do that ?

Yes, run sccomp on your datasets. See the example dataset in the GitHub README. Start with a few datasets and try to draw descriptive statistics from the results.

CastielZhao commented 3 years ago

Which function in sccomp is used for detecting variation?

CastielZhao commented 3 years ago

I noticed the function:

    res = counts_obj %>% sccomp_glm( ~ type, sample, cell_group, count, approximate_posterior_inference = FALSE )

When analyzing multiple data frames, do I need to merge the data frames, or do I specify the different data frames through "cell_group" above? Also, type=category, count=count, sample=subject in our dictionary, right?

stemangiola commented 3 years ago

If you analyse different studies, no: you analyse them independently. I don't know what you mean by data frames. A data frame can be anything. Please be more precise.

> Also, type=category, count=count, sample=subject in our dictionary, right?

yes

CastielZhao commented 3 years ago

> if you analyse different studies no, you analyse them independently. I don't know what you mean by data frames. Data frame can be anything. Please be more precise.
>
> > Also, type=category, count=count, sample=subject in our dictionary, right?
>
> yes

By data frames, I mean the simulated data frames output by my numerical generative process.

stemangiola commented 3 years ago

One data frame includes M categories and N subjects.

Another data frame includes M categories and N subjects.

A single subject constitutes a very small dataset (size = 1) that cannot be used for regression.
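As a rough sketch of the data layout discussed above, the following plain-Python example builds a few simulated datasets, each with M categories and N subjects in long format, so that each one can be analysed independently. The column names `sample`, `type`, `cell_group` and `count` follow the call quoted earlier in this thread; the function `simulate_dataset`, the plain multinomial sampling, and all numeric values are assumptions for illustration only.

```python
import random


def simulate_dataset(n_subjects, proportions, cells_per_subject, rng):
    """Return one long-format dataset: one row per (subject, cell group)."""
    groups = [f"group_{j}" for j in range(len(proportions))]
    rows = []
    for i in range(n_subjects):
        # The condition is only a label here: proportions do not depend on it,
        # which corresponds to the no-association (zero slope) setting.
        condition = "treated" if i % 2 else "control"
        # Assumption: cells are assigned to groups by plain multinomial sampling.
        drawn = rng.choices(groups, weights=proportions, k=cells_per_subject)
        for g in groups:
            rows.append(
                {
                    "sample": f"subject_{i}",
                    "type": condition,
                    "cell_group": g,
                    "count": drawn.count(g),
                }
            )
    return rows


rng = random.Random(1)
# Three independent datasets, each with N = 6 subjects and M = 3 categories.
datasets = [simulate_dataset(6, [0.5, 0.3, 0.2], 1000, rng) for _ in range(3)]
print(len(datasets), "datasets with", len(datasets[0]), "rows each")
```

Each element of `datasets` would then be passed to the analysis on its own, rather than merged with the others.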