const-ae / glmGamPoi

Fit Gamma-Poisson Generalized Linear Models Reliably

results tables identical for all cell populations #62

lichtobergo closed this issue 5 months ago

lichtobergo commented 5 months ago

Hello, I have a problem which I think per se has nothing to do with bugs in the package but more with me not knowing enough about statistics combined with user mistakes. But I don't know where else to turn to. So if this issue is not appropriate, feel free to just close it.

I was trying to do a DE analysis of single-nucleus data from the CNS with glmGamPoi. I did standard pre-processing and cell type annotation. I have 3 treatment groups and 2 different tissues (spinal cord, brain), which I combined into one group variable (exp_group). I made the pseudobulk object as follows:

reduced_sce <- glmGamPoi::pseudobulk(
  data = sce_anno,
  group_by = vars(sample.name, condition = exp_group, celltype.main), # sample.name is the replicate
  n_cells = n()
)

Then I fitted the model:

fit <- glm_gp(
  data = reduced_sce,
  design = ~ condition + celltype.main, 
  size_factors = "deconvolution",
  verbose = TRUE
)

And then did the DE test:

test_de(
  fit = fit,
  contrast = cond(celltype.main = "Neuron", condition = "GM.TovaCar.AAV") -
    cond(celltype.main = "Neuron", condition = "GM.Tova.AAV")
)

When I looked at the results, I noticed that the tables were exactly identical for all cell types. My question now is: where is my mistake? I would very much appreciate any help.

Best, Michael

const-ae commented 5 months ago

Hi Michael,

thanks for reaching out. The problem is that you specify the design with a +. This means the model has coefficients for the condition and coefficients for the cell type, but the condition effect is assumed to be the same in every cell type. In your contrast the cell type term therefore cancels out, which is why the results are identical for all cell types. If I understood your description correctly, you care about the cell type-specific condition effect, so that there is one coefficient for each cell type and condition pair. This is called an interaction effect and you specify it with a * (i.e., the design is ~ condition * celltype.main).
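To make the difference between + and * concrete, here is a minimal sketch in base R. It does not use glmGamPoi or the data from this issue. The toy table and its labels ("ctrl", "treated", "Astrocyte") are made up for illustration, and contrast_for is a hypothetical helper that writes the within-cell-type contrast as a difference of design matrix rows.

```r
# Toy sample table: every condition x cell type combination once.
# The labels are hypothetical and only serve the illustration.
toy <- expand.grid(
  condition = c("ctrl", "treated"),
  celltype.main = c("Neuron", "Astrocyte"),
  stringsAsFactors = TRUE
)

# Additive design (+) versus interaction design (*)
X_add <- model.matrix(~ condition + celltype.main, data = toy)
X_int <- model.matrix(~ condition * celltype.main, data = toy)

# Hypothetical helper: treated-minus-ctrl contrast within one cell type,
# expressed as the difference of the two matching design matrix rows.
contrast_for <- function(X, celltype) {
  treated <- X[toy$celltype.main == celltype & toy$condition == "treated", ]
  ctrl <- X[toy$celltype.main == celltype & toy$condition == "ctrl", ]
  treated - ctrl
}

# Additive design: the cell type column cancels, so both contrasts coincide.
print(contrast_for(X_add, "Neuron"))
print(contrast_for(X_add, "Astrocyte"))

# Interaction design: the extra interaction column makes the contrasts differ.
print(contrast_for(X_int, "Neuron"))
print(contrast_for(X_int, "Astrocyte"))
```

With the additive design, both calls return the same contrast vector, so any test built on it gives the same table for every cell type. With the interaction design, the contrast for the non-reference cell type also picks up the interaction coefficient.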

Otherwise, everything looks good :)

Best, Constantin

lichtobergo commented 5 months ago

Hi Constantin, Thank you for your response! Of course, that was my mistake. As I assumed, lacking stats knowledge was the cause. Now I get results that make much more sense after I removed cell types which were not present in all conditions.

You wrote: "Otherwise, everything looks good :)"

Of course, as I was following your example quite closely :) Thanks again, Constantin!

Best, Michael
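For readers who run into the same situation: with an interaction design, a cell type that is missing from one condition can leave the design matrix without full rank, which is presumably why removing those cell types helped here. One possible way to find the cell types present in every condition is sketched below in base R. The data frame col_data and its contents are invented for illustration. In a real analysis the annotation would likely come from the column data of the pseudobulk object, assuming it carries the condition and celltype.main columns.

```r
# Hypothetical pseudobulk sample annotation (not the data from this issue).
# "Microglia" is deliberately missing from condition "C".
col_data <- data.frame(
  condition = c("A", "A", "B", "B", "C", "C", "A", "B"),
  celltype.main = c("Neuron", "Astrocyte", "Neuron", "Astrocyte",
                    "Neuron", "Astrocyte", "Microglia", "Microglia")
)

# Cross-tabulate cell type against condition: rows are cell types.
counts <- table(col_data$celltype.main, col_data$condition)

# Keep only cell types with at least one sample in every condition.
complete_celltypes <- rownames(counts)[rowSums(counts > 0) == ncol(counts)]

# Logical index of the samples to keep before fitting the model.
keep <- col_data$celltype.main %in% complete_celltypes
print(complete_celltypes)
print(table(keep))
```

The logical vector keep could then be used to subset the columns of the pseudobulk object before fitting the model.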

const-ae commented 5 months ago

Happy to help :)