I am hoping to use glmGamPoi to identify differentially expressed genes (DEGs) across cell types in a regression analysis where Y is continuous rather than discrete. I have come up with the following approach and would like to validate it with you, as I couldn't find an example with a continuous covariate in the documentation. I have been using a similar approach where Y is discrete (disease vs. control), which has been working great, and I just want to verify my approach for the continuous case.
First, imagine a dataset with the following columns:
Y - the continuous variable I want to perform the regression analysis on
celltype - A field to identify cell types
patient_ID - Patient identifier, to be used for pseudobulk analysis
sex - Patient sex, we want to account for sex in our design matrix
We would then build the model (fit) as follows. Note that I leave reference_level as NULL: in a disease vs. control comparison I would usually set it to Control, but since Y is continuous here I assumed NULL was the correct choice. Can you confirm?
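For illustration, here is a minimal sketch of what such a fit could look like on simulated data. The design formula `~ Y * celltype + sex`, the manual pseudobulk step (summing counts per patient and cell type in base R), and all object names are assumptions made for this example, not something confirmed by the glmGamPoi documentation for this use case. Note that glmGamPoi fits a Gamma-Poisson GLM, so Y enters as a continuous covariate in the design rather than as the response.

```r
library(glmGamPoi)

set.seed(1)
n_genes <- 50
patients <- paste0("P", 1:12)

# Hypothetical per-cell metadata with the four columns described above
col_data <- data.frame(
  patient_ID = rep(patients, each = 20),
  celltype = rep(c("A", "B"), times = 120),
  stringsAsFactors = FALSE
)
idx <- match(col_data$patient_ID, patients)
col_data$Y <- rnorm(12)[idx]                 # one continuous value per patient
col_data$sex <- rep(c("F", "M"), 6)[idx]     # one sex per patient

# Simulated counts (genes x cells); real data would replace this
counts <- matrix(
  rnbinom(n_genes * nrow(col_data), mu = 5, size = 2),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), NULL)
)

# Manual pseudobulk: sum counts over cells of the same patient and cell type
group <- paste(col_data$patient_ID, col_data$celltype, sep = "_")
pb_counts <- t(rowsum(t(counts), group))     # genes x (patient, celltype)
pb_col_data <- col_data[match(colnames(pb_counts), group), ]

# Y is a continuous covariate, so no reference level is needed for it;
# reference_level = NULL is the default and is spelled out only for clarity
fit <- glm_gp(
  pb_counts,
  design = ~ Y * celltype + sex,
  col_data = pb_col_data,
  reference_level = NULL
)
summary(fit)
```

With this design, the `Y` coefficient is the slope in the first (alphabetical) cell type, and the `Y:celltype...` interaction terms are the differences in slope for the other cell types.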
Then I would identify DEGs per cell type as follows (running for each cell type):
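A minimal sketch of how the per-cell-type test might look, again on simulated data that is already aggregated to one column per patient and cell type. The interaction design and the contrast names (`Y`, `Y:celltypeB`) are assumptions that follow from the hypothetical formula used here; with a different design the column names would differ. The idea is that the slope of Y in the reference cell type is the `Y` coefficient, while the slope in another cell type is `Y` plus the matching interaction term.

```r
library(glmGamPoi)

set.seed(2)
n_genes <- 40
patients <- paste0("P", 1:12)

# Hypothetical pseudobulk-level metadata: one row per patient x cell type
pb_col_data <- expand.grid(
  patient_ID = patients,
  celltype = c("A", "B"),
  stringsAsFactors = FALSE
)
idx <- match(pb_col_data$patient_ID, patients)
pb_col_data$Y <- rnorm(12)[idx]
pb_col_data$sex <- rep(c("F", "M"), 6)[idx]

# Simulated pseudobulk counts (genes x samples); real data would replace this
pb_counts <- matrix(
  rnbinom(n_genes * nrow(pb_col_data), mu = 50, size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), NULL)
)

fit <- glm_gp(pb_counts, design = ~ Y * celltype + sex, col_data = pb_col_data)

# Effect of Y in the reference cell type "A"
de_A <- test_de(fit, contrast = Y)

# Effect of Y in cell type "B": main effect plus interaction
de_B <- test_de(fit, contrast = Y + `Y:celltypeB`)

# Genes ranked by p-value for cell type "B"
head(de_B[order(de_B$pval), ])
```

Here the `lfc` column should be read as the log2 fold change in expression per unit increase in Y, so its scale depends on the units of Y.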