kaz-yos / regmedint

R implementation of effect measure modification-extended regression-based closed-formula causal mediation analysis
https://kaz-yos.github.io/regmedint/

Categorical exposure using regmedint #4

Open Cmaj7sharp11 opened 3 years ago

Cmaj7sharp11 commented 3 years ago

Hi, using regmedint is it possible to compute the effects comparing two levels of a categorical exposure variable which has more than two categories? Thanks.

kaz-yos commented 3 years ago

As of now, you need to subset the data to the two groups of interest.

Nume22 commented 2 years ago

Dear Kazuki,

Thank you for sharing your great knowledge. I read the tutorial paper that was published last month. Is it still not possible to compute the effects for each category of an exposure variable that has more than 3 categories? I want to use the categorical exposure as is, so I put it in the model without dummies. I prepared an exp_level for each exposure category and used exp_level as a1 in the regmedint function. The mediator and outcome are binary. For the covariates, I prepared dummies. I am not sure whether this is OK. Do I need to prepare dummies for the exposure?

I look forward to hearing back from you.

Thanks,
Num

einsley1993 commented 2 years ago

@Nume22 Did you mean you have an exposure variable that has 3 levels? To create a causal contrast, you need to compare two exposure levels. There is no way to compare three levels simultaneously. Or did you mean something else by "exposure category"?

Nume22 commented 2 years ago

Dear Yi Li,

Thank you for your reply.

That's correct. My question is whether the package can process categorical exposures (inbuilt dummification process).

For example, exposure levels are 1 2 3 4, we use 1 as the reference, and contrast every other level against 1.

Thank you, Num

einsley1993 commented 2 years ago

@Nume22 Then you need to compare one other level vs. level 1 at a time, and run the regmedint() function three times (for levels 2, 3, and 4, respectively). The package itself doesn't have a built-in dummification process, so you would need to do this step manually before running regmedint().
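The one-contrast-at-a-time workflow can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated, the column names (`expo`, `c1`, `m`, `y`, `a_bin`) and the helper `make_pairwise_data()` are hypothetical, and the subsetting step follows the earlier advice to extract the two groups of interest. The regmedint() call is skipped when the package is not installed.

```r
set.seed(1)
n <- 400

# Simulated stand-in data; all column names are hypothetical.
dat <- data.frame(
  expo = sample(1:4, n, replace = TRUE),  # 4-level exposure, level 1 = reference
  c1   = rnorm(n)                         # one continuous covariate
)
dat$m <- rbinom(n, 1, plogis(-0.5 + 0.3 * dat$expo + 0.2 * dat$c1))
dat$y <- rbinom(n, 1, plogis(-1 + 0.2 * dat$expo + 0.5 * dat$m + 0.2 * dat$c1))

# Hypothetical helper: keep the reference level and one comparison level,
# then recode the exposure as 0 (reference) / 1 (comparison).
make_pairwise_data <- function(data, avar, ref, level) {
  sub <- data[data[[avar]] %in% c(ref, level), , drop = FALSE]
  sub$a_bin <- as.integer(sub[[avar]] == level)
  sub
}

# One data set per contrast: 2 vs 1, 3 vs 1, 4 vs 1.
pairwise <- lapply(2:4, function(lv) {
  make_pairwise_data(dat, avar = "expo", ref = 1, level = lv)
})
names(pairwise) <- paste0("level_", 2:4, "_vs_1")

# One regmedint() fit per contrast (binary mediator and binary outcome here).
if (requireNamespace("regmedint", quietly = TRUE)) {
  fits <- lapply(pairwise, function(d) {
    regmedint::regmedint(
      data = d, yvar = "y", avar = "a_bin", mvar = "m", cvar = "c1",
      a0 = 0, a1 = 1, m_cde = 0, c_cond = 0,
      mreg = "logistic", yreg = "logistic",
      interaction = TRUE, casecontrol = FALSE
    )
  })
  print(lapply(fits, summary))
}
```

Each element of `pairwise` contains only the reference level and one comparison level, so the three fits use different subsets of the data and their estimates are not based on a common sample.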

Nume22 commented 2 years ago

Thank you for your prompt reply. I will try it. Thank you, Num

lijiaqi-github commented 1 year ago

@einsley1993 Dear Dr. Li,

I frequently encounter situations involving categorical exposure. Based on your answers, I have come to understand that exposure is better suited as a binary variable, and when exposure has multiple levels, it requires variable processing.

My question is: should the mediator also be a binary variable (not 3 or more levels)? How should I proceed when I have a mediator with multiple levels? By using dummy variables?

And I want to confirm my understanding of the situation where I have a categorical exposure with multiple levels. For example, if the exposure "x" has 3 levels (0, 1, and 2), I need to create dummy variables such as...

x   dummy_x2   dummy_x3   # x = 0 is the reference
1   1          0
2   0          1

How should I input the dummy variable? Should I run "regmedint" with the argument "avar = dummy_x2" and next "avar = dummy_x3" one by one? When running with "avar = dummy_x2", should I include "dummy_x3" as an adjusted variable?

Or is it best to convert categorical variables into dichotomous variables?

Thank you in advance.

einsley1993 commented 1 year ago

@lijiaqi-github Yes, the exposure (avar) works better as a binary or continuous variable than as a categorical one (>=3 levels).

The mediator can only be a binary or continuous variable, because the only supported mediator models are linear and logistic regression.

The dummy variables are only for covariates (cvar and the corresponding c_cond), and for those, you will need to manually create dummy variables, e.g. cvar = c("dummy_c1", "dummy_c2", ..., "dummy_c5"), and the corresponding c_cond = c(1, 0, ..., 0). avar and mvar don't take dummy variables.
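As a small illustration of the manual dummy step for covariates, the sketch below expands one hypothetical 3-level factor (`smoke`) with base R's model.matrix() and builds matching `cvar` and `c_cond` vectors. The variable and dummy names are assumptions made for this example only.

```r
# Hypothetical data: one 3-level categorical covariate stored as a factor.
dat <- data.frame(
  smoke = factor(c("never", "former", "current", "never", "current"),
                 levels = c("never", "former", "current"))
)

# model.matrix() expands the factor into indicator columns.
# Dropping the intercept column leaves one dummy per non-reference level
# ("never" is the reference because it is the first factor level).
dummies <- model.matrix(~ smoke, data = dat)[, -1, drop = FALSE]
colnames(dummies) <- c("dummy_former", "dummy_current")
dat <- cbind(dat, dummies)

# Names passed as cvar, and the covariate pattern to condition on.
cvar <- c("dummy_former", "dummy_current")
# c_cond = c(1, 0) conditions on smoke == "former";
# c(0, 0) would condition on the reference level "never".
c_cond <- c(1, 0)
```

The order of `c_cond` has to match the order of `cvar`, since each value is the level at which the corresponding covariate is held.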

Whether to convert a categorical variable into a binary variable depends on the contrast you want to make. Re the scenario you described: if the exposure has 3 levels and cannot be treated as continuous (assuming a linear effect) or merged into 2 levels (a binary variable), it is not recommended to simply include "dummy_x3" as an adjustment variable, because "x" is the exposure, not a confounder. So in this case, the regmedint package doesn't support this kind of modelling.

lijiaqi-github commented 1 year ago

Thank you very much for your prompt reply, Dr. Li. @einsley1993

That is to say, the regmedint package is unable to handle an exposure with multiple levels, except by converting it into a dichotomous variable.

In fact, I have recently been searching for a package or SAS macro that can be used for mediation analysis with a categorical exposure variable. However, most of the method papers and available packages/SAS macros I found do not support this type of analysis (they basically support only dichotomous exposures).

Since I am not well-versed in mathematical statistics, I am wondering whether this means that mediation analysis to date only addresses dichotomous exposure variables in survival settings. Or do you know of any methods or packages capable of handling mediation analysis with a categorical exposure?

Besides (sorry for asking so many questions), if the exposure is continuous, how should I set a0, a1, and m_cde? For example, if the exposure ranges from 0 to 10 (0 is the reference) and the mediator ranges from 1 to 5 (1 is the reference), how should I set a0, a1, and m_cde?

Thank you again for your kind help.

einsley1993 commented 1 year ago

@lijiaqi-github The R packages that support more types of exposure and mediator variables are 'medflex' and 'mediation'. They can handle categorical exposures and mediators, but the underlying estimation methods are different from Valeri & VanderWeele (2013), so they are not equivalent to the SAS macro.

Re your question about continuous exposure, a0 and a1 are two levels you want to compare. For example, you can compare the BMI at 25th percentile and the 75th percentile, if BMI is the exposure. m_cde is the level of mediator you want to condition on for CDE estimation. These levels all depend on what levels are of interest in the subject matter.
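A rough sketch of how these arguments might be set for a continuous exposure is shown below. It assumes simulated data with hypothetical columns (`bmi`, `age`, `m`, `y`), uses the 25th and 75th percentiles of BMI as a0 and a1, and picks arbitrary illustrative values for m_cde and c_cond. The regmedint() call is skipped when the package is not installed.

```r
set.seed(2)
n <- 500

# Simulated stand-in data; all column names are hypothetical.
dat <- data.frame(
  bmi = rnorm(n, mean = 26, sd = 4),   # continuous exposure
  age = rnorm(n, mean = 50, sd = 10)   # continuous covariate
)
dat$m <- rbinom(n, 1, plogis(-3 + 0.1 * dat$bmi))                  # binary mediator
dat$y <- 1 + 0.05 * dat$bmi + 0.5 * dat$m + 0.01 * dat$age + rnorm(n)  # continuous outcome

# a0 and a1: the two exposure levels to compare (25th vs 75th percentile).
a_levels <- quantile(dat$bmi, probs = c(0.25, 0.75), names = FALSE)
a0 <- a_levels[1]
a1 <- a_levels[2]

# m_cde: mediator level fixed for the controlled direct effect (illustrative choice).
m_cde <- 0
# c_cond: covariate level to condition on (illustrative choice: median age).
c_cond <- median(dat$age)

if (requireNamespace("regmedint", quietly = TRUE)) {
  fit <- regmedint::regmedint(
    data = dat, yvar = "y", avar = "bmi", mvar = "m", cvar = "age",
    a0 = a0, a1 = a1, m_cde = m_cde, c_cond = c_cond,
    mreg = "logistic", yreg = "linear",
    interaction = TRUE, casecontrol = FALSE
  )
  print(summary(fit))
}
```

The exposure column stays continuous in the data; only the two comparison levels are passed through a0 and a1.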

lijiaqi-github commented 1 year ago

Thank you very much, Dr. Li. @einsley1993 I have tried the 'medflex' and 'mediation' packages. 'medflex' does not seem to support the survival setting, and 'mediation' gives errors when used with the survreg function. As mentioned in section 8, additional functionalities for dealing with exposure-induced confounding and multiple mediators are intended to be added to the package in the future, as well as extensions for survival models.

For a continuous exposure, I can compare two values of the exposure (e.g., BMI=20 and BMI=25). That seems like selecting two values of the continuous exposure and converting it into a binary variable. So a binary or similar variable is the main use case for your package, is that right?

Thank you very much. I learned a lot.

einsley1993 commented 1 year ago

@lijiaqi-github Comparing two levels of exposure is not the same as converting the exposure to a binary variable. You just pick 2 levels for comparison, but you are still using the continuous exposure values in the original dataset. Also, it doesn't mean that you only compare the 2 subgroups whose exposure equals the 2 levels you selected; you still use the whole dataset, rather than excluding people whose exposure differs from the 2 levels you selected. It is the same as in causal studies in general, where you compare the potential outcomes under a = 1 and a = 0 (i.e. making the whole population exposed vs. making the whole population unexposed), not the observed association (E[Y|A=1] vs. E[Y|A=0]).
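To make the "whole dataset" point concrete, here is a minimal sketch that uses plain regression standardization with lm(), not regmedint's closed-form mediation formulas. The data are simulated and the names (`bmi`, `conf`, `y`) are hypothetical. The model is fitted on all rows, and then everyone's exposure is set to 20 and to 25 in turn.

```r
set.seed(3)
n <- 1000

# Simulated data: `conf` is a confounder of the bmi-outcome relationship.
conf <- rnorm(n)
bmi  <- 25 + 2 * conf + rnorm(n, sd = 3)
y    <- 10 + 0.5 * bmi + 1.5 * conf + rnorm(n)
dat  <- data.frame(y = y, bmi = bmi, conf = conf)

# The outcome model is fitted on every row, with bmi kept continuous.
fit <- lm(y ~ bmi + conf, data = dat)
n_used <- nobs(fit)

# Almost no one has a BMI of exactly 20 or 25, so subsetting would leave no data.
n_exact <- sum(dat$bmi %in% c(20, 25))

# Set everyone's exposure to a0 = 20, then to a1 = 25, and average the predictions.
mean_y_a0 <- mean(predict(fit, newdata = transform(dat, bmi = 20)))
mean_y_a1 <- mean(predict(fit, newdata = transform(dat, bmi = 25)))
contrast  <- mean_y_a1 - mean_y_a0

cat("rows used in the fit:", n_used, "\n")
cat("rows with bmi exactly 20 or 25:", n_exact, "\n")
cat("estimated contrast (25 vs 20):", round(contrast, 3), "\n")
```

In this linear model the contrast equals 5 times the fitted BMI coefficient, which shows that a0 and a1 only choose where the fitted model is evaluated, not which observations enter the analysis.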

lijiaqi-github commented 1 year ago

Thank you for explaining so carefully, Dr. Li. @einsley1993 I understand now. I apologize for my earlier incorrect statement. I just found out that tableone is also your group's work. It is also great. I look forward to the next work from you and your colleagues.