AlexiaJM / LEGIT

An R package for the Latent Environmental & Genetic InTeraction (LEGIT) model
GNU General Public License v3.0

Issue with GxE_interaction_test and lme4 - same AIC/BIC for competing models #5

Open RobbieRECAP opened 6 days ago

RobbieRECAP commented 6 days ago

Hi there,

Sorry, this is my first time posting an issue for an R package so I am unsure what is standard protocol. If you need me to submit a full markdown file then I could probably do that.

I am trying to use the GxE_interaction_test() command. I have combined data from 4 different cohorts. I have one measure of the environment and one measure of their temperament (the "genes" component for my analysis).

Originally I used GxE_interaction_test() without accounting for cohort effects (first image; this all looks fine to me). This was partly because I had standardised each cohort's data before pooling, so that each had a mean of 0 and an SD of 1, but also because I couldn't get the lme4=TRUE option to work correctly. I have now had feedback from peer reviewers, who are asking for a linear mixed model.

The problem is that when using lme4=TRUE I get exactly the same AIC/BIC values for multiple competing models (second image), which I do not think should be possible. In particular, it looks like the main effects of the genes and environment are being included in all models: the intercept-only and no-interaction models have the same AIC/BIC values as the G+E model, and the coefficients for these models seem to confirm this (third image).

Am I doing something wrong in specifying the formula? I have tried modifying the specification of the random effects (ideally I would only test for random slopes by cohort) and I have tried leaving out the covariates, but I keep getting the same AIC for all models without interactions when using lme4=TRUE.

Any help would be much appreciated!

Rob

[Image: Annotation 2024-09-17 095855]

[Image: Annotation 2024-09-17 100007]

[Image: Annotation 2024-09-17 110523]

AlexiaJM commented 6 days ago

Hi Rob,

Yes, this is OK. There is no protocol; you can ask your questions here.

The issue is that you include "G*E" in your formula F. As the name states, you need to provide a "formula_noGxE", and here the "G*E" term causes R to include the G, E, and G:E terms. That is why the models that add E or G to the formula don't change anything and all BICs are the same.

Here G*E means G + E + G:E, which is what you use for the lme4 random effects, so you can keep it there. Just make sure to remove the "G*E" from the fixed effects: instead of "y ~ G*E + (0 + G*E | blabla)", use "y ~ 1 + (0 + G*E | blabla)" or "y ~ (0 + G*E | blabla)".
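The expansion described above can be checked in base R. This is a minimal sketch that only inspects formula terms; it does not call LEGIT, and it assumes (as the reply describes) that the competing models are built by adding G and/or E terms to formula_noGxE. The variable names y, G, E and cohort are placeholders.

```r
# Base R only: look at which terms a formula expands to.

# A fixed-effects part containing G*E already holds G, E and G:E.
f_with_GxE  <- y ~ G * E
labels_with <- attr(terms(f_with_GxE), "term.labels")
print(labels_with)

# Adding E to that formula introduces no new term, so the "different"
# candidate models would be the same model with the same AIC/BIC.
f_plus_E      <- update(f_with_GxE, . ~ . + E)
labels_plus_E <- attr(terms(f_plus_E), "term.labels")
print(setequal(labels_with, labels_plus_E))

# Starting from an intercept-only fixed part, adding E really changes the model.
labels_intercept  <- attr(terms(y ~ 1), "term.labels")
labels_int_plus_E <- attr(terms(update(y ~ 1, . ~ . + E)), "term.labels")
print(labels_int_plus_E)

# With G*E kept only inside the random-effects term, no G or E main effect
# appears among the top-level terms of the formula.
f_random_only <- y ~ 1 + (0 + G * E | cohort)
labels_random <- attr(terms(f_random_only), "term.labels")
print(labels_random)
```

If the second print shows TRUE, the formula with G*E in the fixed effects cannot distinguish the models that differ only by G or E main effects.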

RobbieRECAP commented 3 days ago

Hi Alex,

Thanks very much for your help, that did the trick! I do have another question, because the results I am getting with the new approach look a bit strange to me. In the original analyses, I could use LEGIT's GxE_interaction_test and then run a normal linear regression and produce simple slopes and Johnson-Neyman plots, and everything matched up well. For example, here is an original analysis that LEGIT thought demonstrated diathesis stress. This matches up well with the linear regression and the resulting Johnson-Neyman plot.

Original analysis using multiple regression: [two images]

New analysis using lme4 (random slopes for G, E, and G:E):

I have specified the random slopes as desired, and GxE_interaction_test now suggests evidence for differential susceptibility, with the crossover at 0.4. I find this very surprising, because when I run a linear mixed model in lme4 with random slopes and then plot the simple slopes and Johnson-Neyman plots, it looks to me like there is no crossover, and so evidence for either E only or diathesis stress. Any ideas on why the conclusions now differ so drastically? Is there anything I should change in the specification of the random effects, or in how GxE_interaction_test decides which model is best supported?

[three images]

Comparison of regression models (left: original regression; right: lme4 with random slopes)

[image]

One final thing I noticed about the new analyses: the two differential susceptibility models (strong and weak) both report that about 60% of observations are below the crossover. I am not sure that is logical, given that one crossover is at 0.4 and the other is at -0.16. When I look at my environmental factor, 0.4 would match up well with 60%, but -0.16 would not.

[image]
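One way to check the reported percentages by hand is to compute the share of environmental scores below each crossover directly. The sketch below uses a simulated standard-normal score as a hypothetical stand-in for the real environmental factor (the actual data are not shown in this thread), and prop_below is an illustrative helper, not a LEGIT function.

```r
set.seed(42)

# Hypothetical stand-in for the standardised environmental score.
E <- rnorm(10000)

# Share of observations strictly below a given crossover point.
prop_below <- function(env, crossover) mean(env < crossover)

p_at_0.40  <- prop_below(E, 0.40)
p_at_m0.16 <- prop_below(E, -0.16)

print(round(c(crossover_0.40 = p_at_0.40, crossover_m0.16 = p_at_m0.16), 3))
```

Unless no observations fall between -0.16 and 0.4, the two shares have to differ, so identical percentages for both models would be worth double-checking against the score each model actually uses.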

AlexiaJM commented 3 days ago

Hi Rob,

It's a bit weird to have G, E, and G:E in the random effects when you have a different parametrization in the fixed effects, but I am not an expert in mixed models, so I can't say for sure whether it's okay. You could try with just G:E and E instead of G*E, but I don't know.

It's also possible that lme4 interferes with things somehow. What you could do in that case is remove the lme4 models that have a non-significant GxE. There are ways to get p-values from these models given the lme4 model fit. Otherwise, I would recommend using BIC, because AIC is far too lenient in allowing extra variables even when they contribute very little.
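The difference in leniency between AIC and BIC can be made concrete with a small base R sketch. It uses ordinary lm fits on simulated data (an assumption made to keep the example dependency-free; it is not the lme4 setup from this thread), with no true interaction in the simulated outcome. The AIC() and BIC() generics also accept fits from lme4.

```r
set.seed(1)
n <- 500

# Simulated data with main effects only (no true G x E interaction).
G <- rnorm(n)
E <- rnorm(n)
y <- 0.3 * G + 0.3 * E + rnorm(n)

m_main <- lm(y ~ G + E)   # main effects only
m_int  <- lm(y ~ G * E)   # adds one extra parameter: G:E

# Change in each criterion when the interaction is added
# (negative means the criterion prefers the larger model).
delta_aic <- AIC(m_int) - AIC(m_main)
delta_bic <- BIC(m_int) - BIC(m_main)

# Per extra parameter, AIC charges 2 and BIC charges log(n).
print(c(aic_penalty = 2, bic_penalty = log(n)))
print(c(delta_aic = delta_aic, delta_bic = delta_bic))
```

With n = 500, BIC charges about 6.2 per extra parameter against 2 for AIC, so a weak interaction has to improve the fit considerably more before BIC selects it.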