jslefche / piecewiseSEM

Piecewise Structural Equation Modeling in R
r in emmeans: no variable named X in the reference grid. #218

Open papavientos opened 3 years ago

papavientos commented 3 years ago

Hi, this is my first post here on GitHub!

I would like to know what the problem with my code is, as I get this error when I try to carry out a path analysis using piecewiseSEM. I am a self-taught student and I am performing a path analysis on a colleague's recommendation. I think it is best to give you some background on my data and on the question I am trying to address with the path analysis.

I study a population of birds in which some individuals fail to get a territory in their first breeding season whereas others succeed. So we can distinguish two reproductive statuses: floaters (those which don't get a place to breed) and breeders. Our question in this work was whether any phenotypic traits are associated with reproductive status, so we fitted a logistic regression model for reproductive status as a function of several phenotypic traits (body condition, ornamentation and behaviour). We ran this logistic regression for adult and nestling measurements, and we found that body condition was positively associated with breeder status: as body condition increases, so does the likelihood of becoming a breeder in the population.

So, a workmate recommended that I perform a path analysis for those individuals for which we have both nestling and adult measurements, to see how this relationship looks for nestling and adult condition once we account for the relationship between nestling condition and adult condition, since the two are correlated to some extent.

However, when I build my psem object with the paths, made up of a first binomial model followed by linear models describing the relationships among the other variables, I get the following error:

q1 <- psem(
  reproductive.status = glm(status ~ adult_condition + nestling_condition + ornamentation + aggressiveness + screaming,
                            family = "binomial", data = data),
  adult.condition = lm(adult_condition ~ nestling_condition, data = data),
  feather.length = lm(ornamentation ~ adult_condition, data = data),
  nestling.condition = lm(nestling_condition ~ birth_date + brood_size, data = data),
  total_pecking %~~% screaming
)

summary(q1)

|====================================================== | 54%
Error in emmeans::emmeans(model, specs = v, data = data) :
  No variable named status in the reference grid

I don't know what this error means, but I checked that my binomial variable (status) was coded as a factor and I can't solve this problem. So, I could really use help.

Thank you!

Iraida.

jslefche commented 3 years ago

Hi Iraida, I will take a look at this ASAP but there have been some issues with compatibility with the emmeans package that I have apparently not figured out 😉 Thanks for your patience


rvlenth commented 3 years ago

Just looking in. I see that emmeans is used to obtain coefficients for psem models, and things look pretty straightforward for that use, assuming appropriate variables are used. The only issue I have been aware of between emmeans and piecewiseSEM has been the one with cld(), but I think that has been fixed. Let me know if there are other issues you experience.

jslefche commented 3 years ago

Hi Iraida, it would be helpful if you can provide an example code that generates that error, OR if you want to email me a snippet of your actual data. That will allow me to see where the issue is arising

bernard-liew commented 3 years ago

Hi,

I am also experiencing this error. I think the issue arises when there is a categorical variable.

library(piecewiseSEM)

# Simulate a two-level factor A, a continuous B, and a response Y
set.seed(3)
n <- 10
p <- 2
A.eff <- c(40, -15)
beta <- -0.45
sigma <- 4
B <- rnorm(n * p, 0, 15)
A <- gl(p, n, lab = paste("Group", LETTERS[1:2]))
mm <- model.matrix(~A + B)
data <- data.frame(A = A, B = B, Y = as.numeric(c(A.eff, beta) %*% t(mm)) + rnorm(n * p, 0, 4))
data$B <- data$B + 20
head(data)

model <- psem(lm(Y ~ A + B, data = data),
              lm(B ~ A, data = data),
              glm(A ~ 1, family = binomial(), data = data))

summary(model)

plot(model)

Many thanks for helping.

Regards, Bernard

jslefche commented 3 years ago

This seems to be an issue with the way the “floating” variable A is coded. I will have to look into it but in the interim, this code will work:


model <- psem (lm (Y ~ A + B, data = data),
lm (B ~ A, data = data),
A ~ 1)

rvlenth commented 3 years ago

FWIW, I traced the summary() call at a point just inside emmeans(), after object is converted to an emmGrid and specs is converted to character.

The emmeans() function is called three times for these three specified models, and the third time I see:

Browse[2]> object
'emmGrid' object with variables:
    1 = 1
Transformation: “logit” 

Browse[2]> specs
[1] "A"

which shows that we have a model with only an intercept as predictor, yet we are asking for EMMs at each level of A. That's why we get that error message.

For the alternative model2 <- psem (lm (Y ~ A + B, data = data), lm (B ~ A, data = data), A ~ 1), emmeans() is only called twice, and it returns a summary; but also a warning about A being categorical.

I doubt if this adds much or any information, but that's how it looks from the emmeans() side.
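
The mismatch described above can be sketched outside piecewiseSEM. This is a minimal illustration, assuming the emmeans package is installed; the toy data frame `d` and the object names `fit` and `res` are made up for the example and are not taken from the thread. It fits an intercept-only binomial model for a factor `A` and then asks emmeans() for means at each level of `A`, which is the combination seen in the trace.

```r
# Toy data (hypothetical): a two-level factor used as the response.
d <- data.frame(A = gl(2, 10, labels = c("Group A", "Group B")))

# Intercept-only model: A is the response, so it is not a predictor
# and does not appear in the reference grid.
fit <- glm(A ~ 1, family = binomial(), data = d)

res <- NULL
if (requireNamespace("emmeans", quietly = TRUE)) {
  # Asking for EMMs over A should fail; capture the message instead of stopping.
  res <- tryCatch(
    emmeans::emmeans(fit, specs = "A"),
    error = function(e) conditionMessage(e)
  )
  print(res)
}
```

If the call errors as expected, `res` holds the error message as a character string rather than an emmGrid object.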

maschmoeller commented 2 years ago

Has anyone found a definite solution for this error? I'm running into the same issue building a SEM with one of the models with a binomial response. Have tried many solutions I found online and none worked so far. The model itself runs perfectly, but when included in the psem function it returns the emmeans error.

A sample of my data: data_sample.csv https://github.com/jslefche/piecewiseSEM/files/9220943/data_sample.csv

The "problematic" model is coded as:

glmer(main_matrix ~ pastnat_1k + water_min + phikcl_500m_2m + mn2k_N_households + tenure_group + (1 | study_id),
      data = sample_data, family = binomial, control = glcontrol, na.action = na.pass)

It runs fine separately, but with the code below it returns an error:

library(lme4)
library(nlme)
library(piecewiseSEM)

lmcontrol <- lmerControl(optCtrl = list(maxeval = 10000))
sample_data$main_matrix <- as.factor(sample_data$main_matrix)
glcontrol <- glmerControl(optimizer = "bobyqa")

unif <- psem(
  lmer(for_rich_final ~ std_effort + Preciptn_seasonlty_1k + elevation_mn_8k + main_matrix + Forest_ca_500m + Forest_np_2k + ngtlg.8k + phikcl_500m_2m +
         (Forest_ca_500m + Preciptn_seasonlty_1k + elevation_mn_8k + ngtlg.8k | study_id),
       data = sample_data, control = lmcontrol),
  lmer(Forest_ca_500m ~ Preciptn_seasonlty_1k + main_matrix + pastnat_1k + fire_YN_2k + tenure_group + water_min + md2k_Income_residents_over_10_yo_w_income +
         (Preciptn_seasonlty_1k + pastnat_1k + tenure_group + water_min | study_id),
       data = sample_data, control = lmcontrol),
  glmer(main_matrix ~ pastnat_1k + water_min + phikcl_500m_2m + mn2k_N_households + tenure_group + (1 | study_id),
        data = sample_data, family = binomial, control = glcontrol, na.action = na.pass)
)

summary(unif)

I've tried simplifying the model, but that made no difference, so I don't believe it has anything to do with complexity. Any help there is much appreciated.

rvlenth commented 2 years ago

@maschmoeller If you still have this problem when you simplify the model, then please do so!!! That is, it would be really, really helpful to provide a reproducible example of the simplest possible model you can identify where you encounter this issue. The variable names alone already make this something I don't want to think about.

papavientos commented 2 years ago

Hi, I remember that what I did was to save the dataset as csv in the same session I was doing the models and then load it again before running the models and it magically worked.

Try that just in case!
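
One possible explanation for why the round trip helped, offered as a guess rather than something confirmed in this thread: if status was stored as a factor with levels 0 and 1, writing it to CSV and reading it back will usually return a plain numeric column, which changes how the variable is handled downstream. A minimal base R sketch with made-up data (the column names and values are hypothetical):

```r
# Toy data (hypothetical): a 0/1 response stored as a factor.
data <- data.frame(
  status = factor(c(0, 1, 1, 0)),
  condition = c(1.2, 2.5, 2.1, 0.9)
)
class(data$status)  # the column starts out as a factor

# Save to CSV and reload, as described in the comment above.
path <- tempfile(fileext = ".csv")
write.csv(data, path, row.names = FALSE)
reloaded <- read.csv(path)

# After the round trip the column is no longer a factor.
class(reloaded$status)
```

Comparing `class()` before and after reloading is a quick way to check whether a variable's type changed between sessions.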

maschmoeller commented 2 years ago

Yes, @rvlenth, you are absolutely right. I also realized my sample data was probably too small for anyone to reproduce my error, so I attach another: data_sample.csv. And here is a simplified version of the script that still runs into the same error (although the progress bar does go further than with the more complex versions):

m1 <- psem(
  lmer(richness ~ matrix + cover + (1 | id), data = data),
  lmer(cover ~ matrix + (1 | id), data = data),
  glmer(matrix ~ past_land + (1 | id), data = data, family = binomial)
)
summary(m1)

maschmoeller commented 2 years ago

Yes, @papavientos, I saw that answer to your post and tried it. I even tried reloading on different computers. No luck! Thanks for the answer!

papavientos commented 2 years ago

I will take a look at it tomorrow and hope to help you :-)

papavientos commented 2 years ago

Hi!

I made a dummy variable from the variable "matrix", with Agro_Pasture == 1 and Other == 0, and then replaced every 'matrix' term with dummy, and it worked perfectly! Let me know if it makes sense to you :-)

MODEL <- psem(
  M1 <- lmer(richness ~ dummy + cover + (1 | id), data = data),
  M2 <- lmer(cover ~ dummy + (1 | id), data = data),
  M3 <- glmer(dummy ~ past_land + (1 | id), data = data, family = binomial)
)
summary(MODEL)
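
The recoding step itself is not shown in the thread. A minimal base R sketch of one way to build such a dummy, using made-up rows (only the level names Agro_Pasture and Other and the column names `matrix` and `dummy` come from the comment above; the values are hypothetical):

```r
# Toy data (hypothetical): `matrix` is a two-level factor.
data <- data.frame(
  matrix = factor(c("Agro_Pasture", "Other", "Agro_Pasture", "Other", "Other")),
  cover = c(0.2, 0.8, 0.3, 0.7, 0.9)
)

# Recode the factor as a numeric 0/1 indicator (Agro_Pasture = 1, Other = 0).
data$dummy <- as.integer(data$matrix == "Agro_Pasture")

# Cross-tabulate to check the recoding against the original factor.
table(data$matrix, data$dummy)
```

A plausible reading, not confirmed by the maintainers here, is that a numeric 0/1 column is no longer treated as a categorical variable, so the factor-specific emmeans step is not triggered for it.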

maschmoeller commented 2 years ago

That worked! Thank you @papavientos !