jslefche / piecewiseSEM

Piecewise Structural Equation Modeling in R
156 stars 49 forks source link

standardized coefficients missing when component model is a glm.nb #232

Closed laurenpetrullo closed 2 years ago

laurenpetrullo commented 3 years ago

Hi! I'm new to the package and loving the functionality. One of my SEMs has a negative binomial component model whose variables are turning up empty in the standardized coefficient column of the summary() and coefs() output. I am not sure if I'm missing something about how to implement this into the psem() function, or if anyone else has come up with an efficient way to calculate standardized betas for glm.nb component models within their overall SEM? Thanks in advance for any help!

Gangrelino commented 2 years ago

Hi @laurenpetrullo did you find a solution to this issue?

Eliot-RUIZ commented 2 years ago

Hi, I just came upon the same issue. I fixed it by setting the argument standardize.type to "Menard.OE"; because the default link function of glm.nb cannot be probit or logit used with the method "latent.linear". I also included the class "negbin" to the function GetSDy(), to run scaleGLM(), instead of setting sd.y to NA.

GetSDy <- function(model, data, standardize = "scale", standardize.type = "Menard.OE") { ##### MODIFIED TO ADD CLASS NEGBIN

  vars <- all.vars.merMod(model)

  y <- vars[1]

  family. <- try(family(model), silent = TRUE)

  if(class(family.) == "try-error") family. <- try(model$family, silent = TRUE)

  if(class(family.) == "try-error" | is.null(family.) & all(class(model) %in% c("sarlm", "gls", "lme")))

    family. <- list(family = "gaussian", link = "identity")

  if(class(family.) == "try-error" | is.null(family.) | any(class(model) %in% c("glmerMod", "glmmPQL")))

    sd.y <- NA else {

      if(family.$family == "gaussian") {

        if(all(standardize == "scale")) sd.y <- sd(data[, y], na.rm = TRUE) else

          if(all(standardize == "range")) sd.y <- diff(range(data[, y], na.rm = TRUE)) else

            if(is.list(standardize)) {

              nm <- which(names(standardize) == y)

              if(sum(nm) == 0) {

                warning(paste0("Relevant range not specified for variable '", y, "'. Using observed range instead"), call. = FALSE)

                sd.y <- diff(range(data[, y], na.rm = TRUE))

              } else sd.y <- diff(range(standardize[[nm]]))

            }

      } else if(family.$family == "binomial" || any(class(model) == "negbin")) ############### CLASS NEGBIN ADDED 

      sd.y <- scaleGLM(model, standardize, standardize.type) else

        sd.y <- NA

    }

  return(sd.y)

}
jslefche commented 2 years ago

Thanks, will have to implement this in future revisions


Jonathan S. Lefcheck, Ph.D. Tennenbaum Coordinating Scientist MarineGEO: https://marinegeo.si.edu/ Smithsonian Institution Phone: +1 (443) 482-2443 www.jonlefcheck.nethttp://www.jonlefcheck.net

From: @.> Sent: Thursday, April 7, 2022 8:10 AM To: @.> Cc: @.***> Subject: Re: [jslefche/piecewiseSEM] standardized coefficients missing when component model is a glm.nb (#232)

External Email - Exercise Caution

Hi, I just came upon the same issue. I fixed it by setting the argument standardize.type to "Menard.OE"; because the default link function of glm.nb cannot be probit or logit used with the method "latent.linear". I also included the class "negbin" to the function GetSDy(), to run scaleGLM(), instead of setting sd.y to NA.

GetSDy <- function(model, data, standardize = "scale", standardize.type = "Menard.OE") { ##### MODIFIED TO ADD CLASS NEGBIN

vars <- all.vars.merMod(model)

y <- vars[1]

family. <- try(family(model), silent = TRUE)

if(class(family.) == "try-error") family. <- try(model$family, silent = TRUE)

if(class(family.) == "try-error" | is.null(family.) & all(class(model) %in% c("sarlm", "gls", "lme")))

family. <- list(family = "gaussian", link = "identity")

if(class(family.) == "try-error" | is.null(family.) | any(class(model) %in% c("glmerMod", "glmmPQL")))

sd.y <- NA else {

  if(family.$family == "gaussian") {

    if(all(standardize == "scale")) sd.y <- sd(data[, y], na.rm = TRUE) else

      if(all(standardize == "range")) sd.y <- diff(range(data[, y], na.rm = TRUE)) else

        if(is.list(standardize)) {

          nm <- which(names(standardize) == y)

          if(sum(nm) == 0) {

            warning(paste0("Relevant range not specified for variable '", y, "'. Using observed range instead"), call. = FALSE)

            sd.y <- diff(range(data[, y], na.rm = TRUE))

          } else sd.y <- diff(range(standardize[[nm]]))

        }

  } else if(family.$family == "binomial" || any(class(model) == "negbin")) ############### CLASS NEGBIN ADDED

  sd.y <- scaleGLM(model, standardize, standardize.type) else

    sd.y <- NA

}

return(sd.y)

}

— Reply to this email directly, view it on GitHubhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjslefche%2FpiecewiseSEM%2Fissues%2F232%23issuecomment-1091655770&data=04%7C01%7Clefcheckj%40si.edu%7C4a4b49aca7384360c46708da188f692a%7C989b5e2a14e44efe93b78cdd5fc5d11c%7C0%7C0%7C637849302245232184%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=1ybk1TfX1Vz%2F8zJ9U13qfbgU69gkK1X5qoTTPsROoKE%3D&reserved=0, or unsubscribehttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAR4AV6BCIQJNCSA7ZK4QMTVD3F6BANCNFSM42RE7NHA&data=04%7C01%7Clefcheckj%40si.edu%7C4a4b49aca7384360c46708da188f692a%7C989b5e2a14e44efe93b78cdd5fc5d11c%7C0%7C0%7C637849302245232184%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=SxZYSLvMvTP1jwkM6YJdH92yVlr%2BCh%2FMjThpsI7%2Byh0%3D&reserved=0. You are receiving this because you are subscribed to this thread.Message ID: @.***>

jslefche commented 2 years ago

Actually just a silly error in matching the name returned from family . Just had to rename to "Negative Binomial" and all should work now!

fezoz commented 2 years ago

Hi,

I was wondering if this bug also applies to Poisson distribution glms or am I missing something? I have a model such as this:

psem(
'model1'<-lm(y1~x1)
'model2'<-lm(y2~y1)
'model3'<-glm(y3~y2+y1,family=poisson(link="log"))
)

Summary then show Std.Estimate for all variables except y3, which is the one with Poisson distribution.

thank you!

ghost commented 2 years ago

Hi, It seems the summary function is not returning the Std.Estimate of models built under the MASS::glm.nb() function. Any idea how to access them ? Thanks !

frannasc commented 1 year ago

Hi, I get the same problem with glm with gamma distribution... psem( 'model1'<-lm(y1~x1) 'model2'<-lm(y2~y1) 'model3'<-glm(y3~y2+y1,family=gamma(link="log")) ) If there anything I can do to solve this?

jslefche commented 1 year ago

Gamma family is not currently supported but may in future updates. Alas, you would have to calculate by hand if required ☹


Jonathan S. Lefcheck, Ph.D. Tennenbaum Coordinating Scientist MarineGEO: https://marinegeo.si.edu/ Smithsonian Institution Phone: +1 (443) 482-2443 www.jonlefcheck.nethttp://www.jonlefcheck.net

From: @.> Sent: Thursday, February 9, 2023 11:36 AM To: @.> Cc: Lefcheck, @.>; State @.> Subject: Re: [jslefche/piecewiseSEM] standardized coefficients missing when component model is a glm.nb (#232)

External Email - Exercise Caution

Hi, I get the same problem with glm with gamma distribution... psem( 'model1'<-lm(y1x1) 'model2'<-lm(y2y1) 'model3'<-glm(y3~y2+y1,family=gamma(link="log")) ) If there anything I can do to solve this?

— Reply to this email directly, view it on GitHubhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjslefche%2FpiecewiseSEM%2Fissues%2F232%23issuecomment-1424481122&data=05%7C01%7Clefcheckj%40si.edu%7C38fcd2f315164b6500ed08db0abbca3e%7C989b5e2a14e44efe93b78cdd5fc5d11c%7C0%7C0%7C638115573887968387%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=FVT9%2BkE7wGAGsuv7xIJoHjdaGlsRjbQjaMLc1y5RppU%3D&reserved=0, or unsubscribehttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAR4AV53S7QGGVMGCQABLW3WWUMIVANCNFSM42RE7NHA&data=05%7C01%7Clefcheckj%40si.edu%7C38fcd2f315164b6500ed08db0abbca3e%7C989b5e2a14e44efe93b78cdd5fc5d11c%7C0%7C0%7C638115573887968387%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=GO3vhOr1Yrc614xrTxmtL5ttiqWlA0XCSR1UcGoeKQM%3D&reserved=0. You are receiving this because you modified the open/close state.Message ID: @.***>

Gangrelino commented 1 year ago

You might be able to put your own distribution families into the function by modifying the function that are called. The vignettes mention which functions indicate which functions need to be changed

On Feb 9, 2023, at 12:32 PM, Jon Lefcheck @.***> wrote:

 Gamma family is not currently supported but may in future updates. Alas, you would have to calculate by hand if required ☹


Jonathan S. Lefcheck, Ph.D. Tennenbaum Coordinating Scientist MarineGEO: https://marinegeo.si.edu/ Smithsonian Institution Phone: +1 (443) 482-2443 www.jonlefcheck.nethttp://www.jonlefcheck.net

From: @.> Sent: Thursday, February 9, 2023 11:36 AM To: @.> Cc: Lefcheck, @.>; State @.> Subject: Re: [jslefche/piecewiseSEM] standardized coefficients missing when component model is a glm.nb (#232)

External Email - Exercise Caution

Hi, I get the same problem with glm with gamma distribution... psem( 'model1'<-lm(y1x1) 'model2'<-lm(y2y1) 'model3'<-glm(y3~y2+y1,family=gamma(link="log")) ) If there anything I can do to solve this?

— Reply to this email directly, view it on GitHubhttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fjslefche%2FpiecewiseSEM%2Fissues%2F232%23issuecomment-1424481122&data=05%7C01%7Clefcheckj%40si.edu%7C38fcd2f315164b6500ed08db0abbca3e%7C989b5e2a14e44efe93b78cdd5fc5d11c%7C0%7C0%7C638115573887968387%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=FVT9%2BkE7wGAGsuv7xIJoHjdaGlsRjbQjaMLc1y5RppU%3D&reserved=0, or unsubscribehttps://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAR4AV53S7QGGVMGCQABLW3WWUMIVANCNFSM42RE7NHA&data=05%7C01%7Clefcheckj%40si.edu%7C38fcd2f315164b6500ed08db0abbca3e%7C989b5e2a14e44efe93b78cdd5fc5d11c%7C0%7C0%7C638115573887968387%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=GO3vhOr1Yrc614xrTxmtL5ttiqWlA0XCSR1UcGoeKQM%3D&reserved=0. You are receiving this because you modified the open/close state.Message ID: @.***>

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.

frannasc commented 1 year ago

Thank you for your quick answers! @Gangrelino I am not sure I understand what you mean. Would it be possible to give an example?

Gangrelino commented 1 year ago

Check out the comments above by @Eliot-RUIZ: https://github.com/jslefche/piecewiseSEM/issues/232#issuecomment-1091655770

Also check this:

https://github.com/jslefche/piecewiseSEM#to-add-a-new-model-class

You will have to modify the internal functions to include the distribution family you want. The issue is that internally piecewiseSEM use a set of functions to return standardized stuff. You could try adding a method for the gamma family; the output should give you the std estimates after u do that.

On Feb 9, 2023, at 1:13 PM, frannasc @.***> wrote:

 Thank you for your quick answers! @Gangrelino I am not sure I understand what you mean. Would it be possible to give an example?

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.

hanlunliu commented 1 year ago

I also got problem of no standardised estimates when I applied a glm, and I found problem isn't about family or method of glm at all. I got it because I subset my data, and the rownames (in number) of my input data aren't in a correct order corresponding to the row numbers, and that created a problem in the "R <- cor(data[as.numeric(names(preds2)), y], preds2)" of the following function. I think I won't be the only one with this case. And it would be quite easy to solve it by adding something like "rownames(data) <- 1:nrow(data)" in this function.

scaleGLM <- function(model, family., link., standardize = "scale", standardize.type = "latent.linear") {

  if(family. != "binomial") standardize.type <- "Menard.OE"

  preds <- predict(model, type = "link")

  if(standardize.type == "latent.linear") {

    if(family. != "binomial") 

      sd.y <- sqrt(var(preds)) else

        if(family. == "binomial") {

          if(link. == "logit") sigmaE <- pi^2/3 else

            if(link. == "probit") sigmaE <- 1

            sd.y <- sqrt(var(preds) + sigmaE) 

        }

  }

  if(standardize.type == "Menard.OE") {

    y <- all.vars_notrans(model)[1]

    data <- GetSingleData(model)

    preds2 <- predict(model, type = "response")

    if(is.null(names(preds2))) names(preds2) <- rownames(data)

    R <- cor(data[as.numeric(names(preds2)), y], preds2)

    sd.y <- sqrt(var(preds)) / R

  }

  if(all(standardize == "range") | is.list(standardize)) sd.y <- sd.y * 6

  return(sd.y)

}