Open hugoeira opened 9 months ago
Hello,
Very good questions--
Yes, the boundary error has to do with the small variance components of the random effects. You could respecify or remove them.
NAs can be problematic if you end up fitting regressions to vastly different subsets of the data. However, this is just a warning and, as you see, the model still provides output. If the missing data are minimal (as in your case), you may feel comfortable ignoring them, or you can impute using some other method (e.g., random forests).
You are correct that you cannot recover a Fisher's C or Chi-squared when the model contains all paths (is "fully saturated"). In this case, you could respecify the model structure to remove paths or rely on other, model-specific indicators of fit, such as R^2s.
HTH,
Jon
Jonathan Lefcheck, Ph.D.
Research Scientist
Integration and Application Network
University of Maryland Center for Environmental Science
www.jonlefcheck.net
From: Hugo Eira. Sent: Tuesday, September 26, 2023 9:20 AM. To: jslefche/piecewiseSEM. Subject: [jslefche/piecewiseSEM] model inplementation Fisher's C = NA (Issue #291)
Hi,
Sorry to bother you with what might be a basic question.
I am implementing an SEM, which seems straightforward to me. Nevertheless, the results are a bit weird.
model_ha <- psem(
lmer(bci ~ shannon + cort + ha + (1|ID), data = metadata),
lmer(ha ~ shannon + cort + (1|ID), data = metadata),
lmer(cort ~ shannon + (1|ID), data = metadata),
data = metadata)
I have two repeated measures and am therefore controlling for individual ID. I have 43 individuals sampled twice (I am guessing the error that I am encountering is from an overfitted model).
summary(model_ha)
First error:
boundary (singular) fit: see help('isSingular')
Warning message: NAs detected in the dataset. Consider removing all rows with NAs to prevent fitting to different subsets of data
This I understand: there is almost no variance coming from the random effects. I also have some missing data in 3 or 4 samples, which I think should be a major problem. Is there a way to tell piecewiseSEM how to deal with missing data?
Second error:
Call:
bci ~ shannon + cort + ha
ha ~ shannon + cort
cort ~ shannon
AIC
648.655
Tests of directed separation:
No independence claims present. Tests of directed separation not possible.
--
Global goodness-of-fit:
Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom
The fact that there are no independence claims is (I am guessing) because I fitted all the possible paths in the model, right?
What I really do not understand is why Fisher's C cannot be calculated. Is it model overfit? Low sample size? Or is it something in the model specification itself?
Thanks in advance. Kind regards, Hugo
Hi, Thanks so much for the quick response.
"If the missing data are minimal (as in your case) you may feel comfortable to ignore, or impute using some other method (eg, random forests)"
Can this be done in piecewiseSEM or do I need to do it before inputting the data?
"You are correct that you cannot recover a Fisher's C or Chi-squared when the model contains all paths (is "fully saturated"). In this case, you could respecify the model structure to remove paths or rely on other, model-specific indicators of fit, such as R^2s"
I did remove some of the paths just to test it out, but the results are the same:
Structural Equation Model of model_ha
Call:
bci ~ shannon + cort + ha
ha ~ shannon + cort
AIC
402.976
---
Tests of directed separation:
No independence claims present. Tests of directed separation not possible.
--
Global goodness-of-fit:
Chi-Squared = 0 with P-value = 1 and on 0 degrees of freedom
Fisher's C = NA with P-value = NA and on 0 degrees of freedom
Individual R-squared:
Response method Marginal Conditional
std_bci_two none 0.18 0.42
std_ha none 0.38 0.38
Any ideas why?
Cheers, Hugo
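One likely reason the reduced model above still reports no independence claims: the only unlinked pair is shannon and cort, and neither of them is a response in any equation. As far as I understand, piecewiseSEM does not test claims between two purely exogenous variables, so the basis set stays empty. The sketch below uses base R only. count_claims is a hypothetical helper written for this illustration (it is not a piecewiseSEM function), and it takes the fixed-effect formulas without the (1|ID) terms.

```r
# Hypothetical helper, not part of piecewiseSEM: count the unlinked variable
# pairs that would give rise to an independence claim.
count_claims <- function(formulas) {
  # the response is the first variable of each formula
  responses <- vapply(formulas, function(f) all.vars(f)[1], character(1))
  vars <- unique(unlist(lapply(formulas, all.vars)))

  # every directed path, stored as an unordered "a|b" key
  linked <- unlist(lapply(formulas, function(f) {
    v <- all.vars(f)
    vapply(v[-1], function(p) paste(sort(c(v[1], p)), collapse = "|"),
           character(1))
  }))

  # all variable pairs that have no path between them
  pairs <- combn(vars, 2, simplify = FALSE)
  unlinked <- Filter(
    function(p) !(paste(sort(p), collapse = "|") %in% linked), pairs)

  # assumption: pairs of purely exogenous variables are not tested
  testable <- Filter(function(p) any(p %in% responses), unlinked)
  length(testable)
}

saturated <- list(bci ~ shannon + cort + ha, ha ~ shannon + cort, cort ~ shannon)
two_eq    <- list(bci ~ shannon + cort + ha, ha ~ shannon + cort)
reordered <- list(bci ~ shannon + cort + ha, ha ~ cort, shannon ~ ha)

n_saturated <- count_claims(saturated)  # every pair is linked
n_two_eq    <- count_claims(two_eq)     # shannon and cort are both exogenous
n_reordered <- count_claims(reordered)  # shannon is now a response
```

Under these assumptions, only a model in which shannon or cort appears as a response leaves a testable claim. In piecewiseSEM itself, basisSet(model_ha) should list the claims directly.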
Hi Hugo,
piecewiseSEM does not perform the imputation, but there are other tools that can assist (e.g., rfImpute in the randomForest package).
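If imputation is not needed, the simplest way to follow the warning's advice is to drop incomplete rows before fitting, so that every component model sees the same subset. A minimal base R sketch follows. metadata_toy and its values are made up for illustration and only stand in for the real metadata.

```r
# Toy data standing in for `metadata`; the values are invented.
metadata_toy <- data.frame(
  ID      = rep(1:4, each = 2),
  bci     = c(1.2, NA, 0.8, 1.1, 0.9, 1.3, 1.0, 0.7),
  shannon = c(2.1, 2.3, NA, 1.9, 2.0, 2.2, 2.4, 1.8),
  cort    = c(5.1, 4.8, 5.5, 5.0, NA, 4.9, 5.2, 5.3),
  ha      = c(0.3, 0.4, 0.2, 0.5, 0.3, 0.6, 0.4, 0.2)
)

# Keep only rows that are complete for every variable used in the SEM.
model_vars  <- c("ID", "bci", "shannon", "cort", "ha")
metadata_cc <- metadata_toy[complete.cases(metadata_toy[, model_vars]), ]

# Number of rows dropped because of missing values.
n_dropped <- nrow(metadata_toy) - nrow(metadata_cc)
```

The complete-case data frame would then be passed as the data argument to each lmer() call and to psem().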
Hmm, I am not sure why you do not get a goodness-of-fit statistic after removing paths, but if you share your code I can take a look.
Cheers,
Jon
Hi Jonathan,
Maybe this can be a way to work around it.
Since piecewiseSEM does not compute Fisher's C for saturated models, I removed one of the paths:
model_ha <- psem(
lmer( bci ~ shannon + cort + ha + (1|ID), data = metadata),
lmer(ha ~ cort + ( 1|ID), data = metadata),
lmer(shannon ~ ha + (1|ID), data = metadata),
data= metadata)
summary(model_ha)
Structural Equation Model of model_ha
Call:
bci ~ shannon + cort + std_ha
ha ~ cort
shannon ~ ha
AIC
583.891
---
Tests of directed separation:
Independ.Claim Test.Type DF Crit.Value P.Value
shannon ~ cort + ... coef 74.6271 0.0225 0.8813
--
Global goodness-of-fit:
Chi-Squared = 2.137 with P-value = 0.144 and on 1 degrees of freedom
Fisher's C = 0.253 with P-value = 0.881 and on 2 degrees of freedom
Now the significance of the missing path is calculated in the test of directed separation, and Fisher's C can be calculated as well.
Is this an option? For visualization, can I then just add the missing path to the diagram?
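For reference, Fisher's C combines the P-values of the k independence claims as C = -2 * sum(log(p)) and is compared to a chi-squared distribution with 2k degrees of freedom. A minimal base R sketch using the single P-value from the test of directed separation above (the variable names are illustrative):

```r
# P-values from the tests of directed separation (one claim in this model).
p_values <- c(0.8813)

# Fisher's C and its degrees of freedom (two per independence claim).
fisher_c <- -2 * sum(log(p_values))
c_df     <- 2 * length(p_values)

# Upper-tail chi-squared probability for the observed C.
c_pvalue <- pchisq(fisher_c, df = c_df, lower.tail = FALSE)

round(c(C = fisher_c, df = c_df, P = c_pvalue), 3)
```

This reproduces the C of 0.253 and P-value of 0.881 on 2 degrees of freedom reported in the summary. With no independence claims there are no P-values to combine, which is why the saturated model reports NA.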
I also found this cool R package, "semEff", which allows bootstrapping the calculation of effects for structural equation models:
system.time(
model_ha_boot <- bootEff(model_ha, R = 1000, seed = 123, ran.eff = "ID",parallel = "multicore"))
model_ha_eff <- semEff(model_ha_boot)
summary(model_ha_eff)
SEM direct, summed indirect, total, and mediator effects:
bci (1/3):
Effect Bias Std. Err. Lower CI Upper CI
------ ------ --------- -------- --------
DIRECT cort | -0.375 | 0.049 | 0.091 | -0.578 -0.251 | *
ha | -0.168 | 0.039 | 0.099 | -0.368 -0.004 | *
shannon | 0.189 | 0.009 | 0.089 | 0.011 0.354 | *
INDIRECT cort | 0.120 | -0.014 | 0.073 | -0.002 0.294 |
cort | -0.025 | 0.000 | 0.035 | -0.107 0.038 |
TOTAL cort | -0.255 | 0.035 | 0.068 | -0.478 -0.169 | *
ha | -0.193 | 0.038 | 0.105 | -0.451 -0.021 | *
shannon | 0.189 | 0.009 | 0.089 | 0.011 0.354 | *
MEDIATORS ha | 0.120 | -0.014 | 0.073 | -0.002 0.294 |
shannon | -0.009 | 0.001 | 0.011 | -0.042 0.006 |
ha (2/3):
Effect Bias Std. Err. Lower CI Upper CI
------ ------ --------- -------- --------
DIRECT cort | -0.619 | -0.065 | 0.092 | -0.726 -0.358 | *
INDIRECT n/a | - | - | - | - - |
TOTAL cort | -0.619 | -0.065 | 0.092 | -0.726 -0.358 | *
MEDIATORS n/a | - | - | - | - - |
shannon.entropy (3/3):
Effect Bias Std. Err. Lower CI Upper CI
------ ------ --------- -------- --------
DIRECT ha | -0.131 | -0.003 | 0.158 | -0.402 0.238 |
INDIRECT cort | 0.081 | 0.009 | 0.109 | -0.189 0.258 |
TOTAL cort | 0.081 | 0.009 | 0.109 | -0.189 0.258 |
ha | -0.131 | -0.003 | 0.158 | -0.402 0.238 |
MEDIATORS ha | 0.081 | 0.009 | 0.109 | -0.189 0.258 |
This also allows me to estimate the effect of the missing path (the total effect of cort on shannon):
shannon (3/3):
TOTAL cort | 0.081 | 0.009 | 0.109 | -0.189 0.258 |
I can also do a different model with a different missing path to double check the results (which are the same).
Is this a valid approach?
Cheers, Hugo
Hi Hugo, removing the path will free up information to calculate Fisher's C (and Chi-squared), but this reflects the goodness-of-fit of the reduced model. It would not be appropriate to assign that value to a graph with that path included. If the path you removed is theoretically not relevant to your questions, you can consider keeping that path removed (including in the path diagram) and report the output of the reduced model.
HTH,
Jon
Hi Jonathan, thanks so much for guiding me through this. I have been stuck on this for a couple of months now :(
Indeed, the fit measures reflect the reduced model; I wasn't thinking about this.
The most relevant model is the saturated model, and I can't find a proper reason to exclude any of the paths. Can I present the results without a fit measure?
"or rely on other, model-specific indicators of fit, such as R^2s" You mentioned this in one of your previous answers. Do you mean computing confidence intervals for each of the R^2s and presenting those as support for the model?
Really sorry about the extensive questions.
Kind regards, Hugo
Hi Hugo, it's perfectly fine to present the saturated model. If you do, you can acknowledge that you have no degrees of freedom left over with which to calculate goodness-of-fit indices.
In that case, yes, I would rely on the strength and significance of individual pathways (examine the standard errors and P-values) and the variance explained (R^2) to build a qualitative argument for why the entirety of the path diagram adequately captures correlations indicated in the data.
HTH,
Jon
Hi Jonathan,
Thanks so much for such detailed feedback, this was really helpful.
I have been struggling to figure out how, or if, I can support my results.
By the way, I found this pre-print that tests and summarizes different scenarios of structural equation modeling and how to assess model fit when fit measures cannot be computed or are not informative.
This diagram sums it up:
Link to the pre-print: https://arxiv.org/abs/1803.06186
Thanks so much again.
Kind regards, Hugo