Open Gootjes opened 1 year ago
What a nice idea! I would make one top-down suggestion at this point, before getting into the details of the code, and that is to try to follow the tidySEM
design philosophy as closely as possible. So the bifactor()
function should work similarly to the measurement()
function, and there really shouldn't be a run_bifactor()
function, especially not one that only works with lavaan.
Smaller notes:
"G"
add_paths()
no?compute_omega
function generic, so it works with arbitrary sem models and not just with lavaan. The way to do this is to have a function compute_omega.default()
that accepts only exported coefficients, and to have functions like compute_omega.lavaan()
grab those coefficients from lavaan output and then call the default. Also: rename to omega()
?I have removed run_bifactor()
And created a omega.lavaan
and omega.default
. The latter requires two arguments that are completely different classes (parameter table of class data.frame
and covariance matrix of class matrix
) than those of omega.lavaan
(lavaan
fit object). Is this the way to go? Or should the .default
implementation call the data.frame
implementation?
bifactor
and omega()
.psych
. The syntax to add a cross-loading is a mistake-prone for an end-user because the defaults of add_path
are not what is needed, so it needs custom defaults.
The current situation looks like the code below. As a solution, should I give the result of bifactor
an custom class tidy_sem_bifactor
such that I can implement a custom add_paths
?
bf %>% add_paths("a =~ c_2", "", std.lv = T,
auto.fix.single = T,
auto.fix.first = F,
orthogonal = T,
int.ov.free = F,
int.lv.free = F,
meanstructure = F) %>% run_lavaan() -> bf_fit2
The syntax to add a cross-loading is a mistake-prone for an end-user because the defaults of add_path are not what is needed, so it needs custom defaults. The current situation looks like the code below. As a solution, should I give the result of bifactor an custom class tidy_sem_bifactor such that I can implement a custom add_paths ?
bf %>% add_paths("a =~ c_2", "", std.lv = T, auto.fix.single = T, auto.fix.first = F, orthogonal = T, int.ov.free = F, int.lv.free = F, meanstructure = F) %>% run_lavaan() -> bf_fit2
Not sure what the problem is... are you trying to prevent model non-identification by fixing variances of indicators that load on too many factors?
The issue is that without any parameters to add_paths, it makes the cross-loading fixed at value 1, and estimates an intercept. I don't want either: latent variables have fixed variance 1, and no intercepts should be estimated (fixed to 0). But what happens is that the implementation of add_paths adds them again to the model for the variable mentioned in the arguments.
tidy_sem(c("a_1", "a_2", "a_3", "b_1", "b_2", "b_3", "c_1", "c_2", "c_3", "c_4")) %>%
bifactor() %>%
# This adds a cross-loading
add_paths("a =~ c_4") %>%
as_lavaan() %>%
filter(lhs == "c_4" | rhs == "c_4")
# lhs op rhs block group free label ustart plabel
# 1 G =~ c_4 1 1 1 NA .p10.
# 2 c =~ c_4 1 1 1 NA .p20.
# 3 c_4 ~~ c_4 1 1 1 NA .p30.
# 4 c_4 ~1 1 1 0 0 .p50.
# 5 a =~ c_4 1 1 0 1 .p55.
The options to add_paths
that solve this are int.ov.free = F, auto.fix.first = F
. But I find it kind of a hassle that respecifying this is necessary. Maybe add_paths
should reuse the options that were specified earlier in the call to measurement
?
An alternative solution is to make bifactor
add a class to the tidy_sem
object, e.g. tidy_sem_bifactor
, and then:
add_paths.tidy_sem_bifactor <- function() {
cl <- match.call()
cl["int.ov.free"] <- FALSE
cl["auto.fix.first"] <- FALSE
# Copy pasted from add_paths.tidy_sem
cl["model"] <- list(model$syntax)
# I have to comment this line though otherwise R complains
# about a missing implementation for group_var for the tidy_sem_bifactor class.
# I don't need it anyways.
# if(!is.null(group_var(model))) cl[["ngroups"]] <- group_var(model, "ngroups")
cl[[1L]] <- quote(add_paths)
#Args <- c(list(model = model$syntax), as.list(match.call()[-c(1:2)]))
model$syntax <- eval.parent(cl)
return(model)
}
Maybe I am using measurement
not what it was intended for, but I expect this issue to crop up again when someone uses add_paths
.
Should I maybe change bifactor
to include arguments for cross-loadings? Then calling add_paths
of a bifactor tidy_sem is not necessary anymore: everything is set from the get go. But then I need to copy quite some code from what is already there for CFA.
Semi-related point: I can tweak tidy_sem$syntax to include an equality constraint by specifying .p1. == .p2.
, but this does not translate when using as_ram
or as_mplus
. Is there already something in place to do equality constraints in cfa models?
As per standard lavaan syntax, to specify equality constraints you could use:
F=~a*x1
F=~a*x2
I need to plan in some time to look into the issue you're experiencing with add_paths, can't speak to that now. The risk of implicit defaults (as in your add_paths.tidy_sem_bifactor solution) is that, as soon as someone has a different use case than you had in mind, they break down (ironically, this might be happening to you; I have to investigate to see). I agree that manually specifying these arguments to add_paths is awkward, but at least it's explicit!
PS The .p1. labels are, to my understanding, intended as lavaan internals.
As per standard lavaan syntax, to specify equality constraints you could use:
F=~a*x1 F=~a*x2
I need to plan in some time to look into the issue you're experiencing with add_paths, can't speak to that now. The risk of implicit defaults (as in your add_paths.tidy_sem_bifactor solution) is that, as soon as someone has a different use case than you had in mind, they break down (ironically, this might be happening to you; I have to investigate to see). I agree that manually specifying these arguments to add_paths is awkward, but at least it's explicit!
Haha yes it is happening to me. Yea so reimplementing some of the logic in measurement
to make it apply to the bifactor situation might be the best approach then. Accidental misuse is far less likely, and bifactor models are so well defined that customisation options for the end user are not a priority.
Regarding equality constraints. Thanks for the quick reply, I was trying to do it all manually by adding rows to the syntax data.frame. I understand now that I can do:
add_paths(tidy_sem(x), "F=~a*x_1; F=~a*x_2; F=~a*x_3")
And that runs with mx and lavaan. Nice! That helps. I was trying to do add_paths
on a tidy_sem with existing syntax (from measurement
), but that doesn't produce valid syntax. So another reason to rewrite bifactor to do it all my way from the get go.
There is no need right now for you to invest time in reading this all. I will redesign the bifactor a little based on code you wrote for measurement
and then rely heavily on add_paths
to do the actual magic.
Tests already work, I can reproduce results from the psych::omegaFromSem
function based on psych
datasets. (Turns out that function has a slightly odd way of doing things regarding cross loadings and also mixes unstandardized loadings with correlation matrices to compute variances...).
So after this is done and I factor out tidyverse usage, I will press the request review button :)
That sounds good, thank you for being proactive and understanding! Don't forget to add yourself as "ctb" to the authors@R field in the DESCRIPTION file too.
I see you're making a commit here as well.. would you like me to incorporate this into 0.2.7 as well? If so: it still needs tests for the new functions, and I see that it's failing (for some reason) the integration tests.
Would you please make sure that these functions align as closely as possible to the interface of functions like add_paths()
and measurement()
?
The implementation is not entirely complete because I ran into this unsolved issue #74 Let's not merge this yet.
The mentioned issue is preventing me from imposing equality constraints in the case when a latent variable has only two indicators.
I can't get into this until after end March due to time constraints!
No problem!
This PR is a Proof of Concept. I would like to receive feedback on how to better integrate this with the tidySEM framework.
EDIT: finished