Open mattansb opened 1 year ago
Hi @rvlenth,
Thanks for your help on the issue of dealing with missing data. I have taken your advice and solved this by allowing the user to pass a data= argument with non-missing data, which I deal with in recover_data.lavaan().
For ref_grid() and emmeans() this seems to work fine.
However, for emtrends() I am getting estimation problems. After some debugging, I've found that when emtrends() is called, it calls recover_data() twice, but only passes the user's data= argument the first time. I'm assuming this is not intentional?
Thanks!
# remotes::install_github("mattansb/semTools") # install this PR
library(semTools)
library(emmeans)
data("mtcars")
raw_mtcars <- mtcars
mtcars$hp[1] <- NA
model <- " mpg ~ hp + drat + hp:drat "
fit <- sem(model, mtcars, missing = "fiml.x")
(rg <- ref_grid(fit,
                lavaan.DV = "mpg",
                data = raw_mtcars))
#> 'emmGrid' object with variables:
#> hp = 146.69
#> drat = 3.5966
rg@linfct
#> (Intercept) hp drat hp:drat
#> 1 1 146.6875 3.596563 527.5708
(emM <- emmeans(fit, ~ drat, var = "hp",
                lavaan.DV = "mpg",
                data = raw_mtcars))
#> drat emmean SE df asymp.LCL asymp.UCL
#> 3.6 20 0.614 Inf 18.8 21.2
#>
#> Confidence level used: 0.95
emM@linfct
#> (Intercept) hp drat hp:drat
#> [1,] 1 146.6875 3.596563 527.5708
(emT <- emtrends(fit, ~ drat, var = "hp",
                 lavaan.DV = "mpg",
                 data = raw_mtcars))
#> drat hp.trend SE df asymp.LCL asymp.UCL
#> 3.6 nonEst NA NA NA NA
#>
#> Confidence level used: 0.95
emT@linfct
#> (Intercept) hp drat hp:drat
#> [1,] 0 NA 0 NA
I'm not at all sure that it isn't intentional. The first call to ref_grid() includes a hook to return the data, so that we can set up the difference quotients. The second time we call it, we put another hook that bypasses some stuff already done in the first call. I'll have to look at it to see if we need the data the second time.
I think it is right the way it is. The setup for the first call to ref_grid() includes this code:
rgargs = list(object = object, ...)
. . .
data = do.call("ref_grid", c(rgargs))
So if data is included in the ... in the emtrends() call, it gets passed to ref_grid(). As you can see, the purpose of that first call is to retrieve the data (via a special hook included in rgargs).
The second call to ref_grid() is
bigRG = do.call("ref_grid", c(rgargs, data = data))
where data is the data already retrieved in the first call.
So actually I'm confused by your statement that data is passed the first time and not the second, because what we actually have is data being explicitly passed the second time, and only implicitly passed the first time.
OK, my bad! It turns out that if rgargs is a list and data is a data frame with variables x and y, then c(rgargs, data = data) is a list with additional elements data.x and data.y. So I put in an additional line of code to add data itself to the list, and confirmed in debug mode that the right stuff is being passed. You can install from GitHub and see if it works right now.
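For anyone following along, the c() behavior described above can be reproduced in base R. This is a minimal sketch, not the actual emmeans code: the contents of rgargs and data here are hypothetical stand-ins, and the two workarounds shown are just common ways to keep a data frame intact, not necessarily the line that was added to the package.

```r
# Hypothetical stand-ins for the objects discussed above
rgargs <- list(object = "fit", lavaan.DV = "mpg")
data <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))

# c() treats the data frame as a list and splices its columns in,
# prefixing each column name with the argument name
flattened <- c(rgargs, data = data)
names(flattened)
#> [1] "object"    "lavaan.DV" "data.x"    "data.y"

# Wrapping the data frame in list() keeps it as a single element
wrapped <- c(rgargs, list(data = data))
names(wrapped)
#> [1] "object"    "lavaan.DV" "data"

# Assigning by name has the same effect
assigned <- rgargs
assigned$data <- data
is.data.frame(assigned$data)
#> [1] TRUE
```

With the flattened version, do.call() would receive data.x and data.y as separate arguments and no data argument at all, which matches the symptom reported at the top of the thread.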
Hey, this almost fixes the issue. I now get a new error:
(emT <- emtrends(fit, ~ drat, var = "hp",
                 lavaan.DV = "mpg",
                 data = raw_mtcars))
#> Error in lav_data_full(data = data, group = group, cluster = cluster, :
#> lavaan ERROR: some (observed) variables specified in the model are not found in the dataset: mpg
This is because the data being passed to recover_data() the second time only has the data for the predictors (from the first pass of recover_data()), but lavaan needs the full multivariate/multivariable dataset.
Can we not simply pass the original data= argument the second time as well?
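The missing response column described above can be illustrated with base R alone. This is only an analogy under the assumption that the recovered data is built from the predictor terms; it does not call emmeans or lavaan, and the variable names predictor_terms and predictors_only are made up for the example.

```r
# Terms for the same model with the response dropped
predictor_terms <- delete.response(terms(mpg ~ hp + drat + hp:drat))

# A model frame built from those terms holds only the predictors
predictors_only <- model.frame(predictor_terms, data = mtcars)
names(predictors_only)
#> [1] "hp"   "drat"

# The response is absent, so anything that refits on this data cannot find it
"mpg" %in% names(predictors_only)
#> [1] FALSE
```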
You can use the addl.vars argument, e.g., addl.vars = "mpg"
By the way, in your emmeans support code for lavaan, since you need the response variable, I recommend you retrieve its name from the response part of the model formula, and include that as addl.vars in the call to recover_data(). Then you won't have to rely on the user providing that in their call. See the help page for emmeans::recover_data.
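A rough sketch of pulling the response name out of a formula, using only base R. It assumes the simplest case from this thread, a single regression whose lavaan syntax also happens to be a valid R formula; real lavaan models (multiple equations, =~ and ~~ operators) would need a different approach, and model_formula and response are illustrative names rather than anything in semTools or emmeans.

```r
# Assumption: one regression equation that parses as an R formula
model <- " mpg ~ hp + drat + hp:drat "
model_formula <- as.formula(model)

# The left-hand side of a two-sided formula is its second element
response <- all.vars(model_formula[[2]])
response
#> [1] "mpg"
```

The resulting character vector is the kind of value that could then be supplied as addl.vars, so the user would not have to name the response themselves.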
Any update on this issue? Has this been added to simsem?
@patc3 No additional updates from me (emmeans) since my last comment. My repairs to recover_data are in the latest CRAN version and AFAIK, the additional notes (e.g., using addl.vars) will provide access to all the needed variables.
Sorry @patc3 - I haven't found the time to get back to this just yet.
This is a WIP