simsem / semTools

Useful tools for structural equation modeling
74 stars 35 forks source link

Patch auxiliary.R #107

Closed bkeller2 closed 2 years ago

bkeller2 commented 2 years ago

Patches auxiliary.R to work when the lavaan package or semTools package has not been loaded via a library function call.

Currently, if lavaan's namespace isn't loaded, then do.call will throw an error. This prevents being able to do calls like semTools::auxiliary()

TDJorgensen commented 2 years ago

What is the context for "needing" this change? It seems to enable poor R behavior, like lazy loading. If you are using these packages, they should be loaded. There is a reason for the NAMESPACE in specifying dependencies. If you are a package developer, your NAMESPACE can specify to import auxiliary from semTools, so your package will have the proper dependency structure.

Is something else going on, like conditionally passing the name auxiliary (or some other function) to do.call, so it cannot be explicitly passed with the :: operator?

bkeller2 commented 2 years ago

Here is a minimal working example (bkeller2/semTools-Example-Package) of importing sem.auxilliary from semTools like your suggestion.

When calling the single hello() function, an error occurs:

> library(package)
> hello()
Error in sem(data = list(x = c(-2.12494168273042, -0.478223409376106,  : 
  could not find function "sem"

The hello function source code:

# Minimal Working Example
#' @export
hello <- function() {
    dat <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))
    model <- c("x ~ y")
    sem.auxiliary(model, dat, aux = c('z'))
}

Maybe I am missing something or did not set up the package correctly, but I don't believe your suggestion works. With that being said, there might be a better fix than the one I have proposed.

TDJorgensen commented 2 years ago

Sorry I lost track of this. Closing the issue brought it back to my attention.

In your working example, the package package would need to include semTools in its dependency structure (e.g., among Imports if the package package would not work without semTools, or among Suggests if the hello() function is just an optional feature it provides). If it is only under Suggests, then hello() should check whether semTools is loaded before doing anything else. For example, the way runMI() checks for Amelia or mice when necessary: https://github.com/simsem/semTools/blob/master/semTools/R/runMI.R#L210 Then, if a suggested packages is not installed, an error will be issued saying so; but an installed package will be loaded when appropriate.

bkeller2 commented 2 years ago

That is the problem. Placing semTools on Imports is not enough. To have sem.auxiliary working, I am required to place semTools under Depends. I was trying to avoid having to place the entire semTools package under Depends. The alternative is to place lavaan under Depends instead, and I was just suggesting a way around this.

TDJorgensen commented 2 years ago

Placing semTools on Imports is not enough

Rather than placing it on Imports (which would require semTools and therefore lavaan to be installed when your package is), you can place it instead on Suggests (which does not install semTools + dependencies when your package is installed). I recommended this because it sounds like using semTools::auxiliary() is just an optional feature, not necessary for your package to work: https://r-pkgs.org/description.html#description-dependencies

See also https://r-pkgs.org/namespace.html#search-path about differences between Depends, Imports, and Suggests.

I might have more helpful advice if you show me your actual source code, but based on your working example, here is what I meant when I linked you to my own code that checks whether an optional package is attached whenever that optional functionality is called. If it is necessary to attach the optional package, then it will be attached along with dependencies. In this case, when semTools::auxiliary() is required, it will be attached along with lavaan, avoiding your issue.

hello <- function() {
    dat <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))
    model <- c("x ~ y")

    ## check whether the optional package is installed
    requireNamespace("semTools") # returns informative error when not installed
    ## attach the optional package (also attaches dependencies: lavaan)
    if (!"package:semTools" %in% search()) attachNamespace("semTools")

    ## Now this should work fine.  Use the :: operator so you
    ## don't have to place sem.auxiliary() in the NAMESPACE file.
    semTools::sem.auxiliary(model, dat, aux = c('z'))
}
bkeller2 commented 2 years ago

So I placed it under Suggests and installed semTools package, but the suggested code still produces the same error above.

> library(semToolTest); hello()
Error in sem(data = list(x = c(0.358571223545468, -0.617445794994912,  : 
  could not find function "sem"

Here is the package building environment used: https://github.com/bkeller2/semToolTest

It does not attach lavaan's namespace, so lavaan::sem is not available when expected in the semTools::auxiliary function. To verify this I printed search before and after the line you suggest in the hello function:

BEFORE:
.GlobalEnv
package:semToolTest
tools:rstudio
package:stats
package:graphics
package:grDevices
package:datasets
renv:shims
package:utils
package:methods
Autoloads
package:base

###############################################################################
This is semTools 0.5-5
All users of R (or SEM) are invited to submit functions or ideas for functions.
###############################################################################
AFTER:
.GlobalEnv
package:semTools
package:semToolTest
tools:rstudio
package:stats
package:graphics
package:grDevices
package:datasets
renv:shims
package:utils
package:methods
Autoloads
package:base

The code correctly loads in semTools but not its dependencies (e.g., lavaan).

I appreciate the offer of help, the solution was to make sure to have lavaan already loaded; I was just suggesting that it would be better to have semTools handle this directly. Because of how the auxiliary function is written, it appears that R isn't recognizing the dependency.

TDJorgensen commented 2 years ago

The code correctly loads in semTools but not its dependencies

Oh, that's annoying, and kind of flies in the face of what I thought I understood about the difference between loading and attaching a namespace. Well, that complicates having a second-order dependency via a suggested package. One obvious "quick" solution on your end would be to conditionally attach both:

if (!"package:semTools" %in% search()) attachNamespace("semTools")
if (!"package:lavaan"   %in% search()) attachNamespace("lavaan")

I'm uncomfortable using the lavaan() function as an environment in your patch, but I've also tried to find ways to specify a package's namespace as an environment for do.call(). This time I was able to find a couple solutions, starting with some posted here: https://stackoverflow.com/questions/38983179/do-call-a-function-in-r-without-loading-the-package

## get the function from the package, without requiring it to be in search()
getFromNamespace("cfa", "lavaan") # useful to build a call() to eval()
# or use the package's namespace as an environment
do.call(..., envir = getNamespace("lavaan"))

I just implemented the second approach in auxiliary(), which should resolve your problem (https://github.com/simsem/semTools/commit/8f7bb2032351dba45f899c81a1333e8c5b5d7c56). When I installed the development version to check, along with your semToolTest repository, your hello() example worked fine. But feel free to check on your actual software. If this seems like a good solution, I will add this to do.call throughout the package whenever calling functions not in base.

bkeller2 commented 2 years ago

Thanks!

I agree that my solution of looking for the lavaan function wasn't ideal, and this is a much better solution. I tried to find something like your suggestion, but I couldn't.

My only comment would be if you wanted to set up auxiliary() to have env as an argument. That way, it could easily be extended to use something like the blavaan package if desired.

TDJorgensen commented 2 years ago

to have env as an argument. That way, it could easily be extended to use something like the blavaan package if desired

That's a great idea! I will still only have wrappers for the lavaan functions, but perhaps in the future it can enable something like auxiliary(..., fun="bcfa", envir = getNamespace("blavaan")). It will not work currently because a parTable for lavaan is created internally, which lacks the additional columns for blavaan (e.g., priors). Not sure how complicated it would be to update it accordingly, but I will put it on my TODO list.

In the meantime, I also added a return.syntax=FALSE argument, which can be set TRUE in order to return the character string specifying the saturated-correlates part of the model. That can be added to model syntax and passed to blavaan.