rbchan / unmarked

R package for hierarchical models in ecological research
https://rbchan.github.io/unmarked/

model.frame will look for covariates in global environment if they aren't in the covariate data frame #259

Open kenkellner opened 11 months ago

kenkellner commented 11 months ago
a <- rnorm(10)                               # lives in the global environment only
dat <- data.frame(b=rnorm(10), c=rnorm(10))  # note: no column named a
x <- model.frame(~a+b+c, data=dat)           # succeeds, and x contains a

a is not in the data frame passed to model.frame, but annoyingly R will go find it in the formula's environment (here, the global environment) and include it in the result anyway.

This causes problems because users might save a covariate to the global environment, then add it to obsCovs or yearlySiteCovs, and then try to reference it in a formula for, say, lambda. Instead of an error indicating the covariate is not available in siteCovs, R will find it in the global environment. If the lengths don't match you get a dimension error; if they happen to match, it works, silently giving you potentially nonsense results.
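To make the two outcomes concrete, here is a small sketch in base R (no unmarked objects involved; the object names are just for illustration). It shows one case where a stray object is silently picked up and one where it triggers an error:

```r
set.seed(1)
dat <- data.frame(b = rnorm(10), c = rnorm(10))

# Case 1: the stray object has the same length as the data,
# so model.frame raises no error and quietly includes it.
a <- rnorm(10)
mf_ok <- model.frame(~ a + b + c, data = dat)
print(names(mf_ok))  # includes "a" even though dat has no such column

# Case 2: the stray object has a different length, so model.frame fails.
a <- rnorm(5)
err <- tryCatch(
  model.frame(~ a + b + c, data = dat),
  error = function(e) conditionMessage(e)
)
print(err)
```

Case 1 is the dangerous one: nothing signals that a came from outside the data frame.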

There is no way to force model.frame not to do this, so we'd have to either control the formula's environment explicitly or just insert a manual check before the call to model.frame to make sure all elements of the formula exist in the data frame, e.g.:

attr(terms(~a+b+c), "term.labels") %in% names(siteCovs)

This is quite annoying though because we have to change it in every one of the many places we use model.frame.
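One way to limit the repetition might be to wrap the check in a single helper and call it before each model.frame. The sketch below is only an illustration: check_formula_vars is a hypothetical name, not an existing unmarked function. It uses all.vars rather than term.labels, because term.labels also contains entries like "a:b" or "scale(a)" that would never match a column name. The trade-off is that all.vars returns every symbol in the formula, so a constant such as pi or a variable passed as a function argument would also be flagged.

```r
# Hypothetical helper (not part of unmarked): stop early if any variable
# used in the formula is missing from the covariate data frame.
check_formula_vars <- function(formula, data) {
  vars <- all.vars(formula)
  missing_vars <- setdiff(vars, names(data))
  if (length(missing_vars) > 0) {
    stop("Covariate(s) not found in data: ",
         paste(missing_vars, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

dat <- data.frame(b = rnorm(10), c = rnorm(10))
a <- rnorm(10)  # exists only outside the data frame

# Passes: every variable is a column of dat.
check_formula_vars(~ b + c, dat)

# Fails with an informative message instead of silently using the outside a.
res <- tryCatch(
  check_formula_vars(~ a + b + c, dat),
  error = function(e) conditionMessage(e)
)
print(res)
```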

rbchan commented 11 months ago

This is a problem with R's scoping rules. I don't think it's something we should try to fix. Users of R just need to understand the issue.
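For anyone wanting to see the scoping rule in action, here is a base R sketch. A formula carries the environment it was created in, and model.frame falls back to that environment for variables it cannot find in data. Re-pointing the formula at baseenv() is shown purely as an illustration of the "control the environment" option, not as a recommended fix: it also hides any non-base functions (for example from stats, or user-defined ones) that a formula might call.

```r
dat <- data.frame(b = rnorm(10), c = rnorm(10))
a <- rnorm(10)  # exists only outside the data frame

f <- ~ a + b + c

# The formula remembers where it was created; this is where model.frame
# looks for variables that are not columns of `data`.
print(environment(f))

# Illustration only: with baseenv() as the formula's environment, lookup
# is limited to `data` and base R, so the outside a is no longer reachable.
environment(f) <- baseenv()
res <- tryCatch(
  model.frame(f, data = dat),
  error = function(e) conditionMessage(e)
)
print(res)
```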