markmfredrickson / optmatch

Functions for optimal matching in R
https://markmfredrickson.github.io/optmatch

strata with . in formula #115

Closed josherrickson closed 8 years ago

josherrickson commented 8 years ago

There's an issue that arises when `.` is used in a formula that also includes a strata() call.

> data <- data.frame(z = rep(0:1, each=5),
                    x = rnorm(10),
                    s = rep(0:1, times=5))

> m <- match_on(z ~ x + strata(s), data=data)
> m <- match_on(z ~ . - s + strata(s), data=data)
 Error in terms.formula(tmp, simplify = TRUE) : 
  '.' in formula and no 'data' argument 

This seems familiar, but I couldn't find an existing issue.

It comes from what appears to be a bug in update.formula,

> update(y ~ x, . ~ . - x)
y ~ 1
> update(y ~ . + x, . ~ . - x)
Error in terms.formula(tmp, simplify = TRUE) : 
  '.' in formula and no 'data' argument
> update(y ~ . + x, . ~ . - x, data=data.frame(y=1:2, x=1:2))
Error in terms.formula(tmp, simplify = TRUE) : 
  '.' in formula and no 'data' argument

where the data argument isn't being passed down to an internal terms.formula call.
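As an illustration only, the sketch below contrasts the failing update() call with a direct terms() call that does receive `data`. The data frame `d` and its column `w` are made up for this example, and the update() call is wrapped in tryCatch because its behavior may differ between R versions.

```r
# Toy data, loosely mirroring the example above (column `w` is invented here).
d <- data.frame(y = 1:2, x = 1:2, w = 3:4)

# update() on a formula containing `.`; tryCatch keeps the script running
# whether or not this call errors on a given R version.
res <- tryCatch(update(y ~ . + x, . ~ . - x),
                error = function(e) conditionMessage(e))
print(res)

# Calling terms() directly with `data` expands the dot to the remaining
# columns of `d`, so the resulting formula no longer contains `.`.
expanded <- formula(terms(y ~ ., data = d))
print(expanded)

# A dot-free formula can then be handed to update().
print(update(expanded, . ~ . - x))
```

Whether expanding the dot up front is robust enough to use inside match_on is a separate question; this only shows where `data` does and does not reach terms().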

I can't figure out a way around this bug. Perhaps we should try to detect when a formula contains both `.` and strata(), and stop with a more informative error? We could likely solve this with some formula-as-character manipulation, but I don't know if it's worth it: any fix I can think of would be quite fragile.
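A minimal sketch of what such a check might look like, using only base R. The function name `check_dot_strata` and the error wording are hypothetical, not part of optmatch.

```r
# Hypothetical helper: stop early if a formula uses both `.` and strata().
check_dot_strata <- function(f) {
  # all.vars() reports `.` as a variable name when it appears in the formula.
  has_dot <- "." %in% all.vars(f)
  # all.names() includes function names; note this would also match a
  # variable literally named `strata`, so the check is deliberately crude.
  has_strata <- "strata" %in% all.names(f)
  if (has_dot && has_strata) {
    stop("Formulas that combine '.' with strata() are not supported; ",
         "please list the covariates explicitly.", call. = FALSE)
  }
  invisible(f)
}

# Passes silently: no dot in the formula.
check_dot_strata(z ~ x + strata(s))

# Triggers the informative error; caught here so the script continues.
msg <- tryCatch(check_dot_strata(z ~ . - s + strata(s)),
                error = function(e) conditionMessage(e))
print(msg)
```

Inspecting the parsed formula with all.vars() and all.names() avoids the formula-as-character manipulation mentioned above, though it shares some of the same fragility.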

benthestatistician commented 8 years ago

Good catch, Josh. I vote for the informative error solution.
