The reason this is a hard challenge is that the formula object is evaluated within the df environment, so it looks for refgrp within df and doesn't find it. Of course, you have the .[] notation, which lets you look up variables in df using variables from the calling environment (yvar and xvar).
However, this ref problem is a bit different. You want to insert the character value "setosa" into a formula, which is done here: https://github.com/lrberge/fixest/blob/6b852fa277b947cea0bad8630986225ddb2d6f1b/R/fixest_env.R#L559-L594
@lrberge, I believe the bug is here. Specifying mode = "numeric" only looks for numbers and not characters. Replacing it as I did fixes this problem (though it could cause bugs if var in the call_env is a function, for example).
# if(exists(var, envir = call_env, mode = "numeric")){
if(exists(var, envir = call_env)){
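The effect of the mode argument can be checked in base R without fixest. The following is a minimal sketch, assuming a stand-in environment named call_env and illustrative variable names (ref_num, ref_chr); the is_value helper is hypothetical and is not the fix that went into fixest:

```r
# Stand-in for the calling environment (illustrative, not fixest internals)
call_env <- new.env()
assign("ref_num", 6, envir = call_env)
assign("ref_chr", "setosa", envir = call_env)

# mode = "numeric" only matches numeric objects
num_found <- exists("ref_num", envir = call_env, mode = "numeric")          # TRUE
chr_found_numeric <- exists("ref_chr", envir = call_env, mode = "numeric")  # FALSE

# Without mode, any object matches, including the character value...
chr_found_any <- exists("ref_chr", envir = call_env)                        # TRUE

# ...but so does a function reached through the parent environments
fun_found_any <- exists("mean", envir = call_env)                           # TRUE (base::mean)

# Hypothetical stricter check: accept any object that is not a function
is_value <- function(var, env) {
  exists(var, envir = env) && !is.function(get(var, envir = env))
}
chr_is_value <- is_value("ref_chr", call_env)  # TRUE
fun_is_value <- is_value("mean", call_env)     # FALSE
```

The last two lines show one way the function caveat could be avoided; whether such a check suits the surrounding fixest code is an open question.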
Here is a minimal reprex in case you think of an easy fix, Laurent:
# Works: numeric reference value passed through a variable
refgrp = 6
fixest::feols(
  mpg ~ i(cyl, ref = refgrp),
  data = mtcars
)
# Does not work: character reference value passed through a variable
refgrp = "setosa"
fixest::feols(
  Sepal.Length ~ i(Species, ref = refgrp),
  data = iris
)
@kennchua Here are two workarounds:
1. Use setFixest_fml and specify the ref value by hand. You can change ..speciesDummies multiple times in the script and it will swap in the correct formula.
library(fixest)
setFixest_fml(..speciesDummies = ~ i(Species, ref = "setosa"))
est_reg <- function(df, yvar) {
  # ..speciesDummies is expanded from the macro set above
  reg <- feols(
    .[yvar] ~ ..speciesDummies,
    data = df
  )
  return(reg)
}
est_reg(iris, "Sepal.Length")
2. Build the formula from a string, pasting the reference value in as a literal.
est_reg <- function(df, yvar, xvar, refgrp) {
  # refgrp is written into the formula text, so no lookup is needed later
  fml = as.formula(paste0(
    ".[yvar] ~ i(.[xvar], ref = '", refgrp, "')"
  ))
  reg <- feols(
    fml,
    data = df
  )
  return(reg)
}
est_reg(iris, "Sepal.Length", "Species", "setosa")
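To see why the string-building workaround sidesteps the lookup, it can help to inspect the formula that paste0 and as.formula produce. This is a minimal base R sketch that does not need fixest; build_fml is a hypothetical helper name used only for illustration:

```r
# Hypothetical helper mirroring the string-building step of the workaround
build_fml <- function(refgrp) {
  as.formula(paste0(".[yvar] ~ i(.[xvar], ref = '", refgrp, "')"))
}

fml <- build_fml("setosa")

# The right-hand side is the unevaluated call to i(); its ref argument is
# already a literal string, so nothing named refgrp has to be found later.
rhs <- fml[[3]]
ref_value <- rhs[["ref"]]
print(ref_value)
```

One caveat of this approach: a reference value that itself contains a single quote would break the pasted string, so it would need escaping first.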
Hi @kylebutts, thanks for carefully explaining what is going on under the hood and providing workarounds. Both solutions look good. Appreciate your help!
@kennchua, happy to help! Could you reopen this issue so that Laurent can see it?
Thanks @kennchua for reporting, your use case was totally valid. Thanks @kylebutts for finding the problem. I agree with you that it was too restrictive. I also took the opportunity to fix a bug on these lines!
Thanks all!
Hello, I was hoping to ask for clarity on the error message I'm receiving when passing a string argument to ref in i() within a function.
When not using i() within a function, the following works fine:
But when using i() within a function, I get the error: The variable 'refgrp' is in the RHS of the formula but not in the data set.
Is ref looking for a variable in the dataset as opposed to a value of the variable when used within a function but not when used directly?
Thanks in advance for your help!
(P.S. FWIW, there is no issue when I am passing a reference category that is numeric.)