Open Rasmusma1995 opened 6 months ago
Hi, and glad you find the software useful!
Hmmm, it works on my machine:
fixest:::fixest_fml_rewriter(as.formula(y~ gender + Løn^2))
$fml
y ~ gender + I(Løn^2)
<environment: 0x000001c6437a03e8>
$isPanel
[1] FALSE
The current rewriting of "x^2" into "I(x^2)" uses a lot of regular expressions. In particular, I use "[[:alnum:]]" to catch letters and deduce variables' names.
Can you replicate the following result?
gsub("[[:alnum:]]", "_", "Løn^2")
[1] "___^_"
If not, it seems that the interpretation of these characters differs between your machine and mine. Possible solutions:
In any case, writing "I(Løn^2)" explicitly should work (and this is the native R way to do it).
It seems that gsub does produce the same result:
gsub("[[:alnum:]]", "_", "Løn^2")
[1] "___^_"
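One possible explanation for why this check passes while the rewriter still fails: the call above uses gsub's default regex engine, whereas the rewriter's pattern (the one with the lookbehind) runs with perl = TRUE. As far as I understand ?regex, with UTF-8 input the PCRE engine matches POSIX classes such as [[:alnum:]] against ASCII characters only, unless told otherwise. A minimal sketch to compare the two engines; the string is written with a \u escape only so that it is marked as UTF-8 regardless of the locale:

```r
x <- "L\u00f8n^2"  # "Løn^2", escaped so the string is always marked as UTF-8

# Default engine: whether "ø" counts as alphanumeric depends on the locale
res_default <- gsub("[[:alnum:]]", "_", x)
print(res_default)

# PCRE engine, as used by the rewriter's pattern
res_perl <- gsub("[[:alnum:]]", "_", x, perl = TRUE)
print(res_perl)
```

If this explanation holds, the second call leaves "ø" untouched while the first one replaces it, which would mean the check with plain gsub cannot detect the problem.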
However, explicitly writing "I(Løn^2)" produces an even weirder result:
fixest:::fixest_fml_rewriter(as.formula(y~ gender + I(Løn^2)))
$fml
y ~ gender + I(LøI(n^2))
<environment: 0x56247917cad0>
$isPanel
[1] FALSE
The problem seems to arise from the following steps:
fml_text = fixest:::deparse_long(as.formula(y~ gender + Løn^2))
fml_text
[1] "y ~ gender + Løn^2"
no_lhs_text = gsub("^[^~]+~", "", fml_text)
no_lhs_text
[1] " gender + Løn^2"
no_lhs_text = gsub("(?<!I\\()(\\b(\\.[[:alpha:]]|[[:alpha:]])[[:alnum:]\\._]*\\^[[:digit:]]+)", "I(\\1)",
no_lhs_text, perl = TRUE)
no_lhs_text
[1] " gender + LøI(n^2)"
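This output is consistent with PCRE treating "ø" as a non-word character: \b then matches between "ø" and "n", so only "n^2" gets wrapped. A possible fix, sketched below under that assumption, is to prefix the pattern with (*UCP), which ?regex describes as switching \b, \w and the POSIX classes to Unicode properties. This is not the package's actual code: wrap_powers is a hypothetical helper name, and the sketch assumes R's PCRE was built with Unicode property support.

```r
# Hypothetical helper, not part of fixest: wraps "var^k" into "I(var^k)".
# Same pattern as in the steps above, with (*UCP) prepended so that \b and
# [[:alpha:]] / [[:alnum:]] use Unicode properties instead of ASCII-only tables.
wrap_powers <- function(txt) {
  pattern <- paste0(
    "(*UCP)",
    "(?<!I\\()(\\b(\\.[[:alpha:]]|[[:alpha:]])[[:alnum:]\\._]*\\^[[:digit:]]+)"
  )
  gsub(pattern, "I(\\1)", txt, perl = TRUE)
}

bare    <- wrap_powers(" gender + L\u00f8n^2")     # power term not yet wrapped
wrapped <- wrap_powers(" gender + I(L\u00f8n^2)")  # already wrapped by the user
print(bare)
print(wrapped)
```

With the Unicode-aware pattern, I would expect both calls to return " gender + I(Løn^2)": the bare term is wrapped once, and the already wrapped term is left alone because there is no longer a word boundary inside "Løn".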
Session info:
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux
Matrix products: default
BLAS: /opt/R/4.2.1/lib64/R/lib/libRblas.so
LAPACK: /opt/R/4.2.1/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
character(0)
other attached packages:
[1] fixest_0.11.2
Hello and first off, thank you for developing this fantastic package! It's been incredibly useful.
That being said, it does seem to have a problem when special characters (e.g. from non-English languages) are combined with polynomial terms of covariates. It seems that the fixest::feglm function misinterprets the formula, leading to an error.
Small example: