The error in rwolf() described below goes back to a pretty bad bug in wildboottest. In a nutshell, the R vector was incorrectly initiated. Unfortunately, my tests did not catch this, as I used np.assert_allclose, which does not throw errors with pytest 🤕. Sorry! A bug fix is on its way.
Example
The rwolf() method seems to produce incorrect results.
import numpy as np
import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
from rpy2.robjects.packages import importr
from pyfixest.estimation.estimation import feols
from pyfixest.estimation.multcomp import _get_rwolf_pval, bonferroni, rwolf
from pyfixest.utils.utils import get_data
pandas2ri.activate()
fixest = importr("fixest")
wildrwolf = importr("wildrwolf")
stats = importr("stats")
broom = importr("broom")
fit1 = feols("Y ~ X1 | f1 + f2", data=data)
fit2 = feols("Y2 ~ X1 | f1 + f2", data=data)
fit3 = feols("Y3 ~ X1 | f1 + f2", data=data)
rwolf_py = rwolf([fit1, fit2, fit3], "X1", reps=9999, seed=12345)
# R
fit_r = fixest.feols(ro.Formula("c(Y, Y2, Y3) ~ X1 | f1 + f2"), data=data)
rwolf_r = wildrwolf.rwolf(fit_r, param="X1", B=9999, seed=12345)
The python implementation produces
rwolf_py
while R's outputs
pd.DataFrame(rwolf_r)
Check the last rows in both data frames - the first Romano Wolf pvalue is drastically different.
Likely, the issue is in the rwolf function and not in get_rwolf_pvalue(), as the ladder is directly tested against the r package and produces identical results.
Update
The error in
rwolf()
described below goes back to a pretty bad bug inwildboottest
. In a nutshell, theR
vector was incorrectly initiated. Unfortunately, my tests did not catch this, as I usednp.assert_allclose
, which does not throw errors withpytest
🤕. Sorry! A bug fix is on its way.Example
The
rwolf()
method seems to produce incorrect results.The python implementation produces
while R's outputs
Check the last rows in both data frames - the first Romano Wolf pvalue is drastically different.
Likely, the issue is in the rwolf function and not in get_rwolf_pvalue(), as the ladder is directly tested against the r package and produces identical results.
Why did the tests did not catch the error? Incorrect use of the
rtol
andatol
function arguments innp.testing.assert_allclose
.