skranz opened this issue 1 month ago
Hi @skranz, thanks for your nice words!
To be honest, I do not really recall a reason why the x3 bootstraps do not run the C++ function; it's been a while since I implemented all of them (around when the MacKinnon et al. paper came out). My best guess is that I read the MNW paper - and if I recall the conclusion correctly, they recommended the 31 bootstrap over the x3's, so I might have decided not to optimize these too much?
I think you're right - it indeed looks to me like this part could easily be replaced with the C++ code here, which should speed up the x3 bootstraps substantially.
I can try to take a stab at this the week after next, but I cannot make any promises (I am very busy working on pyfixest at the moment). But I'd also be more than happy to accept a PR =)
Generally, I'd love to learn more about your paper! I am always very excited to see new simulation results on best practices for cluster-robust inference. Also, are you in contact with MacKinnon, Nielsen & Webb? I can only recommend reaching out; in my experience, they are always very keen to learn about applications / novel theoretical & simulation work around wild bootstrapping =)
Best, Alex
Thanks a lot for your reply. Please don't spend your time checking whether the code can simply be replaced by the C++ code. When I find a bit of time myself, I will test the replacement and, if it works, try to make a pull request.
Perhaps, though for now I will indeed just stick to fast & wild and the 31 (WCR-S) bootstrap. You are right that the conclusions of the fast & reliable article state: "If we had to recommend just one method, it would be the WCR-S bootstrap proposed in Section 5." and also "Of these, the ones that use CV1 together with modified bootstrap score vectors, called WCR-S and WCU-S, are particularly easy to compute."
Concerning my meta study, here is a very preliminary plot you might find interesting:
For each of currently 150 regressions from 150 articles, I performed 1000 MC runs and counted how often the p-values fall below 10% under different methods for standard errors / wild bootstrap. The line for each method shows the share of p-values below 10% for all regression coefficients, sorted increasingly. The H0 was always true in the MC simulations, i.e. ideally each line should be constant at 0.1. (More precisely, the black line is the reference: it shows how the curve would look if the p-values were indeed uniformly distributed. It is not exactly horizontal at 0.1 because, even for uniformly distributed p-values, there is randomness due to the fact that we only have 1000 MC draws per regression.)
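To make the reference line concrete, here is a minimal R sketch (not my actual simulation code; the numbers are just the ones mentioned above) of how it can be simulated:

```r
# Minimal sketch of the black reference line: for each regression, draw
# 1000 uniform p-values (the ideal case under a true H0) and record the
# share below 10%; sorting these shares gives the reference curve.
set.seed(1)
n_reg <- 150   # number of regressions
n_mc  <- 1000  # MC draws per regression

ref_share <- replicate(n_reg, mean(runif(n_mc) < 0.1))
plot(sort(ref_share), type = "l",
     xlab = "regression coefficients (sorted)",
     ylab = "share of p-values below 10%")
abline(h = 0.1, lty = 2)  # exactly uniform benchmark
```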
Not surprisingly, the Stata default CR1 performs worst. Interestingly, the two wild bootstraps are much better than the adjusted CR2 (Bell and McCaffrey, 2002; Pustejovsky and Tipton, 2018) and the jackknife CR3, which are both proposed as sensible alternatives. There just seem to be a few regressions where CR2 and CR3 fail enormously, i.e. for a few regression coefficients the p-values are below 10% in 80% of MC draws or more... I haven't yet found a single cause, but many of the problematic regressions have few clusters, high leverage, high intra-cluster correlations, mostly non-invertible (I-M_ii) matrices, etc. But perhaps there are also still some coding errors in my simulations that drive these results... I need to check everything carefully.
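For what it's worth, here is a hedged sketch of the kind of diagnostic I run, assuming the (I-M_ii) above denotes the cluster blocks of I minus the hat matrix; the function name and tolerance are illustrative, not from my actual code:

```r
# Illustrative diagnostic: for each cluster g, form the hat-matrix block
# H_gg = X_g (X'X)^{-1} X_g' and check whether (I - H_gg) is numerically
# invertible. CR2/CR3-type adjustments need to invert (or take a matrix
# square root of) this block, so near-singularity flags trouble.
check_cluster_blocks <- function(X, cluster, tol = 1e-10) {
  XtX_inv <- solve(crossprod(X))
  vapply(split(seq_len(nrow(X)), cluster), function(idx) {
    Xg  <- X[idx, , drop = FALSE]
    Hgg <- Xg %*% XtX_inv %*% t(Xg)
    ev  <- eigen(diag(length(idx)) - Hgg, symmetric = TRUE,
                 only.values = TRUE)$values
    min(ev) > tol  # TRUE = block looks safely invertible
  }, logical(1))
}
```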
But if these results indeed hold up, this might be a much stronger argument for the wild bootstrap. (In the Fast & Reliable paper, the MC simulations did not really suggest that the wild bootstrap is much better than the jackknife CR3.) I also find it interesting that, so far in my MC studies, fast & wild does not seem to perform worse than the 31 bootstrap...
Once I have more robust results, I will also definitely try to reach out to MacKinnon, Nielsen & Webb. Thanks for the suggestion!
Hi Alexander,
First of all: this is really a great R package. Thanks a lot for creating it!
When I tested the "fastnreliable" standard errors, `bootstrap_type` '31' was computed super fast, while '33' and '13' took quite a while. I wonder whether one of the reasons is that `boot_algo_fastnreliable.R` uses the C++ function `boot_algo3_crv1_denom` for '11' and '31' but does not call the existing C++ function `boot_algo3_crv3` in `boot_algo3_cpp.cpp` for '13' and '33'. I wondered: is there a bigger hurdle in using `boot_algo3_cpp`? Does it just not yield substantial speed gains? Or would it be relatively simple to modify the code in `boot_algo_fastnreliable.R` to use it?

Background: I am currently working on a large-scale meta study where I plan to analyze hundreds or perhaps even thousands of regressions with cluster-robust standard errors from reproduction packages from published economic articles. For each regression, I try to build a Monte Carlo simulation to evaluate different cluster-robust standard errors. Since I want to evaluate cluster bootstrap standard errors for 1000 MC samples for each of perhaps thousands of regressions, speed is of the essence. I am creating a slightly modified version of your package that minimizes recomputation of quantities that are shared across the MC samples. Your `fastnwild` code works incredibly fast (and seems to perform really well, also compared to CR2 with DF adjustment and jackknife CR3), but I am not sure whether all fastnreliable algorithms will be fast enough for such a comprehensive meta study.
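To give an idea of the kind of modification I mean, here is an illustrative sketch (function and variable names are hypothetical, not from the package) of precomputing quantities shared across MC samples:

```r
# Hypothetical caching pattern: quantities that do not change across MC
# samples (design-matrix factorizations, cluster index lists) are computed
# once and reused in every MC draw.
prepare_cache <- function(X, cluster) {
  list(
    XtX_inv     = solve(crossprod(X)),  # (X'X)^{-1} is fixed across draws
    cluster_idx = split(seq_len(nrow(X)), cluster)
  )
}

one_mc_draw <- function(y_sim, X, cache) {
  # OLS coefficients reusing the cached (X'X)^{-1}
  beta_hat <- cache$XtX_inv %*% crossprod(X, y_sim)
  # ...bootstrap inference would likewise reuse cache$cluster_idx...
  drop(beta_hat)
}
```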