grf-labs / grf

Generalized Random Forests
https://grf-labs.github.io/grf/
GNU General Public License v3.0
957 stars 250 forks source link

Invariance test fails on Apple arm+clang #1452

Closed erikcs closed 1 day ago

erikcs commented 1 week ago

On my m3 Sonoma, grf compiled with clang (1600.0.26.3) fails this invariance test.[^1]

This is a bit absurd considering this is an invariance that passes on Azure with x86-based clang or gcc. It also passes if my m3 Sonoma macbook compiles grf with gcc-14 instead of clang.

The code in that invariance test rely on Eigen, maybe there is something strange going on with arm-clang + Eigen, or just clang. @jtibshirani what do you say, it this something we should pass on to a clang dev?

[^1]: It tests a property of a "complicated" forest algorithm which basically boils down to: "if I give you the duplicated stacked outcomes (Y,Y) the results is the same as if I'd given you just the single outcome (Y)"

jtibshirani commented 1 week ago

That's indeed surprising! If you have the time, seems like a good thing to debug and pass on to whatever component ends up containing the issue. To help debug, you could work on reducing the scope -- for example can we reproduce the issue in a C++ only set-up, to rule out anything related to R?

jtibshirani commented 1 week ago

One guess: we "vendored in" our own random number generator (https://github.com/grf-labs/grf/pull/469). But that was before ARM was widely available ... I wonder if we need to take another look at that dependency and update the files?

erikcs commented 1 day ago

False alarm(!), now that I've reloaded all the grf machinery into my brain I realized I've been a bit too naive in some invariance expectations. The test in question sometimes have different splits (the random samples drawn + everything else randomness related is luckily the same across architectures), and that is actually legal behavior. It's the same old CART + non-commutative computer instructions headache we all know by now. For some two variables x1 or x2, then I can get the same criterion by splitting on either one. Which one I end up splitting on depends on 1) which appears first in my sort order or 2) floating point behavior when calculating objective = a + b. In this case arm decides that b + a is slightly larger than a + b, and splits on x2 instead of x1. If we ever implement a new random forest from scratch it would be nice to think of general ways to translate algorithms into software that behaves exactly the same across all system flavors 🤔. For CART splitting in particular that would mean coming up with ways to ensure the split decision is identical regardless of operation order, which sounds challenging .I'll make some test updates and close this issue.