hcho3 closed this issue 3 years ago
I'm not sure we want to switch everything over to `predict_proba`, since we want to explicitly exercise the code paths and configurations for both class output and probability output. No need to rush a fix since this is an extremely intermittent issue; let's just follow the strategy used in cuML and demand that 99% (or pick your threshold) of class outputs match. Alternatively (and maybe even better), we could compare directly to a local run of GTIL rather than FIL in order to (theoretically) guarantee a match.
@wphicks I was quite nervous about allowing a mismatch in class outputs, because then we can't differentiate between a tiny difference (49.6% vs 50.4%) and a significant one (80% vs 20%). However, running GTIL locally seems like a reasonable path.
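To illustrate the concern: both cases below flip the predicted class, so a class-output comparison treats them identically, while a direct comparison of the probability vectors tells them apart. A minimal sketch with made-up probability values (not actual FIL/GTIL outputs):

```python
import numpy as np

# Hypothetical probability outputs from two backends for the same row.
tiny_diff = (np.array([0.496, 0.504]), np.array([0.504, 0.496]))  # borderline
big_diff = (np.array([0.80, 0.20]), np.array([0.20, 0.80]))       # real bug

# Both pairs flip the predicted class label...
assert np.argmax(tiny_diff[0]) != np.argmax(tiny_diff[1])
assert np.argmax(big_diff[0]) != np.argmax(big_diff[1])

# ...but only comparing the probabilities distinguishes them:
assert np.allclose(tiny_diff[0], tiny_diff[1], atol=0.01)
assert not np.allclose(big_diff[0], big_diff[1], atol=0.01)
```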
As it turns out, running GTIL inside `test_model.py` is proving to be quite tricky, because of how the flags differ between FIL (`predict_proba`) and GTIL (`pred_transform`). The difference is resolved with sophisticated glue logic in `src/gtil_utils.cu`, but that is hard to replicate in a Python script.
For now, let's settle for checking a 99% match in the class outputs. In the long term, we should add a new API to Treelite that exposes a `predict_proba` flag behaving the same way as `predict_proba` in FIL.
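The 99%-match check could look something like the sketch below. The helper name and the synthetic arrays are hypothetical, not taken from the actual test suite; the idea is simply to assert that the fraction of agreeing class labels meets the chosen threshold:

```python
import numpy as np

def assert_class_match(fil_classes, gtil_classes, threshold=0.99):
    # Hypothetical helper: require that at least `threshold` of the
    # predicted class labels agree between the two backends.
    fil_classes = np.asarray(fil_classes)
    gtil_classes = np.asarray(gtil_classes)
    match_rate = np.mean(fil_classes == gtil_classes)
    assert match_rate >= threshold, (
        f"only {match_rate:.2%} of class outputs match"
    )

# Synthetic example: 99 of 100 labels agree, which passes at 99%.
a = np.array([0] * 99 + [1])
b = np.array([0] * 100)
assert_class_match(a, b, threshold=0.99)
```

This tolerates the occasional borderline flip without masking a systematic disagreement between the backends.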
@hcho3 If you approve of #115, I think we should be able to close this in favor of that, right?
Closing in favor of #115
Closes #104