hcho3 closed this issue 3 years ago
I'm not sure we want to switch everything over to `predict_proba`, since we want to explicitly exercise the code paths and configurations for both class output and probability output. No need to rush a fix since this is an extremely intermittent issue; let's just follow the strategy used in cuML and demand that 99% (or pick your threshold) of class outputs match. Alternatively (and maybe even better), we could compare directly to a local run of GTIL rather than FIL in order to (theoretically) guarantee a match.
@wphicks I was quite nervous about allowing a mismatch in class outputs, because then we can't differentiate between a tiny difference (49.6% vs 50.4%) and a significant one (80% vs 20%). However, running GTIL locally seems like a reasonable path.
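To illustrate the concern: both cases below flip the predicted class, so a class-output comparison treats them identically, while a direct comparison of the probability vectors tells them apart. A minimal sketch with made-up probability values (not actual FIL/GTIL outputs):

```python
import numpy as np

# Hypothetical probability outputs from two backends for the same row.
tiny_diff = (np.array([0.496, 0.504]), np.array([0.504, 0.496]))  # borderline
big_diff = (np.array([0.80, 0.20]), np.array([0.20, 0.80]))       # real bug

# Both pairs flip the predicted class label...
assert np.argmax(tiny_diff[0]) != np.argmax(tiny_diff[1])
assert np.argmax(big_diff[0]) != np.argmax(big_diff[1])

# ...but only comparing the probabilities distinguishes them:
assert np.allclose(tiny_diff[0], tiny_diff[1], atol=0.01)
assert not np.allclose(big_diff[0], big_diff[1], atol=0.01)
```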
As it turns out, running GTIL inside `test_model.py` is proving to be quite tricky, because of how the flags differ between FIL (`predict_proba`) and GTIL (`pred_transform`). The difference is resolved with sophisticated glue logic in `src/gtil_utils.cu`, but that is hard to replicate in a Python script.
For now, let's settle for checking a 99% match in the class outputs. In the long term, we should add a new API to Treelite that exposes a `predict_proba` flag behaving the same way as `predict_proba` in FIL.
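The 99%-match check could look something like the sketch below. The helper name and the synthetic arrays are hypothetical, not taken from the actual test suite; the idea is simply to assert that the fraction of agreeing class labels meets the chosen threshold:

```python
import numpy as np

def assert_class_match(fil_classes, gtil_classes, threshold=0.99):
    # Hypothetical helper: require that at least `threshold` of the
    # predicted class labels agree between the two backends.
    fil_classes = np.asarray(fil_classes)
    gtil_classes = np.asarray(gtil_classes)
    match_rate = np.mean(fil_classes == gtil_classes)
    assert match_rate >= threshold, (
        f"only {match_rate:.2%} of class outputs match"
    )

# Synthetic example: 99 of 100 labels agree, which passes at 99%.
a = np.array([0] * 99 + [1])
b = np.array([0] * 100)
assert_class_match(a, b, threshold=0.99)
```

This tolerates the occasional borderline flip without masking a systematic disagreement between the backends.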
@hcho3 If you approve of #115, I think we should be able to close this in favor of that, right?
Closing in favor of #115
Closes #104