microsoft / hummingbird

Hummingbird compiles trained ML models into tensor computation for faster inference.
MIT License
3.34k stars 278 forks source link

XGBoost 2.0.0 breaks tests #732

Open ksaur opened 1 year ago

ksaur commented 1 year ago

XGB 2.0.0 was released last week (Sept 12, 2023). This causes 19 tests to fail.

Many errors are related to NDCG ("NDCG is now the default objective function.")

ksaur commented 1 year ago

Before, I believe they were defaulting to objective="rank:pairwise" but that doesn't seem to fix this. Also, we'll need an upstream fix from onnxmltools for the 1 onnx XBG error.

Started troubleshooting at #733

Pinning this for now.

ksaur commented 10 months ago

onnxmltools was updated but let's wait for their release, then do ours

ksaur commented 9 months ago

With the onnxmltools patch, we now get some new errors: ==== 19 failed, 516 passed, 105 skipped, 115 warnings in 294.32s (0:04:54) ====

Some of them are related to the ndcg change

FAILED ....Check failed: label_is_valid: Relevance degress 
must be lesser than or equal to 31 when the exponential NDCG gain function is used. 
Set `ndcg_exp_gain` to false to use custom DCG gain.

but many are just outright wrong: Mismatched elements: 100 / 100 (100%) ...

ksaur commented 9 months ago

The problem is related to convert_sklearn_xgb_regressor's https://github.com/microsoft/hummingbird/blob/1ae115227b6a8686def78164ec8cb3cad7a0f531/hummingbird/ml/operator_converters/xgb.py#L133

ksaur commented 9 months ago

With xgb==1.7.6, for get_dump()we have '0:[f0<0.84472543] yes=1,no=2,missing=1\n\t1:leaf=-0.0600000024\n\t2:leaf=0.100000009\n',

whereas with xgb==2.0.0 we have '0:leaf=2.36637643e-09\n' from get_dump()

Then at: https://github.com/microsoft/hummingbird/blob/1ae115227b6a8686def78164ec8cb3cad7a0f531/hummingbird/ml/operator_converters/xgb.py#L81

the values for TreeParameters(lefts, rights, features, thresholds, values) are incorrect

ksaur commented 9 months ago

Since it seems they likely changed something internally (now it appears there is a one-node tree rather than a three-node tree), this will take a bit more time. Let's not block the release on this for now.

ksaur commented 8 months ago

TODO: look at https://github.com/onnx/onnxmltools/pull/597/ more closely

ksaur commented 6 months ago

Going forward we can look at #764 as a starting point