dmlc / treelite

Universal model exchange and serialization format for decision tree forests
https://treelite.readthedocs.io/en/latest/
Apache License 2.0

Wrong predictions for LightGBM LambdaRank #222

Closed: simon19891101 closed this issue 3 years ago

simon19891101 commented 4 years ago

Hi there, I'm trying to reduce the inference time of LightGBM (LambdaMART) and was really impressed by the speed increase. However, the predictions from Treelite look totally different from the original LightGBM predictions. My script is:

import scipy.sparse
import lightgbm as lgb
import treelite
import treelite_runtime

###########################
# Train and dump LightGBM
###########################
param = {'objective': 'lambdarank',
         'num_iterations': 128,
         'num_leaves': 32,
         'lambda_l1': 0.1,
         'lambda_l2': 0.1,
         'learning_rate': 0.1
         }

# train_lgb is a user-defined training helper (not shown here)
lgbm_lv = train_lgb(lv_train_df, lv_train_y, lv_train_group, param=param, prefix='_lv')
file_name = 'lgbm_lv_'+str(param['num_iterations'])+'_'+str(param['num_leaves'])+'.txt'
lgbm_lv.save_model(file_name)

###########################
# Load and apply the Treelite-compiled LightGBM model
###########################

lgbm_lv = lgb.Booster(model_file='lgbm_lv_128_32.txt')

model = treelite.Model.load('lgbm_lv_128_32.txt', model_format='lightgbm')
model.export_lib(toolchain='gcc', 
                 libpath='./lgbm_lv_128_32.so', 
                 params={'parallel_comp': 32},
                 verbose=True)
predictor = treelite_runtime.Predictor('./lgbm_lv_128_32.so', verbose=True)

###########################
# Compare predictions
###########################

val_df = scipy.sparse.load_npz('val_df.npz')

lgb_out_pred = lgbm_lv.predict(val_df).reshape(1,-1)

batch = treelite_runtime.Batch.from_csr(val_df)
out_pred = predictor.predict(batch).reshape(1,-1)

with results printed below (screenshot attached).
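
To quantify the gap rather than eyeballing the printed arrays, a minimal check along these lines can be used (assuming numpy is available and reusing the two prediction arrays from the script above):

import numpy as np

# largest absolute disagreement between the LightGBM and Treelite predictions
print(np.max(np.abs(lgb_out_pred - out_pred)))

# True only if the two arrays agree within tolerance
print(np.allclose(lgb_out_pred, out_pred, atol=1e-6))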

I used lightgbm 3.0.0 and installed the latest Treelite via python3 -m pip install --user treelite treelite_runtime. I also tried cloning the repository and building from source, but the predictions are still wrong.

The files ('lgbm_lv_128_32.txt' and 'val_df.npz') to reproduce the results can be accessed here: https://drive.google.com/drive/folders/1K2audmHyzIEHMhCSPX6k3NPduut3OMqr?usp=sharing

It would be awesome to get your thoughts! Thank you very much.

simon19891101 commented 3 years ago

Hi there, I also retrained a model (using the latest master branch) on the Microsoft Learning to Rank dataset (https://www.microsoft.com/en-us/research/project/mslr/). The Treelite predictions are almost the same as the original model's, but not identical:

lgb_out_pred = gbm.predict(val_df).reshape(1,-1)
batch = treelite_runtime.DMatrix(val_df, 'double')
out_pred = predictor.predict(batch).reshape(1,-1)

(screenshot: the two prediction arrays differ slightly)

When I converted the task to binary classification by treating labels >= 2 as "1" and everything else as "0", the same thing happened, although for classification the predicted labels end up being the same...

lgb_out_pred = gbm_clf.predict_proba(val_df)[:,1].reshape(1,-1)
batch = treelite_runtime.DMatrix(val_df, 'double')
out_pred = predictor.predict(batch).reshape(1,-1)

(screenshot: the predicted probabilities differ slightly)
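
For context, the binarization mentioned above is just a threshold on the relevance labels; roughly (assuming lv_train_y holds the integer relevance labels from the first snippet):

import numpy as np

# map relevance labels >= 2 to 1 (relevant) and everything else to 0
lv_train_y_bin = (np.asarray(lv_train_y) >= 2).astype(int)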

Keen to know your thoughts. Thanks a lot :)

hcho3 commented 3 years ago

@simon19891101 Are you using the latest mainline branch of Treelite? The line

batch = treelite_runtime.DMatrix(val_df, 'double')

should be revised to

batch = treelite_runtime.DMatrix(val_df, dtype='float64')
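
For reference, a quick sanity check with the revised call might look like this (reusing predictor, gbm, and val_df from the snippets above, with numpy imported):

import numpy as np

# assumes `predictor`, `gbm`, and `val_df` were built as in the earlier snippets
batch = treelite_runtime.DMatrix(val_df, dtype='float64')
out_pred = predictor.predict(batch)
lgb_out_pred = gbm.predict(val_df)

# with the correct dtype the two prediction arrays should match closely
print(np.allclose(out_pred, lgb_out_pred))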

hcho3 commented 3 years ago

#227 should fix the issue.

simon19891101 commented 3 years ago

This is great! I tried it and it worked this time! Thank you very much for the help.