GPU p3.2xlarge (Tesla V100):
Same code, just add device = "gpu" to the training function:
md <- lgb.train(data = dlgb_train,
                nrounds = 100, num_leaves = 512, learning_rate = 0.1,
                device = "gpu",
                ...)   ## rest of the call unchanged from the CPU version
Full code here: https://github.com/szilard/GBM-perf/tree/master/wip-testing/lightgbm-catenc/gpu
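For reference, a minimal sketch of how a run like this could be timed and scored (dlgb_train, X_test and y_test are assumed to exist, and ROCR for the AUC is an illustrative choice; the exact measurement code is in the repo linked above):

```r
library(lightgbm)
library(ROCR)

## assumed inputs: dlgb_train (an lgb.Dataset), X_test (feature matrix), y_test (0/1 labels)
system.time({
  md <- lgb.train(data = dlgb_train,
                  objective = "binary",
                  nrounds = 100, num_leaves = 512, learning_rate = 0.1,
                  device = "gpu",     ## requires a lightgbm build compiled with GPU support
                  verbose = 0)
})

phat <- predict(md, X_test)                                   ## predicted probabilities
performance(prediction(phat, y_test), "auc")@y.values[[1]]    ## AUC on the test set
```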
Timings [sec] and AUC:
0.1m:
lightgbm OHE 8.565 0.7301211
lightgbm catenc 8.665 0.7155734
1m:
lightgbm OHE 14.461 0.766018
lightgbm catenc 11.929 0.7676921
10m:
lightgbm OHE 68.609 0.7749303
lightgbm catenc 50.024 0.7926504
Another run:
0.1m:
lightgbm OHE 8.565 0.7301271
lightgbm catenc 8.285 0.7155633
1m:
lightgbm OHE 14.855 0.7659496
lightgbm catenc 12.595 0.7632631
10m:
lightgbm OHE 67.979 0.7748983
lightgbm catenc 49.798 0.7925457
Changed the benchmark to use cat.enc. instead of OHE (this is in line with the other tools: h2o and catboost use their own categorical encodings, while xgboost falls back to OHE).
Commit here: https://github.com/szilard/GBM-perf/commit/e774fcceec44f16c58003ec09a2e56cc832dd113
Last README before the commit:
https://github.com/szilard/GBM-perf/blob/c5d3bf7b8433ea62c09548dbd810998336097a8f/README.md
Tables with the results, showing both for easy comparison:
Instead of OHE (with sparse.model.matrix) we can use lightgbm's special encoding, in which the data is stored as integers but treated as categorical (a rough sketch of both approaches follows the code link below):
OHE:
cat.enc:
The main diff:
OHE:
cat.enc:
Full code here: https://github.com/szilard/GBM-perf/tree/master/wip-testing/lightgbm-catenc/cpu/run
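As a rough illustration of that main diff, here is a minimal sketch, assuming a feature-only data frame d_train, a character vector cat_cols naming the categorical columns, and a 0/1 label vector y_train (all hypothetical names; the actual preprocessing is in the repo linked above):

```r
library(lightgbm)
library(Matrix)

## OHE: expand every categorical column into sparse 0/1 indicator columns
X_ohe <- sparse.model.matrix(~ . - 1, data = d_train)     ## d_train holds features only
dlgb_train_ohe <- lgb.Dataset(data = X_ohe, label = y_train)

## cat.enc: keep the categoricals as integer codes and declare them categorical
d_int <- d_train
d_int[cat_cols] <- lapply(d_int[cat_cols], function(x) as.integer(as.factor(x)))
dlgb_train_catenc <- lgb.Dataset(data = as.matrix(d_int), label = y_train,
                                 categorical_feature = cat_cols)
```

With cat.enc the design matrix stays narrow (one integer column per categorical feature) instead of exploding into many indicator columns, which is consistent with the faster catenc timings on the larger data sizes above.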
Timings [sec] and AUC:
CPU r4.8xlarge: