szilard / GBM-perf

Performance of various open source GBM implementations
MIT License

lightgbm: better matching hyperparams #5

Open guolinke opened 7 years ago

guolinke commented 7 years ago

h2o and xgboost seem to be run depth-wise with max_depth=10, while lightgbm is run leaf-wise with num_leaves=1024.

As a result, the lightgbm GPU speed is not directly comparable with xgboost and h2o.
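
For reference, the two setups being compared look roughly like this (a minimal sketch; dxgb_train / dlgb_train stand in for the corresponding xgb.DMatrix / lgb.Dataset objects, and the learning rate / number of rounds mirror the values used later in this thread):

library(xgboost)
library(lightgbm)

# depth-wise growth, as in the xgboost / h2o runs: tree depth capped at 10
md_xgb <- xgb.train(params = list(objective = "binary:logistic", tree_method = "hist",
                                  max_depth = 10, eta = 0.1),
                    data = dxgb_train, nrounds = 100)

# leaf-wise growth, as in the lightgbm run: leaf count capped at 1024,
# depth left unconstrained, so trees can end up much deeper than 10
md_lgb <- lgb.train(params = list(objective = "binary",
                                  num_leaves = 1024, learning_rate = 0.1),
                    data = dlgb_train, nrounds = 100)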

szilard commented 7 years ago

It's not a perfect comparison (since you can't match the parameters perfectly), but it's still useful to compare with the parameters matched as closely as possible.

guolinke commented 7 years ago

okay, you can try to match it by setting max_depth=10 in LightGBM.

szilard commented 7 years ago

OK, I'll take a look. I think this option is more recent (at least more recent than my repo :) ).

Laurae2 commented 7 years ago

@szilard num_leaves = 1024 builds very deep trees. Even with just 32 leaves it should already build deeper trees than max_depth = 10 in most cases.

max_depth was available before LightGBM GPU was available I think.
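
As a quick sanity check on those numbers: a perfectly balanced binary tree of depth d has 2^d leaves, so 1024 leaves is only "equivalent" to max_depth = 10 if the tree stays balanced, which leaf-wise growth does not enforce:

# leaves of a perfectly balanced binary tree of depth d
d <- c(5, 10)
2^d    # 32, 1024
# with num_leaves = 1024 and no max_depth, a leaf-wise tree can in the worst
# case grow a chain of splits up to depth num_leaves - 1 = 1023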

szilard commented 7 years ago

Yeah, probably before the GPU feature, I just did not notice. What I meant is that I looked at all the features when I created this repo, but I've hardly had time to keep up with all the great changes you guys are making :( The above is still sitting on my TODOs...

Laurae2 commented 7 years ago

@szilard Try this: https://github.com/Microsoft/LightGBM/blob/master/docs/Key-Events.md

szilard commented 4 years ago

On c5.9xlarge (18 physical cores, HT cores not used):
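
Presumably the hyper-threaded cores were excluded by capping the thread count. A minimal sketch of one way to do that, assuming LightGBM's num_threads parameter was the mechanism (OS-level pinning would be an alternative):

# cap LightGBM at the 18 physical cores (assumption: num_threads was used)
md <- lgb.train(data = dlgb_train, objective = "binary",
          nrounds = 100, num_leaves = 512, learning_rate = 0.1,
          num_threads = 18, verbose = 2)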

1M:

suppressMessages({
library(data.table)
library(ROCR)
library(lightgbm)
library(Matrix)
})

set.seed(123)

# load the airline benchmark data (1M training rows) and the common test set
d_train <- fread("train-1m.csv", showProgress=FALSE)
d_test <- fread("test.csv", showProgress=FALSE)

# combine train and test so the categorical encoding is consistent across both
d_all <- rbind(d_train, d_test)
d_all$dep_delayed_15min <- ifelse(d_all$dep_delayed_15min=="Y",1,0)

# integer-encode the categorical columns, keeping the encoding rules
d_all_wrules <- lgb.convert_with_rules(d_all)
d_all <- d_all_wrules$data
cols_cats <- names(d_all_wrules$rules)

d_train <- d_all[1:nrow(d_train)]
d_test <- d_all[(nrow(d_train)+1):(nrow(d_train)+nrow(d_test))]

p <- ncol(d_all)-1
dlgb_train <- lgb.Dataset(data = as.matrix(d_train[,1:p]), label = d_train$dep_delayed_15min)

# AUC of the current model `md` (global) on the test set
auc <- function() {
  phat <- predict(md, data = as.matrix(d_test[,1:p]))
  rocr_pred <- prediction(phat, d_test$dep_delayed_15min)
  cat(performance(rocr_pred, "auc")@y.values[[1]],"\n")
}
system.time({
  md <- lgb.train(data = dlgb_train, 
            objective = "binary", 
            nrounds = 100, num_leaves = 512, learning_rate = 0.1, 
            categorical_feature = cols_cats,
            verbose = 2)
})
auc()
[LightGBM] [Info] Number of positive: 192982, number of negative: 807018
[LightGBM] [Debug] Dataset::GetMultiBinFromAllFeatures: sparse rate 0.000818
[LightGBM] [Debug] init for col-wise cost 0.000007 seconds, init for row-wise cost 0.004314 seconds
[LightGBM] [Debug] col-wise cost 0.006305 seconds, row-wise cost 0.000557 seconds
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.006870 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Debug] Using Dense Multi-Val Bin
[LightGBM] [Info] Total Bins 1095
[LightGBM] [Info] Number of data points in the train set: 1000000, number of used features: 8
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.192982 -> initscore=-1.430749
[LightGBM] [Info] Start training from score -1.430749
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 16
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 16
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 17
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 15
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 16
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 16
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 17
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 17
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 16
....
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 22
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 21
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 21
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 24
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 21
   user  system elapsed
 55.159   0.228   3.126
> auc()
0.7650181
system.time({
  md <- lgb.train(data = dlgb_train, 
            objective = "binary", 
            nrounds = 100, num_leaves = 2**10, learning_rate = 0.1, 
            categorical_feature = cols_cats,
            verbose = 2)
})
auc()
[LightGBM] [Info] Number of positive: 192982, number of negative: 807018
[LightGBM] [Debug] Dataset::GetMultiBinFromAllFeatures: sparse rate 0.000818
[LightGBM] [Debug] init for col-wise cost 0.000008 seconds, init for row-wise cost 0.004395 seconds
[LightGBM] [Debug] col-wise cost 0.007861 seconds, row-wise cost 0.000543 seconds
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008412 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Debug] Using Dense Multi-Val Bin
[LightGBM] [Info] Total Bins 1095
[LightGBM] [Info] Number of data points in the train set: 1000000, number of used features: 8
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.192982 -> initscore=-1.430749
[LightGBM] [Info] Start training from score -1.430749
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 20
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 22
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 18
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 19
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 19
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 20
...
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 25
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 27
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 24
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 22
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 29
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 24
   user  system elapsed
 95.340   0.432   5.327
> auc()
0.7712053
system.time({
  md <- lgb.train(data = dlgb_train, 
            objective = "binary", 
            nrounds = 100, max_depth = 10, num_leaves = 2**10, learning_rate = 0.1, 
            categorical_feature = cols_cats,
            verbose = 2)
})
auc()
[LightGBM] [Info] Number of positive: 192982, number of negative: 807018
[LightGBM] [Debug] Dataset::GetMultiBinFromAllFeatures: sparse rate 0.000818
[LightGBM] [Debug] init for col-wise cost 0.000008 seconds, init for row-wise cost 0.004462 seconds
[LightGBM] [Debug] col-wise cost 0.008065 seconds, row-wise cost 0.000628 seconds
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.008701 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Debug] Using Dense Multi-Val Bin
[LightGBM] [Info] Total Bins 1095
[LightGBM] [Info] Number of data points in the train set: 1000000, number of used features: 8
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.192982 -> initscore=-1.430749
[LightGBM] [Info] Start training from score -1.430749
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 876 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 895 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 910 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 901 and max_depth = 10
...
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 650 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 745 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 672 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 759 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 649 and max_depth = 10
   user  system elapsed
 52.740   0.216   2.947
> auc()
0.7614953
szilard commented 4 years ago

10M:
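
The 10M runs below reuse the same script as the 1M runs; presumably only the training file changes (the file name below assumes the benchmark's usual naming):

# same pipeline as above, only the training file differs (assumed file name)
d_train <- fread("train-10m.csv", showProgress=FALSE)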

> system.time({
+   md <- lgb.train(data = dlgb_train,
+             objective = "binary",
+             nrounds = 100, num_leaves = 512, learning_rate = 0.1,
+             categorical_feature = cols_cats,
+             verbose = 2)
+ })
[LightGBM] [Info] Number of positive: 1927804, number of negative: 8072196
[LightGBM] [Debug] Dataset::GetMultiBinFromAllFeatures: sparse rate 0.000843
[LightGBM] [Debug] init for col-wise cost 0.000163 seconds, init for row-wise cost 0.101055 seconds
[LightGBM] [Debug] col-wise cost 0.063312 seconds, row-wise cost 0.006002 seconds
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.069477 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Debug] Using Dense Multi-Val Bin
[LightGBM] [Info] Total Bins 1095
[LightGBM] [Info] Number of data points in the train set: 10000000, number of used features: 8
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.192780 -> initscore=-1.432044
[LightGBM] [Info] Start training from score -1.432044
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 18
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 18
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 16
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 20
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 16
...
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 17
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 25
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 19
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 24
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 21
[LightGBM] [Debug] Trained a tree with leaves = 512 and max_depth = 21
   user  system elapsed
280.078   0.531  16.011
> auc()
0.792273
> system.time({
+   md <- lgb.train(data = dlgb_train,
+             objective = "binary",
+             nrounds = 100, num_leaves = 2**10, learning_rate = 0.1,
+             categorical_feature = cols_cats,
+             verbose = 2)
+ })
[LightGBM] [Info] Number of positive: 1927804, number of negative: 8072196
[LightGBM] [Debug] Dataset::GetMultiBinFromAllFeatures: sparse rate 0.000843
[LightGBM] [Debug] init for col-wise cost 0.000165 seconds, init for row-wise cost 0.100543 seconds
[LightGBM] [Debug] col-wise cost 0.064476 seconds, row-wise cost 0.007202 seconds
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.071844 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Debug] Using Dense Multi-Val Bin
[LightGBM] [Info] Total Bins 1095
[LightGBM] [Info] Number of data points in the train set: 10000000, number of used features: 8
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.192780 -> initscore=-1.432044
[LightGBM] [Info] Start training from score -1.432044
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 19
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 20
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 19
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 22
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 21
...
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 27
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 32
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 27
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 24
[LightGBM] [Debug] Trained a tree with leaves = 1024 and max_depth = 22
   user  system elapsed
360.052   0.859  20.310
> auc()
0.8018694
> system.time({
+   md <- lgb.train(data = dlgb_train,
+             objective = "binary",
+             nrounds = 100, max_depth = 10, num_leaves = 2**10, learning_rate = 0.1,
+             categorical_feature = cols_cats,
+             verbose = 2)
+ })
[LightGBM] [Info] Number of positive: 1927804, number of negative: 8072196
[LightGBM] [Debug] Dataset::GetMultiBinFromAllFeatures: sparse rate 0.000843
[LightGBM] [Debug] init for col-wise cost 0.000133 seconds, init for row-wise cost 0.102343 seconds
[LightGBM] [Debug] col-wise cost 0.063263 seconds, row-wise cost 0.009728 seconds
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.073124 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Debug] Using Dense Multi-Val Bin
[LightGBM] [Info] Total Bins 1095
[LightGBM] [Info] Number of data points in the train set: 10000000, number of used features: 8
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.192780 -> initscore=-1.432044
[LightGBM] [Info] Start training from score -1.432044
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 956 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 952 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 939 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 965 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 930 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 965 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 952 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 950 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
...
[LightGBM] [Debug] Trained a tree with leaves = 683 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 671 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 878 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 663 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 669 and max_depth = 10
   user  system elapsed
272.774   0.592  15.449
> auc()
0.7814164
szilard commented 4 years ago

Summary:

10M:

Setup  parameters                          user     system  elapsed  AUC
1      num_leaves = 512                    280.078  0.531   16.011   0.792273
2      num_leaves = 2**10                  360.052  0.859   20.310   0.8018694
3      max_depth = 10, num_leaves = 2**10  272.774  0.592   15.449   0.7814164

(times in seconds; elapsed is wall-clock time; all runs with nrounds = 100, learning_rate = 0.1)

Looks like Setups 1 and 3 have similar runtimes, but Setup 1 has better AUC. I think this is expected, as imposing max_depth adds an extra (apparently unnecessary) constraint on the optimizer. What do you think @guolinke?
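
One way to check how binding the max_depth constraint actually is would be to inspect the fitted trees directly. A minimal sketch, assuming lgb.model.dt.tree() in this lightgbm version returns its usual per-node data.table with tree_index and leaf_index columns:

# parse the booster `md` into a node/leaf table
tree_dt <- lgb.model.dt.tree(md)

# leaves actually used per tree; under Setup 3 (max_depth = 10) this stays
# well below the num_leaves = 2**10 budget, matching the debug log above
leaves_per_tree <- tree_dt[!is.na(leaf_index), .N, by = tree_index]
summary(leaves_per_tree$N)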

Btw I've been using Setup 1 in the benchmark all along.

szilard commented 4 years ago

Btw setting max_depth=10 and num_leaves=2**17 (= 131072, the largest value allowed) gives exactly the same results as max_depth=10 and num_leaves=2**10 above (because no tree had more leaves than 2^10):

> system.time({
+   md <- lgb.train(data = dlgb_train,
+             objective = "binary",
+             nrounds = 100, max_depth = 10, num_leaves = 2**17+1, learning_rate = 0.1,
+             categorical_feature = cols_cats,
+             verbose = 2)
+ })
Error in lgb.call("LGBM_DatasetCreateFromMat_R", ret = handle, private$raw_data,  :
  [LightGBM] [Fatal] Check failed: (num_leaves) <= (131072) at /tmp/RtmpExEKbv/R.INSTALL7b7298eca1/lightgbm/src/src/io/config_auto.cpp, line 319 .
> system.time({
+   md <- lgb.train(data = dlgb_train,
+             objective = "binary",
+             nrounds = 100, max_depth = 10, num_leaves = 2**17, learning_rate = 0.1,
+             categorical_feature = cols_cats,
+             verbose = 2)
+ })
[LightGBM] [Info] Number of positive: 1927804, number of negative: 8072196
[LightGBM] [Debug] Dataset::GetMultiBinFromAllFeatures: sparse rate 0.000843
[LightGBM] [Debug] init for col-wise cost 0.000161 seconds, init for row-wise cost 0.102629 seconds
[LightGBM] [Debug] col-wise cost 0.051071 seconds, row-wise cost 0.006251 seconds
[LightGBM] [Warning] Auto-choosing row-wise multi-threading, the overhead of testing was 0.057483 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Debug] Using Dense Multi-Val Bin
[LightGBM] [Info] Total Bins 1095
[LightGBM] [Info] Number of data points in the train set: 10000000, number of used features: 8
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.192780 -> initscore=-1.432044
[LightGBM] [Info] Start training from score -1.432044
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 956 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 952 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 939 and max_depth = 10
...
[LightGBM] [Debug] Trained a tree with leaves = 878 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 663 and max_depth = 10
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Debug] Trained a tree with leaves = 669 and max_depth = 10
   user  system elapsed
278.384   0.614  16.010
> auc()
0.7814164