microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License

Large bagging is very slow #628

Closed Laurae2 closed 7 years ago

Laurae2 commented 7 years ago

Bagging is very slow, and I am not sure what is causing it. See https://github.com/Microsoft/LightGBM/issues/562 for the dataset. The issue appears with 0.40 subsampling but is not reproducible at 0.60. I think bagging uses only one core, but I don't see this behavior when subsampling is 0.60.

Using a DLL compiled with Visual Studio 2017.


guolinke commented 7 years ago

Can you change the 0.5 in this line to 1e-6 and try again? https://github.com/Microsoft/LightGBM/blob/master/src/boosting/gbdt.cpp#L150

guolinke commented 7 years ago

Can you also try the 'bagging' branch?

Laurae2 commented 7 years ago

@guolinke Switching to 1e-6 seems to fix the issue.

On the bagging branch, it was the same (is the branch gone now?).

guolinke commented 7 years ago

@Laurae2 can you try the latest master branch?

guolinke commented 7 years ago

@Laurae2 I added some timing output in the "bagging" branch. Can you run with it and paste the logs (with verbose enabled)?

Laurae2 commented 7 years ago

Here are some logs. I think some log lines are out of place; no idea why. I'll retry with the CLI.

> Laurae::timer_func_print({model <- lgb.train(params = list(objective = "binary",
+                                                            metric = "auc",
+                                                            bin_construct_sample_cnt = 2250000L,
+                                                            early_stopping_round = 25),
+                                              train,
+                                              5,
+                                              list(test = test),
+                                              verbose = 2)})
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[1]:    test's auc:0.501172 
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[2]:    test's auc:0.501379 
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=8
[3]:    test's auc:0.502558 
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=12
[4]:    test's auc:0.502981 
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=10
[5]:    test's auc:0.50398 
The function ran in 37827.132 milliseconds.
[1] 37827.13
> rm(model)
> gc()
[LightGBM] [Info] GBDT::boosting costs 0.027171
[LightGBM] [Info] GBDT::train_score costs 0.012051
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.000001
[LightGBM] [Info] GBDT::valid_score costs 0.006769
[LightGBM] [Info] GBDT::metric costs 0.000000
[LightGBM] [Info] GBDT::bagging costs 0.000003
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 27.997018
[LightGBM] [Info] SerialTreeLearner::init_train costs 2.393088
[LightGBM] [Info] SerialTreeLearner::init_split costs 12.377513
[LightGBM] [Info] SerialTreeLearner::hist_build costs 10.837631
[LightGBM] [Info] SerialTreeLearner::find_split costs 2.226329
[LightGBM] [Info] SerialTreeLearner::split costs 0.070301
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 14.763519
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  692175 37.0    1168576 62.5  1168576 62.5
Vcells 3516642 26.9    5133766 39.2  4078954 31.2
> Laurae::timer_func_print({model <- lgb.train(params = list(objective = "binary",
+                                                            metric = "auc",
+                                                            bin_construct_sample_cnt = 2250000L,
+                                                            early_stopping_round = 25,
+                                                            bagging_freq = 1,
+                                                            bagging_seed = 1,
+                                                            bagging_fraction = 0.6),
+                                              train,
+                                              5,
+                                              list(test = test),
+                                              verbose = 2)})
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=12
[1]:    test's auc:0.500272 
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[2]:    test's auc:0.500702 
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[3]:    test's auc:0.501856 
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[4]:    test's auc:0.503777 
[LightGBM] [Debug] Re-bagging, using 1350000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=10
[5]:    test's auc:0.50587 
The function ran in 24566.072 milliseconds.
[1] 24566.07
> 
> 
> rm(model)
> gc()
[LightGBM] [Info] GBDT::boosting costs 0.079639
[LightGBM] [Info] GBDT::train_score costs 0.025451
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.065390
[LightGBM] [Info] GBDT::valid_score costs 0.016837
[LightGBM] [Info] GBDT::metric costs 0.000000
[LightGBM] [Info] GBDT::bagging costs 0.013739
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 50.325836
[LightGBM] [Info] SerialTreeLearner::init_train costs 6.186171
[LightGBM] [Info] SerialTreeLearner::init_split costs 20.463570
[LightGBM] [Info] SerialTreeLearner::hist_build costs 18.954661
[LightGBM] [Info] SerialTreeLearner::find_split costs 4.454569
[LightGBM] [Info] SerialTreeLearner::split costs 0.111960
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 26.636777
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  694113 37.1    1168576 62.5  1168576 62.5
Vcells 3522613 26.9    6240519 47.7  4383248 33.5
> Laurae::timer_func_print({model <- lgb.train(params = list(objective = "binary",
+                                                            metric = "auc",
+                                                            bin_construct_sample_cnt = 2250000L,
+                                                            early_stopping_round = 25,
+                                                            bagging_freq = 1,
+                                                            bagging_seed = 1,
+                                                            bagging_fraction = 0.4),
+                                              train,
+                                              5,
+                                              list(test = test),
+                                              verbose = 2)})
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Debug] use subset for bagging
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=11
[1]:    test's auc:0.501405 
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=13
[2]:    test's auc:0.502849 
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=10
[3]:    test's auc:0.504528 
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=9
[4]:    test's auc:0.506207 
[LightGBM] [Debug] Re-bagging, using 900000 data to train
[LightGBM] [Info] Trained a tree with leaves=31 and max_depth=13
[5]:    test's auc:0.506727 
The function ran in 90240.890 milliseconds.
[1] 90240.89
> rm(model)
> gc()
[LightGBM] [Info] GBDT::boosting costs 0.165529
[LightGBM] [Info] GBDT::train_score costs 0.124710
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.065391
[LightGBM] [Info] GBDT::valid_score costs 0.023227
[LightGBM] [Info] GBDT::metric costs 0.000000
[LightGBM] [Info] GBDT::bagging costs 76.685486
[LightGBM] [Info] GBDT::bagging_subset_time costs 28.801937
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 47.856741
[LightGBM] [Info] GBDT::sub_gradient costs 0.007842
[LightGBM] [Info] GBDT::tree costs 61.569484
[LightGBM] [Info] SerialTreeLearner::init_train costs 7.088206
[LightGBM] [Info] SerialTreeLearner::init_split costs 26.233300
[LightGBM] [Info] SerialTreeLearner::hist_build costs 21.110779
[LightGBM] [Info] SerialTreeLearner::find_split costs 6.784831
[LightGBM] [Info] SerialTreeLearner::split costs 0.141303
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 33.290067
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  694853 37.2    1168576 62.5  1168576 62.5
Vcells 3523226 26.9    6240519 47.7  4389355 33.5
Laurae2 commented 7 years ago

@guolinke Better logs below:

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=1.0 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.466967 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669789
[LightGBM] [Info] 3.083114 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65686
[LightGBM] [Info] 6.534374 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649763
[LightGBM] [Info] 9.767502 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645923
[LightGBM] [Info] 12.838336 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.643904
[LightGBM] [Info] 15.916650 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.028511
[LightGBM] [Info] GBDT::train_score costs 0.009519
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.000001
[LightGBM] [Info] GBDT::valid_score costs 0.002431
[LightGBM] [Info] GBDT::metric costs 0.005669
[LightGBM] [Info] GBDT::bagging costs 0.000002
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 15.870458
[LightGBM] [Info] SerialTreeLearner::init_train costs 2.178705
[LightGBM] [Info] SerialTreeLearner::init_split costs 3.308739
[LightGBM] [Info] SerialTreeLearner::hist_build costs 9.902303
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.456796
[LightGBM] [Info] SerialTreeLearner::split costs 0.021500
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 5.481398

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.6 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 6.990656 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669724
[LightGBM] [Info] 3.145600 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.656882
[LightGBM] [Info] 5.911323 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649771
[LightGBM] [Info] 8.587378 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645973
[LightGBM] [Info] 11.349179 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.64391
[LightGBM] [Info] 14.342477 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.027847
[LightGBM] [Info] GBDT::train_score costs 0.008080
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.015252
[LightGBM] [Info] GBDT::valid_score costs 0.002572
[LightGBM] [Info] GBDT::metric costs 0.003098
[LightGBM] [Info] GBDT::bagging costs 0.013149
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 14.272429
[LightGBM] [Info] SerialTreeLearner::init_train costs 3.820590
[LightGBM] [Info] SerialTreeLearner::init_split costs 2.218611
[LightGBM] [Info] SerialTreeLearner::hist_build costs 7.742144
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.451989
[LightGBM] [Info] SerialTreeLearner::split costs 0.017599
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 6.033382

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.4 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.311082 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669779
[LightGBM] [Info] 15.066362 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65695
[LightGBM] [Info] 32.950422 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649819
[LightGBM] [Info] 47.337700 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645965
[LightGBM] [Info] 60.975131 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.64389
[LightGBM] [Info] 73.995944 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.027556
[LightGBM] [Info] GBDT::train_score costs 0.034221
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.000001
[LightGBM] [Info] GBDT::valid_score costs 0.002442
[LightGBM] [Info] GBDT::metric costs 0.003057
[LightGBM] [Info] GBDT::bagging costs 68.762191
[LightGBM] [Info] GBDT::bagging_subset_time costs 28.349320
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 40.400013
[LightGBM] [Info] GBDT::sub_gradient costs 0.007574
[LightGBM] [Info] GBDT::tree costs 5.158859
[LightGBM] [Info] SerialTreeLearner::init_train costs 0.882581
[LightGBM] [Info] SerialTreeLearner::init_split costs 1.307824
[LightGBM] [Info] SerialTreeLearner::hist_build costs 2.537724
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.419427
[LightGBM] [Info] SerialTreeLearner::split costs 0.008666
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 2.188091
Laurae2 commented 7 years ago

@guolinke This is with the 1e-6 fix:

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=1.0 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.291155 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669789
[LightGBM] [Info] 3.088217 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65686
[LightGBM] [Info] 6.512720 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649763
[LightGBM] [Info] 9.766369 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645923
[LightGBM] [Info] 12.856576 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.643904
[LightGBM] [Info] 15.938305 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.031661
[LightGBM] [Info] GBDT::train_score costs 0.009531
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.000001
[LightGBM] [Info] GBDT::valid_score costs 0.002449
[LightGBM] [Info] GBDT::metric costs 0.003105
[LightGBM] [Info] GBDT::bagging costs 0.000002
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 15.891500
[LightGBM] [Info] SerialTreeLearner::init_train costs 2.169989
[LightGBM] [Info] SerialTreeLearner::init_split costs 3.359920
[LightGBM] [Info] SerialTreeLearner::hist_build costs 9.847964
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.481699
[LightGBM] [Info] SerialTreeLearner::split costs 0.029584
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 5.522944

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.6 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.777340 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669724
[LightGBM] [Info] 2.760088 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.656882
[LightGBM] [Info] 5.420121 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649771
[LightGBM] [Info] 7.992326 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645973
[LightGBM] [Info] 10.777746 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.64391
[LightGBM] [Info] 13.426416 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.026495
[LightGBM] [Info] GBDT::train_score costs 0.008024
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.015815
[LightGBM] [Info] GBDT::valid_score costs 0.002523
[LightGBM] [Info] GBDT::metric costs 0.003107
[LightGBM] [Info] GBDT::bagging costs 0.012842
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 13.357567
[LightGBM] [Info] SerialTreeLearner::init_train costs 3.719762
[LightGBM] [Info] SerialTreeLearner::init_split costs 1.805211
[LightGBM] [Info] SerialTreeLearner::hist_build costs 7.380577
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.435972
[LightGBM] [Info] SerialTreeLearner::split costs 0.014171
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 5.518769

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.4 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.265347 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669779
[LightGBM] [Info] 2.088153 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65695
[LightGBM] [Info] 4.460803 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649819
[LightGBM] [Info] 6.536185 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645965
[LightGBM] [Info] 8.671731 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.64389
[LightGBM] [Info] 10.821296 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training
[LightGBM] [Info] GBDT::boosting costs 0.030278
[LightGBM] [Info] GBDT::train_score costs 0.006794
[LightGBM] [Info] GBDT::out_of_bag_score costs 0.022166
[LightGBM] [Info] GBDT::valid_score costs 0.002548
[LightGBM] [Info] GBDT::metric costs 0.003119
[LightGBM] [Info] GBDT::bagging costs 0.013182
[LightGBM] [Info] GBDT::bagging_subset_time costs 0.000000
[LightGBM] [Info] GBDT::reset_tree_learner_time costs 0.000000
[LightGBM] [Info] GBDT::sub_gradient costs 0.000000
[LightGBM] [Info] GBDT::tree costs 10.743162
[LightGBM] [Info] SerialTreeLearner::init_train costs 3.490598
[LightGBM] [Info] SerialTreeLearner::init_split costs 1.424444
[LightGBM] [Info] SerialTreeLearner::hist_build costs 5.369412
[LightGBM] [Info] SerialTreeLearner::find_split costs 0.439387
[LightGBM] [Info] SerialTreeLearner::split costs 0.016834
[LightGBM] [Info] SerialTreeLearner::ordered_bin costs 4.910205
guolinke commented 7 years ago

@Laurae2 Thanks for the help. I think the latest master branch has fixed this.

Laurae2 commented 7 years ago

@guolinke I am getting a segmentation fault now instead.

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=1.0 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.356559 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
[LightGBM] [Info] Finished initializing training
[LightGBM] [Info] Started training...
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:1, valid_1 binary_logloss : 0.669789
[LightGBM] [Info] 3.090586 seconds elapsed, finished iteration 1
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:2, valid_1 binary_logloss : 0.65686
[LightGBM] [Info] 6.522369 seconds elapsed, finished iteration 2
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:3, valid_1 binary_logloss : 0.649763
[LightGBM] [Info] 9.768736 seconds elapsed, finished iteration 3
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:4, valid_1 binary_logloss : 0.645923
[LightGBM] [Info] 12.862355 seconds elapsed, finished iteration 4
[LightGBM] [Info] Trained a tree with leaves=7 and max_depth=3
[LightGBM] [Info] Iteration:5, valid_1 binary_logloss : 0.643904
[LightGBM] [Info] 15.948677 seconds elapsed, finished iteration 5
[LightGBM] [Info] Finished training

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.6 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.292828 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
Segmentation fault

Lolo@Laurae MINGW64 /e/lightgbm
$ E:/lightgbm/lightgbm.exe data=../benchmark_lot/data/reput_train_lgb_na.data num_threads=8 learning_rate=0.25 max_depth=3 num_leaves=7 min_gain_to_split=1 min_sum_hessian_in_leaf=1 min_data_in_leaf=1 num_trees=5 metric=binary_logloss bagging_freq=1 bagging_seed=1 bagging_fraction=0.4 test=../benchmark_lot/data/reput_test_lgb_na.data app=binary
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Finished loading data in 7.078148 seconds
[LightGBM] [Info] Number of positive: 742198, number of negative: 1507802
[LightGBM] [Info] Total Bins 6027180
[LightGBM] [Info] Number of data: 2250000, number of used features: 23636
Segmentation fault
guolinke commented 7 years ago

@Laurae2 Sorry, it had a bug. I just used `push -f` to fix it.

github-actions[bot] commented 1 year ago

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.