oChristineo opened 5 years ago
ThunderGBM uses GPUs by default. The test_dataset.txt data set is quite small, so training should not be this slow. You may try training the model with ThunderGBM from the command line.
It would be good if you could share your script for running ThunderGBM, so that we can help you identify any potential problems in it.
Thank you very much for the quick reply.
Here is my code:

```python
from thundergbm import *
from xgboost import *
from sklearn.metrics import r2_score
from sklearn.datasets import *          # provides load_svmlight_file
from sklearn.metrics import mean_squared_error
from math import sqrt
import time

x, y = load_svmlight_file("E:/CUDA code/thundergbm/dataset/test_dataset.txt")
TGBMmodel = TGBMRegressor(tree_method='hist')
XGBmodel = XGBRegressor(tree_method='hist')

start1 = time.time()
TGBMmodel.fit(x, y)
end1 = time.time()
TGBMduration = end1 - start1

start2 = time.time()
XGBmodel.fit(x, y)
end2 = time.time()
XGBduration = end2 - start2

print('TGBM elapsed time:{:.4f}s'.format(TGBMduration))
print('XGB elapsed time:{:.4f}s'.format(XGBduration))

x2, y2 = load_svmlight_file("E:/CUDA code/thundergbm/dataset/test_dataset.txt")
y_predict1 = TGBMmodel.predict(x2)
y_predict2 = XGBmodel.predict(x2)
rms1 = sqrt(mean_squared_error(y2, y_predict1))
print("TGBM RMS: %f" % rms1)
rms2 = sqrt(mean_squared_error(y2, y_predict2))
print("XGB RMS: %f" % rms2)

accuracy1 = r2_score(y2, y_predict1)
print("TGBM Accuracy: %.2f%%" % (accuracy1 * 100.0))
accuracy2 = r2_score(y2, y_predict2)
print("XGB Accuracy: %.2f%%" % (accuracy2 * 100.0))
```
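For reference, the two metrics printed above can be computed by hand. This is a minimal pure-Python sketch of RMSE and R² (the same quantities `mean_squared_error`/`sqrt` and `r2_score` produce), using made-up toy vectors rather than the dataset above:

```python
from math import sqrt

def rmse(y_true, y_pred):
    # Root mean squared error: sqrt of the average squared residual.
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy example (not the real dataset):
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print("RMS: %f" % rmse(y_true, y_pred))
print("Accuracy: %.2f%%" % (r2(y_true, y_pred) * 100.0))
```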
I trained the model in Spyder and the result is:

```
TGBM elapsed time:1.5382s
XGB elapsed time:0.1114s
TGBM RMS: 0.489562
XGB RMS: 0.603319
TGBM Accuracy: 67.71%
XGB Accuracy: 50.95%
```
I also tried training the model from the command line, and the result is:

```
TGBM elapsed time:2.7293s
XGB elapsed time:0.1688s
```

The time is slower than when I trained the model in Spyder. I don't know why.
The default parameters of XGBRegressor, which you used, are as follows according to the XGBoost documentation:

```
max_depth=3, learning_rate=1, n_estimators=100, objective="reg:linear" ...
```
The tree_method of XGBoost should be gpu_hist if you want to run it on the GPUs.
In comparison, the default parameters of TGBMRegressor are as follows:

```
max_depth=6, learning_rate=1, n_estimators=40, objective="reg:linear" ...
```
So the experimental results of your script are unfair to ThunderGBM. It would be better to read the parameter documentation of both libraries first.
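The mismatch is easy to quantify: a tree of depth d has at most 2^d leaves, so under the defaults quoted above XGBoost grows at most 100 × 2³ = 800 leaves while ThunderGBM grows up to 40 × 2⁶ = 2560, roughly 3x the work. A minimal sketch of that arithmetic, using only the default values listed above (actual trees may be smaller due to pruning and stopping criteria):

```python
def max_leaves(n_estimators, max_depth):
    # Upper bound on total leaves: trees * leaves-per-full-binary-tree.
    return n_estimators * 2 ** max_depth

xgb_work = max_leaves(n_estimators=100, max_depth=3)   # XGBRegressor defaults
tgbm_work = max_leaves(n_estimators=40, max_depth=6)   # TGBMRegressor defaults
print(xgb_work, tgbm_work)  # 800 2560
```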
Thank you very much for your reply and suggestion. ThunderGBM is indeed faster than XGBoost after modifying the parameters. But I failed when setting the parameters 'max_depth' and 'n_estimators' for ThunderGBM:
```
TypeError: __init__() got an unexpected keyword argument 'max_depth'
```

(and likewise for 'n_estimators')
It works if I set the parameters 'depth' and 'n_trees' instead.
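For anyone hitting the same TypeError: the two libraries simply spell these hyperparameters differently. A small hypothetical helper (the mapping covers only the two renames observed in this thread) to translate XGBoost-style keyword arguments before passing them to TGBMRegressor:

```python
# Renames observed in this thread; everything else passes through unchanged.
XGB_TO_TGBM = {
    'max_depth': 'depth',
    'n_estimators': 'n_trees',
}

def to_tgbm_kwargs(**kwargs):
    # Translate XGBoost-style parameter names to ThunderGBM's spelling.
    return {XGB_TO_TGBM.get(k, k): v for k, v in kwargs.items()}

params = to_tgbm_kwargs(max_depth=6, n_estimators=40, learning_rate=1.0)
print(params)  # {'depth': 6, 'n_trees': 40, 'learning_rate': 1.0}
# usage sketch: TGBMRegressor(**params)
```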
I trained on the test_dataset.txt under thundergbm/dataset with ThunderGBM, but the execution time was 1-2 seconds slower than XGBoost's. I don't know if it's because I didn't use the GPU? (Windows 10, NVIDIA GeForce GTX 1050, CUDA 10.1)