Xtra-Computing / thundergbm

ThunderGBM: Fast GBDTs and Random Forests on GPUs
Apache License 2.0
689 stars 84 forks

Out of memory when depth or n_trees are too large #41

Closed TAOinfoWr closed 4 years ago

TAOinfoWr commented 4 years ago

I have the same out-of-memory problem. From my observation, it seems related to model complexity (e.g., depth or n_trees being too large?). Is there any way to solve it?

The following is my code, OS/GPU information, and error messages.

```python
from thundergbm import TGBMClassifier
from sklearn import datasets
import numpy as np

dim = 9
row_num = 30000
X = np.random.random((row_num, dim))
X = X.astype('float32')
y = np.random.randint(0, 7, row_num)
y = y.astype('int32')

clf = TGBMClassifier(depth=18, n_trees=1000, verbose=0, bagging=0)
clf.fit(X, y)
```

------------------------------------------------------------------------

- OS: Ubuntu 16.04
- NVIDIA driver: 440.44
- CUDA version: 10.2
- GPU: GeForce RTX 2080

------------------------------------------------------------------------

Error messages

1. First message:

```
2020-04-09 18:12:35,082 FATAL [default] Check failed: [error == cudaSuccess] out of memory
2020-04-09 18:12:35,082 WARNING [default] Aborting application. Reason: Fatal log at [/home/admin/Desktop/EverComm_ibpem_gpu/thundergbm/src/thundergbm/syncmem.cpp:107]
Aborted (core dumped)
```

2. Another message (depth=20):

```
2020-04-09 18:41:17,156 FATAL [default] Check failed: [size() == source.size()] destination and source count doesn't match
2020-04-09 18:41:17,156 WARNING [default] Aborting application. Reason: Fatal log at [/home/admin/Desktop/EverComm_ibpem_gpu/thundergbm/include/thundergbm/syncarray.h:91]
Aborted (core dumped)
```

TAOinfoWr commented 4 years ago

Here is my GPU's detailed information:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44       Driver Version: 440.44       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:65:00.0 Off |                  N/A |
| 24%   34C    P0    23W / 257W |      0MiB / 11018MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```

Kurt-Liuhf commented 4 years ago

Hi @TAOinfoWr, the out-of-memory error is caused by the large hyperparameters (i.e., the tree depth and the number of trees) you set. With parameters like these, the GBDT training process is very memory-consuming, especially for a classification task. We recommend training ThunderGBM with a smaller tree depth and a smaller number of trees. Thank you.
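For illustration, a minimal sketch of that suggestion, reusing the synthetic data from the report above; depth=6 and n_trees=100 are arbitrary smaller settings chosen only for the example, not values prescribed by the maintainers.

```python
import numpy as np
from thundergbm import TGBMClassifier

# Same synthetic data as in the original report.
dim, row_num = 9, 30000
X = np.random.random((row_num, dim)).astype('float32')
y = np.random.randint(0, 7, row_num).astype('int32')

# Shallower and fewer trees keep the GPU memory footprint far below the
# depth=18, n_trees=1000 configuration that ran out of memory on an 11 GiB card.
clf = TGBMClassifier(depth=6, n_trees=100, verbose=0, bagging=0)
clf.fit(X, y)
```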

TAOinfoWr commented 4 years ago

Hi @Kurt-Liuhf, thank you for your response. I understand what you mean, but in some cases a larger value of the hyperparameter (e.g., tree depth) gives better predictive performance. If GPU memory usage could be optimized in the future, I think that would be great. Thank you!
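Until such an optimization exists, one possible workaround (a sketch only, not part of ThunderGBM) is to search for the largest depth that still fits on the GPU. Because the fatal out-of-memory check aborts the whole process, each candidate depth is tried in a separate subprocess and judged by its exit code; the depth range and the synthetic data below are assumptions taken from this thread.

```python
import subprocess
import sys

# Training script for one trial; mirrors the data setup from the report above.
TRIAL_SCRIPT = """
import numpy as np
from thundergbm import TGBMClassifier

dim, row_num = 9, 30000
X = np.random.random((row_num, dim)).astype('float32')
y = np.random.randint(0, 7, row_num).astype('int32')
clf = TGBMClassifier(depth={depth}, n_trees=1000, verbose=0, bagging=0)
clf.fit(X, y)
"""

def fits_in_memory(depth):
    # A non-zero exit code is taken to mean the trial crashed,
    # e.g. the "Aborted (core dumped)" seen after the fatal OOM log.
    result = subprocess.run([sys.executable, "-c", TRIAL_SCRIPT.format(depth=depth)])
    return result.returncode == 0

# Increase depth until training no longer fits in GPU memory.
for depth in range(6, 21, 2):
    if fits_in_memory(depth):
        print(f"depth={depth}: trained successfully")
    else:
        print(f"depth={depth}: ran out of GPU memory; use a smaller depth")
        break
```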