Closed hemenkapadia closed 5 years ago
bad_alloc occurs when there is not enough memory. Not related to #311.
I was able to run GBM on a g3.16xlarge (4 GPUs with 8 GB of RAM each, the same as a P4). Can I get a code example? I have also attached a notebook that works.
Elastic net indeed fails at https://github.com/h2oai/h2o4gpu/blob/master/src/gpu/matrix/matrix_dense.cu#L1963 because it attempts to allocate the whole matrix on a single GPU. We need to improve this part.
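A back-of-the-envelope estimate shows why a single-GPU allocation of the full matrix fails on an 8 GB P4. The row count comes from the issue; the column count and double precision are assumptions for illustration, not from the thread:

```python
# Rough size of the dense matrix that elastic net would try to place
# on one GPU. n_cols and the element size are assumed values.
n_rows = 116_000_000   # ~116 million records (from the issue)
n_cols = 13            # assumed feature count; the real number may differ
bytes_per_value = 8    # assuming double precision

matrix_gb = n_rows * n_cols * bytes_per_value / 1024**3
print(f"{matrix_gb:.1f} GB")  # ~11.2 GB, more than a single P4's 8 GB
```

Under these assumptions the single allocation alone exceeds the 8 GB available on one P4, before counting any working buffers.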
Using one-hot encoding for categorical features is inefficient and should be avoided.
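As an illustration (the column name and values here are made up, not from the customer's dataset), integer-coding a categorical column keeps a single column, whereas one-hot encoding creates one column per category and widens the matrix:

```python
import pandas as pd

# Hypothetical categorical column with three distinct values.
df = pd.DataFrame({"carrier": ["AA", "UA", "DL", "AA", "UA"]})

# One-hot encoding expands the column into one column per category,
# which blows up the matrix width for high-cardinality features.
one_hot = pd.get_dummies(df["carrier"])
print(one_hot.shape[1])  # 3 columns for 3 categories

# Integer (label) encoding keeps a single column; tree methods such as
# xgboost's gpu_hist can split on these integer codes directly.
df["carrier_code"] = df["carrier"].astype("category").cat.codes
print(df["carrier_code"].nunique())  # still 3 distinct values, 1 column
```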
Hi @sh1ng, I shared the Python notebook with you on Slack.
The xgboost part can be solved by setting 'n_gpus': -1.
I don't really know why it isn't set by default.
xgb_params = {'max_depth': 8,
              'objective': 'binary:logistic',
              'min_child_weight': 30,
              'eta': 0.1,  # learning rate
              'scale_pos_weight': 2,
              'gamma': 0.1,  # min_split_loss
              'reg_lambda': 0.5,  # L2-regularization term
              'tree_method': 'gpu_hist',
              'n_gpus': -1}
The elastic net issue is going to be fixed in #763.
Customer is reporting that H2O4GPU models (GBM and ElasticNet) produced the error message "terminate called after throwing an instance of 'thrust::system::detail::bad_alloc'" and eventually broke the Jupyter notebook kernel. Retrying results in similar behavior.
The instance is a GCP VM with 32 vCPUs and 4 Tesla P4 GPUs. The dataset used is http://kt.ijs.si/elena_ikonomovska/data.html, which has about 116 million records and is 5.76 GB on disk.
Is this related to #311 ?