Closed: sh1ng closed this issue 5 years ago
Example notebook: Untitled (1).zip
Hi @sh1ng, is there a time frame for getting this resolved?
I hope to start in a week or two.
Hi @sh1ng, checking whether you were able to make progress on this one. There is an open customer support ticket for it, and we would like to get it resolved sooner.
I plan to work on it next week.
My original analysis was wrong.
With the current implementation, every GPU has to hold its own copy of the whole dataset to utilize data parallelism, i.e. every GPU processes part of the alpha/lambda parameters, but the algorithm itself is not parallelized across multiple GPUs. A bit more analysis is needed to understand how (and whether) that is feasible. cc @pseudotensor
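A minimal sketch of the scheme described above (the names are illustrative, not h2o4gpu's actual API): each GPU keeps a full copy of the data and solves its own slice of the alpha/lambda grid, so no single solve spans GPUs.

import itertools

def split_grid(alphas, lambdas, n_gpus):
    # Cartesian product of the regularization grid
    grid = list(itertools.product(alphas, lambdas))
    # round-robin: GPU g gets every n_gpus-th (alpha, lambda) pair
    return [grid[g::n_gpus] for g in range(n_gpus)]

# With 2 GPUs, each gets 3 of the 3 x 2 = 6 parameter combinations,
# but both still need the full dataset resident in their own memory.
work = split_grid([0.1, 0.5, 1.0], [0.01, 0.1], n_gpus=2)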
We decided to use xgboost for linear models. At this point it scales well enough, and I'm able to handle a dataset that fails with ElasticNet.
Snippet (with the import and a training call added so it runs end to end; x and y are the feature matrix and labels from the notebook above):

import xgboost as xgb

param = {'objective': 'reg:squarederror',
         'booster': 'gblinear',
         'updater': 'gpu_coord_descent',
         'n_gpus': -1,  # use all available GPUs
         }
dtrain = xgb.DMatrix(x, label=y)
bst = xgb.train(param, dtrain)
The parameter gpu_coord_descent is not documented yet.
Original request from a customer: #762
ElasticNet fails when it tries to allocate memory on the first GPU while several are available.
https://github.com/h2oai/h2o4gpu/blob/master/src/gpu/matrix/matrix_dense.cu#L1945-L2016
Consider smarter usage of upload_data, since it accepts the device as a parameter; a sketch of the idea follows.
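A minimal sketch of that idea, assuming numba is installed (replicate_across_gpus is a hypothetical helper, not the actual upload_data in matrix_dense.cu): select each device explicitly before allocating, so the per-GPU copies do not all land on device 0.

import numpy as np
from numba import cuda

def replicate_across_gpus(host_matrix):
    # Copy host_matrix onto every visible GPU, one copy per device.
    copies = []
    for d in range(len(cuda.gpus)):
        cuda.select_device(d)  # allocate on GPU d, not always GPU 0
        copies.append(cuda.to_device(np.ascontiguousarray(host_matrix)))
    return copies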