microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License

[gpu] Large dataset In LGBMRegressor Failed #4926

Closed jiapengwen closed 2 years ago

jiapengwen commented 2 years ago

Description

Using the GPU version with a large dataset, training fails with the error below.

Reproducible example

import time

import numpy as np
import lightgbm as lgbm

if __name__ == '__main__':
    time1 = time.time()
    # X = np.random.random((4300000, 2200))  # works
    # y = np.random.random((4300000,))

    X = np.random.random((8000000, 2200))
    y = np.random.random((8000000,))
    # X = np.random.random((7000000, 2200))  # works
    # y = np.random.random((7000000,))
    time2 = time.time()
    print('construct data cost:', time2 - time1)
    print(X[2][2])
    print(y[2:10])

    time1 = time.time()
    model = lgbm.LGBMRegressor(device='gpu', n_estimators=1000, verbose=4,
                               max_bin=16, tree_learner='serial',
                               gpu_use_dp=False, n_jobs=1, max_depth=7,
                               num_leaves=31, min_child_samples=17000)
    model.fit(X, y, callbacks=[lgbm.log_evaluation()])
    time2 = time.time()
    print('gpu cost:', time2 - time1)

######################################
[LightGBM] [Warning] Accuracy may be bad since you didn't explicitly set num_leaves OR 2^max_depth > num_leaves. (num_leaves=31).
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 35200
[LightGBM] [Info] Number of data points in the train set: 8000000, number of used features: 2200
[LightGBM] [Info] Using GPU Device: Quadro RTX 6000, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 16 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 8
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::compute::opencl_error> >'
  what():  Memory Object Allocation Failure
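Note that `Memory Object Allocation Failure` is an OpenCL error raised when a *device* buffer allocation fails (boost::compute surfaces `CL_MEM_OBJECT_ALLOCATION_FAILURE` from `clCreateBuffer`), so host RAM does not help here. A back-of-the-envelope estimate of the binned dataset size, assuming roughly one byte per (row, feature) binned value (an assumption about the trainer's dense layout, not something stated in this thread), suggests all three dataset sizes fit comfortably in the Quadro RTX 6000's 24 GB:

```python
# Rough estimate of device memory for the binned dataset.
# Assumption (not from the thread): ~1 byte per (row, feature) binned
# value; the real OpenCL trainer packs features into groups, so treat
# this as an order-of-magnitude check only.

def binned_dataset_gib(n_rows, n_features, bytes_per_value=1):
    """Approximate size of the binned training matrix in GiB."""
    return n_rows * n_features * bytes_per_value / 2**30

for n_rows in (4_300_000, 7_000_000, 8_000_000):
    print(f"{n_rows:>9} rows x 2200 features ~= "
          f"{binned_dataset_gib(n_rows, 2200):.1f} GiB")
```

Since even the failing 8,000,000-row case comes to roughly 16 GiB under this assumption, the failure is more plausibly a per-buffer limit (OpenCL devices cap single allocations via `CL_DEVICE_MAX_MEM_ALLOC_SIZE`, often a fraction of total device memory) or an overflow in a buffer-size computation, rather than exhaustion of total device memory.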

Environment info

LightGBM version or commit hash:

3.3.1

Command(s) you used to install LightGBM:

pip3 install lightgbm --install-option=--gpu

Python version: 3.6.9

Additional Comments

My machine has 227 GB of RAM.

jiapengwen commented 2 years ago

I have fixed it and opened a PR, please check: https://github.com/microsoft/LightGBM/pull/4928

jiapengwen commented 2 years ago

@jameslamb

jameslamb commented 2 years ago

closed by #4928

github-actions[bot] commented 1 year ago

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.