Environment (for bugs)
OS platform, distribution and version (e.g. Linux Ubuntu 16.04): Ubuntu 16.04.6 LTS
Installed from (source or binary): pip
Version: 0.4.0
Python version (optional): 3.6
CUDA/cuDNN version: 10.0
GPU model (optional): Nvidia T4
CPU model: Intel Xeon, 32 cores
RAM available: 200 GB
Description
I want to use the Random Forest Classifier for predictions on a large amount of data, but the prediction phase takes a surprisingly long time and shows very low GPU utilization. Here are the parameters I used for training:
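Something along these lines; the x_train/y_train names and the exact hyperparameter values below are simplified placeholders rather than my exact setup:

import numpy as np
from cuml.ensemble import RandomForestClassifier

# cuML's random forest expects float32 features and int32 labels
x_train = x_train.astype(np.float32)
y_train = y_train.astype(np.int32)

# illustrative hyperparameters, not the exact values from my run
model = RandomForestClassifier(n_estimators=100, max_depth=16)
model.fit(x_train, y_train)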
The training works pretty well: it is reasonably fast and consistently keeps the GPU at around 80% utilization (measured with nvidia-smi). For prediction I then call:
y_pred = model.predict(x_test)
The prediction, however, only utilizes about 4% of the GPU, and only for a small fraction of the time one iteration (over 10 samples) takes; most of the time it appears to run on the CPU, with a single core constantly at 100%. With 2 classes one such prediction takes around 0.4 seconds, with 10 classes around 3.4 seconds. Running the same prediction with the purely CPU-based scikit-learn implementation is faster, at only 0.1 seconds.
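For reference, I measured the comparison roughly like this (a sketch; the scikit-learn forest and the 10-sample batch below are illustrative, and x_train/y_train are the same placeholders as above):

import time
from sklearn.ensemble import RandomForestClassifier as skRFC

batch = x_test[:10]  # one iteration over 10 samples

# cuML prediction (model trained as above)
start = time.perf_counter()
model.predict(batch)
print("cuML predict:", time.perf_counter() - start, "s")

# CPU-only scikit-learn forest trained on the same data, for comparison
sk_model = skRFC(n_estimators=100, n_jobs=-1)
sk_model.fit(x_train, y_train)

start = time.perf_counter()
sk_model.predict(batch)
print("sklearn predict:", time.perf_counter() - start, "s")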
Is this a general problem with tree-based prediction, or am I doing something wrong?
Thanks a lot in advance!