BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

OpenCL Caffe cannot use two (or more) OpenCL devices from different platforms #7018

Open 1970633640 opened 2 years ago

1970633640 commented 2 years ago

Question: Can two OpenCL devices from different platforms (for example, 1x NVIDIA GPU + 1x AMD GPU) be used at the same time to accelerate training in OpenCL Caffe?

In my test, the program exits with an error after initialization, before the first iteration. (I had already disabled host_unified_memory.)

It should be possible to speed up training with multiple OpenCL devices, right? Isn't that one of the advantages of OpenCL? Are there any forks of OpenCL Caffe, or any instructions, for achieving a speed-up with multiple OpenCL devices? Could someone please point me to projects, documents, or instructions for using multiple OpenCL devices at the same time?

My guess is that two (or more) devices could be initialized in ViennaCL, with one command queue assigned to each device. During training, could the OpenCL kernels then be split up and submitted to these queues with clEnqueueNDRangeKernel?

rajhlinux commented 1 year ago

How were you able to build caffe with OpenCL support, using linux?

1970633640 commented 1 year ago

> How were you able to build caffe with OpenCL support, using linux?

In my tests, OpenCL Caffe was slower than the CUDA version on an NVIDIA GPU and far too slow on an Intel CPU, so I do not recommend it.

But if you want to use it anyway:

First, install the libraries, which I copied from the Dockerfile:

sudo apt-get update
sudo apt-get install -y --no-install-recommends \
        build-essential \
        cmake \
        git \
        wget \
        libatlas-base-dev \
        libboost-all-dev \
        libgflags-dev \
        libgoogle-glog-dev \
        libhdf5-serial-dev \
        libleveldb-dev \
        liblmdb-dev \
        libopencv-dev \
        libprotobuf-dev \
        libsnappy-dev \
        protobuf-compiler \
        python-dev \
        python-numpy \
        python-pip \
        python-setuptools \
        python-scipy

and

sudo apt-get install python3-pip
sudo apt install libsqlite3-dev

and install all the Python requirements.

Download and extract the zip file, or clone the repository:

git clone https://github.com/BVLC/caffe.git

A GPU driver or the Intel OpenCL driver is required, too. You can install the NVIDIA driver (the driver alone should be enough, but installing CUDA as well may be better) or the "Intel SDK for OpenCL Applications" driver.
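
Before building, it can help to confirm that the driver actually exposes an OpenCL platform. The sketch below assumes the separate clinfo utility is installed (on Ubuntu, the clinfo package); it is not part of Caffe, and the function name is only illustrative.

```shell
# Sketch: list the OpenCL platforms and devices the installed drivers expose.
# Assumes the clinfo utility is available; it is not part of Caffe.
list_opencl_devices() {
    if ! command -v clinfo >/dev/null 2>&1; then
        echo "clinfo not found; install it first (e.g. sudo apt-get install clinfo)" >&2
        return 127
    fi
    # -l prints a compact tree: one entry per platform, with its devices below it.
    clinfo -l
}
```

Running list_opencl_devices should show each platform (NVIDIA, Intel, etc.) separately, which also makes it clear when two devices belong to different platforms.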

Finally, in the extracted code folder:

mkdir build
cd build
cmake .. -DUSE_OPENCL=1
make -j8    # 8 = total number of CPU cores
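
The same build steps can be wrapped so that the job count is taken from the machine instead of being typed in by hand. This is only a sketch: the function names are made up for illustration, and nproc is assumed to come from GNU coreutils.

```shell
# Sketch: run the build steps above with the job count detected automatically.
# cpu_jobs and build_caffe_opencl are illustrative names, not part of Caffe.
cpu_jobs() {
    # nproc ships with GNU coreutils; fall back to a single job if it is missing.
    nproc 2>/dev/null || echo 1
}

build_caffe_opencl() {
    local src="${1:-.}"    # path to the extracted or cloned source folder
    (
        # Subshell, so the caller's working directory is left unchanged.
        mkdir -p "$src/build" && cd "$src/build" || exit 1
        cmake .. -DUSE_OPENCL=1 && make -j"$(cpu_jobs)"
    )
}
```

For example, build_caffe_opencl ~/caffe would configure and build in ~/caffe/build.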

Newer OpenCV versions may cause compile errors. If that happens, replace "CV_LOAD_IMAGE_COLOR" and "CV_LOAD_IMAGE_GRAYSCALE" with "cv::IMREAD_COLOR" and "cv::IMREAD_GRAYSCALE" in the code (use VSCode, a JetBrains IDE, etc. to search all source files) and compile again.

Reference: https://github.com/BVLC/caffe/issues/6680
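
If you would rather not do the search-and-replace in an IDE, it can also be scripted. The sketch below assumes GNU grep, GNU sed and GNU xargs (as on Ubuntu) and only touches C++/CUDA source and header files; the function name is made up for illustration. It edits files in place, so run it on a clean checkout.

```shell
# Sketch: replace the old OpenCV constants with their cv:: equivalents in place.
# Assumes GNU grep/sed/xargs; fix_opencv_constants is an illustrative name.
fix_opencv_constants() {
    local root="${1:-.}"    # root of the source tree
    # Find source files that mention either constant (NUL-separated file names),
    # then rewrite both constants in each of those files.
    grep -rlZ --exclude-dir=.git \
        --include='*.cpp' --include='*.hpp' --include='*.h' --include='*.cu' \
        -e 'CV_LOAD_IMAGE_COLOR' -e 'CV_LOAD_IMAGE_GRAYSCALE' "$root" |
        xargs -0 -r sed -i \
            -e 's/\bCV_LOAD_IMAGE_COLOR\b/cv::IMREAD_COLOR/g' \
            -e 's/\bCV_LOAD_IMAGE_GRAYSCALE\b/cv::IMREAD_GRAYSCALE/g'
}
```

From the source folder, call fix_opencv_constants . and then run make again.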

After a successful compilation, I can use -gpu 0 to select the NVIDIA GPU as the OpenCL device, -gpu 1 to select the Intel CPU as an OpenCL device, and -cpu to select the Intel CPU as a plain CPU device when training, etc.

I prefer the Ubuntu 16 and 18 LTS releases and have built successfully on both, but this should work on other systems too.