Closed dohai90 closed 7 years ago
@dohai90 The `lib-clblast-mali-overlay` package was a hack by @intelfx to override the standard CLBlast SGEMM kernel with the one optimised by ARM. This hack sort of worked one year ago when CLBlast was around version 0.7-0.8. A generalisation of it is still work-in-progress. Therefore, I don't think it's worth debugging the issue. Today, if you wanted to try ARM's kernel with Caffe, I'd recommend trying Caffe accelerated with the ARM Compute Library. It's not yet available as a CK-Caffe package, but contributions are most welcome.
Now, to answer your question about optimising CLBlast for Caffe. We are actively working on it, so our CK workflows are rather unstable. We are planning to release them soon along with some intriguing results. I'm not sure it's worth sharing them before that.
From a high-level view, you install `ck-math:lib-clblast-master-universal-tune` first and tune it with CLTune. After that, you install `ck-math:lib-caffe-bvlc-opencl-clblast-universal-tune`, which picks up your tuned CLBlast installation. However, the first step - CLBlast tuning - has quite a few gotchas, so I would not recommend it today.
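The install order described above can be sketched as a small dry-run script. This is only an illustration under assumptions: the `ck install package:<name>` form is borrowed from the command quoted in the original post, the package names are the ones mentioned above (they may no longer exist in the repositories), and the helpers `plan_install` and `render` are hypothetical. The CLTune tuning step between the two installs is not shown, since its invocation is not described in this thread.

```python
import shlex

# Package names as mentioned in this thread; whether they still exist
# in the CK repositories is an assumption.
CLBLAST_TUNE = "lib-clblast-master-universal-tune"
CAFFE_TUNE = "lib-caffe-bvlc-opencl-clblast-universal-tune"


def plan_install(packages):
    """Return the `ck install` argument lists, in order, without running them."""
    return [["ck", "install", f"package:{name}"] for name in packages]


def render(commands):
    """Format the argument lists as shell lines for review before running by hand."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]


# CLBlast comes first so that the Caffe package can pick up the tuned install.
# (Tuning with CLTune would happen between the two commands.)
steps = render(plan_install([CLBLAST_TUNE, CAFFE_TUNE]))
for line in steps:
    print(line)
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without CK installed.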
Finally, what's the difference between `lib-caffe-bvlc-opencl-clblast-universal` and `lib-caffe-bvlc-opencl-libdnn-clblast-universal`? The latter uses libDNN, a tunable open-source alternative to cuDNN, that's part of Greentea (OpenCL Caffe). Typically, libDNN is only used for convolutions. Greentea falls back to a BLAS library to perform matrix operations (e.g. for fully connected layers). In this case, the fallback library is CLBlast. Other packages are available for different fallback options (CUDA, clBLAS, ViennaCL). Does this answer your question?
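To make the split between the two packages concrete, here is a toy model of the dispatch described above. It is not Greentea's code: the function `pick_backend` is hypothetical, and the assumption that convolutions also go to the BLAS library when libDNN is not built in is mine.

```python
# Toy model: convolutions go to libDNN when it is available,
# everything else that needs matrix maths goes to the BLAS fallback.
# `pick_backend` is an illustrative helper, not part of Caffe or Greentea.

def pick_backend(layer_type, use_libdnn, blas="CLBlast"):
    """Return the name of the library that would handle this layer type."""
    if use_libdnn and layer_type == "Convolution":
        return "libDNN"
    return blas


layers = ["Convolution", "InnerProduct", "Convolution"]

# Roughly lib-caffe-bvlc-opencl-libdnn-clblast-universal
with_libdnn = [pick_backend(t, use_libdnn=True) for t in layers]
# Roughly lib-caffe-bvlc-opencl-clblast-universal
without_libdnn = [pick_backend(t, use_libdnn=False) for t in layers]

print(with_libdnn)
print(without_libdnn)
```

Swapping the `blas` argument for another name (e.g. `"ViennaCL"`) mirrors choosing one of the other fallback packages.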
> However, the first step - CLBlast tuning - has quite a few gotchas, so I would not recommend it today.
Having said that, the new threshold for ARM GPUs introduced by @mcian in PR#175 already gives quite good performance on Mali.
Thank you for your help.
But how can I use lib-clblast-mali-overlay on Odroid? I've run into the cmake issue above. Could you guide me through the steps to install ck-caffe with lib-clblast-mali-overlay? :)
@dohai90 Sorry, as I said, I don't think it's a good idea to even try `lib-clblast-mali-overlay` now, as CLBlast has considerably changed, while the overlay hasn't. Anyway, it was incomplete (only worked for forward propagation on AlexNet and SqueezeNet 1.0, IIRC) and didn't actually provide any great speedups. I've removed this deprecated package from `repo:ck-math`.
Dear @psyhtest and @gfursin, thank you for your great work. I have a problem installing ck-caffe on an Odroid XU4 with this command: `$ ck install package:lib-caffe-bvlc-opencl-clblast-universal --env.DISABLE_DEVICE_HOST_UNIFIED_MEMORY=ON --env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=2 --env.CAFFE_BUILD_PYTHON=ON`
and I want to use optimized CLBlast for the Mali GPU, so I chose as below: and then: finally, cmake reports the following error:
`Cloning package from https://github.com/BVLC/caffe/ ...
Cloning into 'src'...
remote: Counting objects: 49142, done.
remote: Compressing objects: 100% (10/10), done.
Receiving objects: 100% (49142/49142), 60.29 MiB | 152.00 KiB/s, done.
remote: Total 49142 (delta 3), reused 0 (delta 0), pack-reused 49132
Resolving deltas: 100% (33114/33114), done.
Checking connectivity... done.
Checking out branch opencl ...
Branch opencl set up to track remote branch opencl from origin.
Switched to a new branch 'opencl'
Cleaning ...
Executing extra script ...
Preparing vars for Caffe ...
You are compiling Caffe with Python support!
To use it you need to set up CK env as following (after installation):
ck xset env tags=lib,caffe ; . ./tmp-ck-env.bat ; ipython2
Press enter to continue
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/g++
-- Check for working CXX compiler: /usr/bin/g++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Warning at /usr/share/cmake-3.5/Modules/FindBoost.cmake:725 (message):
  Imported targets not available for Boost version 106400
Call Stack (most recent call first):
  /usr/share/cmake-3.5/Modules/FindBoost.cmake:763 (_Boost_COMPONENT_DEPENDENCIES)
  /usr/share/cmake-3.5/Modules/FindBoost.cmake:1332 (_Boost_MISSING_DEPENDENCIES)
  cmake/Dependencies.cmake:8 (find_package)
  CMakeLists.txt:127 (include)
CMake Warning at /usr/share/cmake-3.5/Modules/FindBoost.cmake:725 (message):
  Imported targets not available for Boost version 106400
Call Stack (most recent call first):
  /usr/share/cmake-3.5/Modules/FindBoost.cmake:763 (_Boost_COMPONENT_DEPENDENCIES)
  /usr/share/cmake-3.5/Modules/FindBoost.cmake:1332 (_Boost_MISSING_DEPENDENCIES)
  cmake/Dependencies.cmake:8 (find_package)
  CMakeLists.txt:127 (include)
CMake Warning at /usr/share/cmake-3.5/Modules/FindBoost.cmake:725 (message):
  Imported targets not available for Boost version 106400
Call Stack (most recent call first):
  /usr/share/cmake-3.5/Modules/FindBoost.cmake:763 (_Boost_COMPONENT_DEPENDENCIES)
  /usr/share/cmake-3.5/Modules/FindBoost.cmake:1332 (_Boost_MISSING_DEPENDENCIES)
  cmake/Dependencies.cmake:8 (find_package)
  CMakeLists.txt:127 (include)
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Boost version: 1.64.0
-- Found the following Boost libraries:
--   system
--   thread
--   filesystem
-- Found GFlags: /usr/include
-- Found gflags (include: /usr/include, library: /usr/lib/arm-linux-gnueabihf/libgflags.so)
-- Found Glog: /usr/include
-- Found glog (include: /usr/include, library: /usr/lib/arm-linux-gnueabihf/libglog.so)
-- Found PROTOBUF: /usr/lib/arm-linux-gnueabihf/libprotobuf.so
-- Found PROTOBUF Compiler: /usr/bin/protoc
-- Found HDF5: /usr/lib/arm-linux-gnueabihf/hdf5/serial/libhdf5.so;/usr/lib/arm-linux-gnueabihf/hdf5/serial/libhdf5_hl.so;/usr/lib/arm-linux-gnueabihf/hdf5/serial/lib/libhdf5_hl.so;/usr/lib/arm-linux-gnueabihf/hdf5/serial/lib/libhdf5.so;/usr/lib/arm-linux-gnueabihf/libpthread.so;/usr/lib/arm-linux-gnueabihf/libsz.so;/usr/lib/arm-linux-gnueabihf/libz.so;/usr/lib/arm-linux-gnueabihf/libdl.so;/usr/lib/arm-linux-gnueabihf/libm.so (found version "1.8.16")
-- Found LMDB: /usr/include
-- Found lmdb (include: /usr/include, library: /usr/lib/arm-linux-gnueabihf/liblmdb.so)
-- Found LevelDB: /usr/include
-- Found LevelDB (include: /usr/include, library: /usr/lib/arm-linux-gnueabihf/libleveldb.so)
-- Found Snappy: /usr/include
-- Found Snappy (include: /usr/include, library: /usr/lib/arm-linux-gnueabihf/libsnappy.so)
--
-- CUDA is disabled. Building without it...
-- Found ViennaCL include: /home/odroid/CK-TOOLS/lib-viennacl-1.7.1-linux-32/src
-- Found OpenCL: /usr/lib/arm-linux-gnueabihf/mali-egl/libOpenCL.so
-- Found OpenCL include: /usr/include
CMake Error at cmake/Dependencies.cmake:192 (find_package):
  By not providing "FindCLBlast.cmake" in CMAKE_MODULE_PATH this project has asked CMake to find a package configuration file provided by "CLBlast", but CMake did not find one.
  Could not find a package configuration file provided by "CLBlast" with any of the following names:
  Add the installation prefix of "CLBlast" to CMAKE_PREFIX_PATH or set "CLBlast_DIR" to a directory containing one of the above files. If "CLBlast" provides a separate development package or SDK, be sure it has been installed.
Call Stack (most recent call first):
  CMakeLists.txt:127 (include)
-- Configuring incomplete, errors occurred!
See also "/home/odroid/CK-TOOLS/lib-caffe-bvlc-opencl-clblast-master-gcc-5.4.0-linux-32/obj/CMakeFiles/CMakeOutput.log".
See also "/home/odroid/CK-TOOLS/lib-caffe-bvlc-opencl-clblast-master-gcc-5.4.0-linux-32/obj/CMakeFiles/CMakeError.log".
Error: cmake failed!
CK error: [package] package installation failed!`
Could you please explain how I can install the Caffe framework with optimized CLBlast on a Mali GPU? I also have some questions:
1--What are the differences between the packages?
2--Do the packages with the "tune" suffix need to be tuned, or have they already been tuned?
Thank you for your help :) Best regards
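For anyone hitting the same CMake error: the message asks for `CLBlast_DIR` to point at a directory containing a CLBlast package configuration file. A minimal sketch for locating such a directory is shown below. It assumes the standard CMake config-mode file names for a package called "CLBlast" (`CLBlastConfig.cmake` or `clblast-config.cmake`), and the helpers `find_clblast_config_dir` and `cmake_hint` are hypothetical. How CK passes such a variable to the Caffe build is not covered in this thread, so treat this as a diagnostic aid rather than a fix.

```python
import os
import tempfile

# File names CMake searches for in config mode for a package named "CLBlast".
CONFIG_NAMES = ("CLBlastConfig.cmake", "clblast-config.cmake")


def find_clblast_config_dir(prefix):
    """Walk `prefix` and return the first directory holding a CLBlast config file."""
    for root, _dirs, files in os.walk(prefix):
        if any(name in files for name in CONFIG_NAMES):
            return root
    return None


def cmake_hint(prefix):
    """Build the -DCLBlast_DIR=... argument, or None if nothing was found."""
    config_dir = find_clblast_config_dir(prefix)
    return None if config_dir is None else f"-DCLBlast_DIR={config_dir}"


# Demo on a throwaway directory tree standing in for a CLBlast install prefix.
with tempfile.TemporaryDirectory() as prefix:
    cfg_dir = os.path.join(prefix, "lib", "cmake", "CLBlast")
    os.makedirs(cfg_dir)
    open(os.path.join(cfg_dir, "CLBlastConfig.cmake"), "w").close()
    hint = cmake_hint(prefix)
    empty_hint = cmake_hint(cfg_dir + "-missing")
    print(hint)
```

If the search returns nothing for the real CLBlast install directory, that would suggest the CLBlast package itself did not install its CMake config files, which matches the point above that the overlay and CLBlast had drifted apart.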