TWOEARS / blackboard-system

Two!Ears Auditory Model - Blackboard system module
http://docs.twoears.eu/en/latest/blackboard/
GNU General Public License v2.0

DNNLocation using Caffe for GPU acceleration #31

kashefy closed 7 years ago

kashefy commented 7 years ago

This performs inference for the current DNNLocation models through the Caffe library instead of the current MATLAB implementation, to take advantage of Caffe's GPU acceleration.

This also moves the CaffeModel class into the blackboard repository. This class will also be used as the backend for the IdentityLocationKS. The Caffe backend can be used in CPU mode if no GPU is available.
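As an illustration of the CPU fallback, here is a minimal sketch of switching between GPU and CPU mode. It is not the CaffeModel class from this PR (which lives in the MATLAB code base); it assumes Caffe's Python interface (pycaffe) and its `set_device`, `set_mode_gpu` and `set_mode_cpu` functions. The helper name `select_caffe_mode` and its parameters are hypothetical.

```python
def select_caffe_mode(caffe_module, use_gpu, device_id=0):
    """Put Caffe in GPU mode when requested, otherwise in CPU mode.

    caffe_module is assumed to be the imported pycaffe module, or any
    object exposing set_device, set_mode_gpu and set_mode_cpu.
    Returns the name of the selected mode.
    """
    if use_gpu:
        caffe_module.set_device(device_id)  # choose which GPU to run on
        caffe_module.set_mode_gpu()
        return "gpu"
    caffe_module.set_mode_cpu()  # fallback when no GPU is available
    return "cpu"
```

With pycaffe installed, this could be called as `import caffe` followed by `select_caffe_mode(caffe, use_gpu=False)`. Passing the module in as an argument keeps the helper easy to exercise without a Caffe build.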

The numerical difference between the current MATLAB inference and inference via Caffe is < 1e-7. The speedup is noticeable, but I'm still profiling the code to quantify it.
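A check of this kind can be sketched as follows. This is only an illustration under assumptions: both backends are taken to produce flat sequences of output values for the same input, and the helper names `max_abs_diff` and `outputs_agree` are hypothetical, not part of the repository.

```python
def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two outputs."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_agree(reference, candidate, tol=1e-7):
    """True when the two outputs differ by less than tol everywhere."""
    return max_abs_diff(reference, candidate) < tol
```

Here `reference` would hold the MATLAB outputs and `candidate` the Caffe outputs, with `tol` set to the 1e-7 bound quoted above.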

Only the default models have been converted so far.

kashefy commented 7 years ago

Inference on GPU is approx 1.5 times faster.

kashefy commented 7 years ago

Sorry, it's twice as fast.
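For anyone wanting to reproduce such a comparison, a speedup figure can be obtained along these lines. This is a rough sketch, not the profiling setup used here: the function names are hypothetical, and the two callables stand in for one inference run on each backend.

```python
import statistics
import time


def median_runtime(fn, repeats=5):
    """Median wall-clock time in seconds of calling fn() repeats times."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds
```

A value of 2.0 from `speedup(median_runtime(run_matlab_style), median_runtime(run_caffe_gpu))` would correspond to "twice as fast"; both callables are placeholders. Using the median rather than the mean reduces the influence of one-off delays such as GPU initialisation.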

ivo--t commented 7 years ago

Very nice. Have you by chance also compared the speed of the two variants without a GPU?

kashefy commented 7 years ago

Comparing Caffe in CPU mode, I found it only slightly faster than the MATLAB implementation: no more than a 10% speedup. In CPU mode, Caffe is configured to perform its matrix multiplication using OpenBLAS. I'd like to think that BLAS is about as well optimized as MATLAB's low-level implementation of matrix multiplication.

ivo--t commented 7 years ago

Sounds good.

ningma97 commented 7 years ago

Excellent! Is the Caffe binary included anywhere for people who haven't got it?

What would be really interesting is to use Caffe for training the DNNs. I wanted to do this, or to use other toolkits, but haven't got round to it.

kashefy commented 7 years ago

Unfortunately not. Since everything is linked dynamically, you need to have the library and everything it links to locally on your machine. I haven't gotten around to producing a static standalone build, and that would also require building for as many CUDA architectures as possible to work with a broader range of GPUs. For now, the Caffe library has to be built from source. The build instructions are available on the Caffe website. I've also put together instructions on building Caffe and its dependencies from source here. This is useful if you don't have sudo privileges. I also recommend using a more recent version of cmake (ver. >= 3.5.2) than the version available through apt-get.
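To check the cmake recommendation before starting a build, something like the following could be used. It is a small sketch that assumes the usual first line of `cmake --version` output ("cmake version X.Y.Z"); the function names and the `MIN_CMAKE` constant are hypothetical.

```python
import re

MIN_CMAKE = (3, 5, 2)  # minimum version recommended above


def parse_cmake_version(version_output):
    """Extract (major, minor, patch) from the text of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("could not find a cmake version in the given text")
    return tuple(int(part) for part in match.groups())


def cmake_is_recent_enough(version_output, minimum=MIN_CMAKE):
    """True when the reported cmake version is at least `minimum`."""
    return parse_cmake_version(version_output) >= minimum
```

The text to pass in can be captured with `subprocess.run(["cmake", "--version"], capture_output=True, text=True).stdout`. Comparing integer tuples avoids the string-comparison pitfall where "3.10" would sort before "3.5".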