naibaf7 / libdnn

Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL

Fixing installation on Windows with latest CMake ... #20

Closed gfursin closed 7 years ago

gfursin commented 7 years ago

Hi Fabian,

When installing this lib on Windows with CMake 3.5.2 and -DBUILD_SHARED_LIBS=ON, I encountered the following error:

  CMake Error at src/CMakeLists.txt:36 (install):
    install Library TARGETS given no DESTINATION!

After Googling the problem, I found that the solution was to add the following line to CMakeLists.txt:

install(TARGETS ${PROJECT_LIBRARY_TARGET_NAME} EXPORT ${CMAKE_TARGETS_NAME}

Is this the correct solution? If yes, maybe it's better to merge it into master? Thanks a lot!
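
For readers hitting the same error, here is a minimal sketch of an install(TARGETS) call with explicit destinations. The two variable names are taken from the line quoted above; the bin and lib destinations are assumptions that follow a common layout, not necessarily what the actual fix uses. CMake versions before 3.14 have no default install destinations for library targets, so each artifact kind being installed needs its own DESTINATION; on Windows, a shared library's DLL counts as RUNTIME and its import library as ARCHIVE.

```cmake
# Sketch only: the destinations below are assumed, not taken from the thread.
install(TARGETS ${PROJECT_LIBRARY_TARGET_NAME}
        EXPORT ${CMAKE_TARGETS_NAME}
        RUNTIME DESTINATION bin   # .dll files on Windows, plus executables
        LIBRARY DESTINATION lib   # shared libraries on Unix-like systems
        ARCHIVE DESTINATION lib)  # static libraries and Windows import libraries
```

Listing all three kinds keeps the same call valid whether BUILD_SHARED_LIBS is ON or OFF.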

naibaf7 commented 7 years ago

If it works, it's good enough for now since I'm not an expert in either Windows or CMake :) Thanks :)

naibaf7 commented 7 years ago

@gfursin But oh, you do realize if you want to run Caffe with LibDNN you don't need the standalone LibDNN distribution? A more recent LibDNN implementation is already built-in with Caffe.

gfursin commented 7 years ago

Thanks! Yes, I do know that Caffe has the latest LibDNN embedded (I actually just noticed that today when I managed to compile and run OpenCL-based Caffe for Windows ;) ). But I am also adding standalone implementations of different libraries to our Collective Knowledge Framework in preparation for our crowd-benchmarking effort: https://github.com/ctuning/ck-math/tree/master/package

naibaf7 commented 7 years ago

Ah ok great. I will actually upgrade this version of LibDNN to the latest kernels used in Caffe, just didn't get to it yet (exams...). OT: Is OpenCL Caffe working well on Windows for you (I hope so)?

gfursin commented 7 years ago

Sure, take your time - that's not critical at the moment! And good luck with exams!

As for OpenCL Caffe - the good thing is that I managed to compile it with all dependencies directly via CK (it automatically installs all deps on Windows now), and I also managed to run caffe time and classification on my very old Intel HD Graphics 4400 GPU with OpenCL 1.2. The bad thing is that it's currently about 2 times slower than running on the 2-core CPU of the same machine (Laptop.Lenovo.ThinkPad.X240) - an Intel Core i5-4210U.

I use device 0:

  I0118 23:48:46.655195 12008 common.cpp:379] Total devices: 3
  I0118 23:48:46.655195 12008 common.cpp:380] CUDA devices: 0
  I0118 23:48:46.655195 12008 common.cpp:381] OpenCL devices: 3
  I0118 23:48:46.655195 12008 common.cpp:405] Device id:                     0
  I0118 23:48:46.655195 12008 common.cpp:407] Device backend:                OpenCL
  I0118 23:48:46.655195 12008 common.cpp:409] Backend details:               Intel(R) Corporation: OpenCL 1.2
  I0118 23:48:46.655195 12008 common.cpp:411] Device vendor:                 Intel(R) Corporation
  I0118 23:48:46.655195 12008 common.cpp:413] Name:                          Intel(R) HD Graphics 4400
  I0118 23:48:46.655195 12008 common.cpp:415] Total global memory:           1708759450
  I0118 23:48:46.655195 12008 common.cpp:405] Device id:                     1
  I0118 23:48:46.655195 12008 common.cpp:407] Device backend:                OpenCL
  I0118 23:48:46.655195 12008 common.cpp:409] Backend details:               Intel(R) Corporation: OpenCL 1.2
  I0118 23:48:46.655195 12008 common.cpp:411] Device vendor:                 Intel(R) Corporation
  I0118 23:48:46.655195 12008 common.cpp:413] Name:                          Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz
  I0118 23:48:46.655195 12008 common.cpp:415] Total global memory:           8262414336
  I0118 23:48:46.655195 12008 common.cpp:405] Device id:                     2
  I0118 23:48:46.655195 12008 common.cpp:407] Device backend:                OpenCL
  I0118 23:48:46.655195 12008 common.cpp:409] Backend details:               Intel(R) Corporation: OpenCL 2.0
  I0118 23:48:46.655195 12008 common.cpp:411] Device vendor:                 Intel(R) Corporation
  I0118 23:48:46.655195 12008 common.cpp:413] Name:                          Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz
  I0118 23:48:46.655195 12008 common.cpp:415] Total global memory:           8262414336

Also, when using devices 1 and 2, caffe crashes after about 10 secs ...

I am now checking whether I did something wrong myself during setup, and may send you a report on the OpenCL branch - I saw a few discussions on OpenCL Caffe for Windows there ...

However, in spite of these issues, it's really great progress - thanks a lot!

naibaf7 commented 7 years ago

@gfursin Oh ok, yes, Intel chips. I'm currently investigating with Intel how we can get their kernels running on Windows as well (INTEL_SPATIAL), and how to integrate their ideas/kernels into LibDNN. They're quite a bit faster right now than LibDNN, which has been mostly programmed with AMD and NVIDIA GPUs in mind; more architectures will get options in the tuning parameter space as it evolves. The Intel spatial kernels perform best on Skylake and Kaby Lake.

gfursin commented 7 years ago

Sure, that's correct (and exploring the parameter space of various libs across distinct hardware while preserving such statistics is my main long-term interest). But I am still surprised that it is so much slower than a 2-core mobile CPU. I had trouble turning on INTEL_SPATIAL on Windows (the same issue as in https://github.com/naibaf7/libdnn/pull/20), so I will be interested to see how it affects performance in the future. But again, please take your time - exams are more important!

gfursin commented 7 years ago

Actually, I just saw your message in the above thread - let's move the discussion there! Once again, thanks a lot for helping with that!!!