naibaf7 / libdnn

Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL

add viennacl (minus doc, examples dirs) #6

Closed: hughperkins closed this 8 years ago

hughperkins commented 8 years ago

Addresses https://github.com/naibaf7/libdnn/issues/5

naibaf7 commented 8 years ago

@hughperkins Do you think @karlrupp is OK with adding the source as-is? Wouldn't it be smarter to use a git submodule or similar?

hughperkins commented 8 years ago

Well, I initially was going to use a git submodule (it's what I'm using for your own code), but there are quite a lot of extra files in the viennacl repo that aren't necessary for production usage.

edgarriba commented 8 years ago

Another solution could be to use CMake's ExternalProject_Add (with custom steps) to install the headers and then remove the checkout.
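
A sketch of that idea, assuming CMake's ExternalProject module; the repository URL and tag here are illustrative pins, not a tested configuration:

```cmake
include(ExternalProject)

# Illustrative only: fetch ViennaCL, skip configure/build, and copy just the
# headers into the build tree, so docs/examples never enter this repo.
ExternalProject_Add(viennacl_headers
  GIT_REPOSITORY    https://github.com/viennacl/viennacl-dev.git
  GIT_TAG           release-1.7.1   # hypothetical pin
  CONFIGURE_COMMAND ""
  BUILD_COMMAND     ""
  INSTALL_COMMAND   ${CMAKE_COMMAND} -E copy_directory
                    <SOURCE_DIR>/viennacl
                    ${CMAKE_BINARY_DIR}/include/viennacl
)

# libdnn targets then pick the headers up from the build tree.
include_directories(${CMAKE_BINARY_DIR}/include)
```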

bhack commented 8 years ago

What is the plan? How do we want to support other ops/kernels? If we want to increase op/kernel coverage: the Caffe OpenCL branch currently has three BLAS implementations (clBLAS, CLBlast and ViennaCL). What do we want to do here? /cc @CNugteren

hughperkins commented 8 years ago

@bhack well, short-term 'batteries-included' seems not a bad plan. Longer-term, maybe make the various implementations pluggable, discoverable at runtime?

hughperkins commented 8 years ago

Oh wait, are you saying ViennaCL is optional, e.g. one could choose CLBlast instead?

naibaf7 commented 8 years ago

@hughperkins No. Caffe can use CLBlast, ViennaCL, clBLAS and ISAAC as BLAS backends for auxiliary functions and im2col/col2im convolutions.

But libdnn itself needs ViennaCL at the moment for context handling and kernel launching, not as a BLAS (libdnn does not need a BLAS).
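
Purely for illustration, a minimal sketch of that pattern, assuming ViennaCL's OpenCL backend; the kernel source, names and work sizes are made up, not libdnn's generated code:

```cpp
#include <string>
#include "viennacl/ocl/backend.hpp"
#include "viennacl/ocl/context.hpp"
#include "viennacl/ocl/kernel.hpp"
#include "viennacl/ocl/enqueue.hpp"

// Hypothetical kernel string; libdnn generates such sources at runtime.
static const std::string kSource =
    "__kernel void scale(__global float* x, float a, uint n) {\n"
    "  uint i = get_global_id(0);\n"
    "  if (i < n) x[i] *= a;\n"
    "}\n";

void launch_scale(viennacl::ocl::handle<cl_mem> const &x, float a, cl_uint n) {
  // Context handling: ViennaCL owns the OpenCL context and queue for device 0.
  viennacl::ocl::context &ctx = viennacl::ocl::get_context(0);

  // Compile once; ViennaCL caches the program under the given name.
  static viennacl::ocl::program &prog = ctx.add_program(kSource, "scale_prog");
  viennacl::ocl::kernel &k = prog.get_kernel("scale");

  // Kernel launching: set work sizes, bind arguments, enqueue on ctx's queue.
  k.local_work_size(0, 64);
  k.global_work_size(0, ((n + 63) / 64) * 64);
  viennacl::ocl::enqueue(k(x, a, n), ctx.get_queue());
}
```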

hughperkins commented 8 years ago

> But libdnn needs ViennaCL at the moment (context handling, kernel launching)

Ah. That's a lot of code just for launching kernels. But it sounds like you're migrating to some other system soonish?

naibaf7 commented 8 years ago

@hughperkins It is planned to make it possible to use libdnn with pure OpenCL/CUDA, without ViennaCL, but that's not the top priority right now.
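
For a sense of what that would entail, a sketch of the bootstrap boilerplate a pure-OpenCL path would take over from ViennaCL (error checking and release calls omitted; all names illustrative):

```cpp
#include <CL/cl.h>

// Minimal platform/device/context/queue/program setup that ViennaCL
// currently hides from libdnn. Error checking omitted for brevity.
cl_kernel build_kernel(const char *src, const char *name,
                       cl_context *ctx_out, cl_command_queue *queue_out) {
  cl_platform_id platform;
  clGetPlatformIDs(1, &platform, nullptr);
  cl_device_id device;
  clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, nullptr);
  *ctx_out = clCreateContext(nullptr, 1, &device, nullptr, nullptr, nullptr);
  *queue_out = clCreateCommandQueue(*ctx_out, device, 0, nullptr);
  cl_program prog =
      clCreateProgramWithSource(*ctx_out, 1, &src, nullptr, nullptr);
  clBuildProgram(prog, 1, &device, "", nullptr, nullptr);
  return clCreateKernel(prog, name, nullptr);
}
```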

bhack commented 8 years ago

But I don't think we can increase kernel/op coverage without a BLAS implementation (if we want to port kernels/ops here from the Caffe OpenCL branch). Do we want to take the same approach as the OpenCL Caffe branch here?

hughperkins commented 8 years ago

@bhack Hmmm, that's true. I've been so focused on convolution, I forgot that basic Linear layers, I imagine, simply call into a standard BLAS level-3 matrix multiplication (GEMM).
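
That mapping is indeed one GEMM per forward pass. A hedged sketch of it (names are illustrative, not libdnn API), shown against a CPU CBLAS just to make the call shape concrete:

```cpp
#include <cblas.h>

// Fully-connected (Linear) forward as a single BLAS-3 call:
//   Y[M x N] = X[M x K] * W^T, with M = batch size, K = input features,
//   N = output features, and W stored row-major as [N x K].
// Bias addition omitted.
void linear_forward(const float *X, const float *W, float *Y,
                    int M, int N, int K) {
  cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
              M, N, K,
              1.0f, X, K,   // A = X, lda = K
              W, K,         // B = W (transposed), ldb = K
              0.0f, Y, N);  // C = Y, ldc = N
}
```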

bhack commented 8 years ago

I think @naibaf7's original choice of ViennaCL was made to get device, memory and context handling. These have been reinvented by many libraries and headers, but he picked a dependency that could also provide BLAS ops. Then the OpenCL Caffe branch started to support multiple BLAS backends, so device, memory and context handling are no longer so tied to the BLAS. What is the strategy here? Use ViennaCL only as one of the BLAS alternatives, and use something more lightweight for bootstrap and management?

@naibaf7 Excuse me if this reconstruction is incorrect or imprecise.

naibaf7 commented 8 years ago

@bhack This is a very precise description. Moving in another direction is just getting a bit delayed because more people are involved with libdnn & Caffe now, such as Intel. So I am collecting more opinions before making drastic changes.

bhack commented 8 years ago

That's understandable. We need to plan something for GSoC. So while you are collecting feedback, could Edgar propose some API extensions, mainly to use ViennaCL's memory management here? Next we want to cover the ops for a simple MNIST network, and then some more complex network architectures.

naibaf7 commented 8 years ago

@bhack Sure, go ahead.

bhack commented 8 years ago

Ok. For the general discussion and mid-term strategy, I think an extended version of CLCudaAPI could be evaluated. Also, libocca has a nice API design for bootstrap and management, and covers more backends (including HSA).
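
For context, CLCudaAPI's bootstrap looks roughly like the sketch below, paraphrased from its sample code of the time (exact signatures may have drifted); the same few lines target either OpenCL or CUDA depending on which header is included:

```cpp
#include <string>
#include <vector>
#include "clpp11.h"  // CLCudaAPI's OpenCL header; cupp11.h targets CUDA

// Rough paraphrase of CLCudaAPI sample code: a handful of lines bootstraps
// either backend, which is the appeal for libdnn-style management.
void bootstrap(const std::string &source) {
  auto platform = CLCudaAPI::Platform(size_t{0});
  auto device = CLCudaAPI::Device(platform, size_t{0});
  auto context = CLCudaAPI::Context(device);
  auto queue = CLCudaAPI::Queue(context, device);
  auto event = CLCudaAPI::Event();

  auto program = CLCudaAPI::Program(context, source);
  auto options = std::vector<std::string>{};  // compiler options
  program.Build(device, options);

  auto kernel = CLCudaAPI::Kernel(program, "my_kernel");  // hypothetical name
  // (SetArguments(...) calls omitted for this sketch.)
  kernel.Launch(queue, std::vector<size_t>{1024},  // global size
                std::vector<size_t>{64},           // local size
                event.pointer());
  queue.Finish(event);
}
```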