Porting Convolutions and im2col

ViliusT commented 7 years ago

Hi hughperkins,

Using my limited knowledge, I've been trying to port some functions from to the openCL kernel using the guidelines suggested;

I noticed that kernel for SpatialConvolutionMM is called from im2col.cpp - https://github.com/hughperkins/clnn/blob/a7453eb724cb2fbd8845da239f93ab0657f49dc2/lib/THCLNN/im2col.cpp#L26 / https://github.com/hughperkins/clnn/blob/a7453eb724cb2fbd8845da239f93ab0657f49dc2/lib/THCLNN/im2col.cpp#L42

Spatial full convolution is very similar to SpatialConvolution, and seemed like it would make sense to follow what is already coded for SpatialConvolutionMM. Since there is a kernel call already in im2col, and no separate cl file gets generated for SpatialFullConvolution it from the port script - how should one call SpatialFullConvolution from the im2col, without breaking SpatialFullConvolution?

...or am I looking at it the wrong way? You can see my first shot at it here: https://github.com/ViliusT/clnn/compare/master...ViliusT:feature/spatialFullConv

hughperkins commented 7 years ago

So, not sure if this will answer your question, but quick background on what is im2col. Actually, some guy wrote a blog post about it, and can explain it better than me, let me dig that out. This one: https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/

So, what happens in convolution is:

incoming multi-dimensional tensors get 'unrolled' into big fluffy 2d matrices. im2col does this. im means the incoming tensor, and col means the unrolled 2d matrices
then a standard GEMM is run on these matrices, to produce a new matrix
the new matrix is re-rolled up to produce the output multi-dimensional tensor, this time using col2im, ie the other way around

I havent (yet) looked into what SpatialFullConvolution is doing, but presumably it does the exact same thing, just the geometry is slightly different?

hughperkins commented 7 years ago

(so, to perhaps answer your question: im2col is not so much the 'kernel for spatialconvolutoin', so much as a library routine, that is used by spatialconvolution, to unroll the matrices. then spatialconvolution calls GEMM. So, the relationship between the files/libraries looks something like:

spatialconvolution

ViliusT commented 7 years ago

Hi, thanks for the informative reply!

SpatialFullConvolution is also known as "Deconvolution", "Fractionaly strided convolution" and a host of other names;

In short, a forward pass on a Convolution is close to a backwards pass on FullConvolution/Deconvolution and vice-versa - you could see it at https://github.com/torch/cunn/blob/master/lib/THCUNN/SpatialFullConvolution.cu and https://github.com/torch/cunn/blob/master/lib/THCUNN/SpatialConvolutionMM.cu

It seemed like nearly a perfect candidate for porting attempt on my side, after few simpler implementations I did on LUA side with custom kernel calls.

Hmm..

So looking at the code - https://github.com/hughperkins/clnn/blob/a7453eb724cb2fbd8845da239f93ab0657f49dc2/lib/THCLNN/SpatialConvolutionMM.cpp, it seems that I got the idea correctly, but the issue is somewhere else - I assumed that this is due to the fact that im2col.ccp has this line std::string uniqueName = "SpatialConvolutionMM::im2col";

I would somehow should add std::string uniqueName = "SpatialFullConvolution::im2col"; as well for it to be callable from SpatialFullConvolution - is that incorrect?

hughperkins commented 7 years ago

What makes you feel you need to modify im2col?

On September 7, 2016 1:44:27 PM GMT+01:00, ViliusT notifications@github.com wrote:

Hi, thanks for the informative reply!

SpatialFullConvolution is also known as "Deconvolution", "Fractionaly strided convolution" and a host of other names;

In short, a forward pass on a Convolution is close to a backwards pass on FullConvolution/Deconvolution and vice-versa - you could see it at https://github.com/torch/cunn/blob/master/lib/THCUNN/SpatialFullConvolution.cu and https://github.com/torch/cunn/blob/master/lib/THCUNN/SpatialConvolutionMM.cu

It seemed like nearly a perfect candidate for porting attempt on my side, after few simpler implementations I did on LUA side with custom kernel calls.

Hmm..

So looking at the code - https://github.com/hughperkins/clnn/blob/a7453eb724cb2fbd8845da239f93ab0657f49dc2/lib/THCLNN/SpatialConvolutionMM.cpp, it seems that I got the idea correctly, but the issue is somewhere else

I assumed that this is due to the fact that im2col.ccp has this line std::string uniqueName = "SpatialConvolutionMM::im2col";

I would somehow should add std::string uniqueName = "SpatialFullConvolution::im2col"; as well for it to be callable from SpatialFullConvolution - is that incorrect?

You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/hughperkins/clnn/issues/42#issuecomment-245268741

Sent from my Android device with K-9 Mail. Please excuse my brevity.

ViliusT commented 7 years ago

Evidently, lack of experience :)

https://github.com/hughperkins/clnn/blob/a7453eb724cb2fbd8845da239f93ab0657f49dc2/THCLNN.lua#L164 seems to bind Cl function equivalents to standard THNN functions;

I assume, when the correct .cpp is added to CMakeLists.txt , it should automatically be added to THCLNN.h / THCLNN.lua when building - this is where it fails for me. I (wrongly?) assumed that this is due to im2col.cpp, since when comparing to SpatialConvolutionMM implementation, this was the only difference I could spot - there was a reference in im2col to SpatialConvolutionMM in

    TemplatedKernel kernelBuilder(cl);
    kernel = kernelBuilder.buildKernel(uniqueName, "SpatialConvolutionMM.cl",
      getKernelTemplate(), "col2im_kernel");

(even though you mentioned in the guideliness that this filename is not improtant) From lack of deeper understanding, I came to a conclusion that this caused the build to fail to include the calls in THCLNN.h; I'll have another look at the code and see if missed something.

hughperkins commented 7 years ago

there was a reference in im2col to SpatialConvolutionMM

Ah, good point. Updated im2col in d1701e6 to no longer contain references to spatialconvolution :-)

hughperkins commented 7 years ago

Your branch builds ok for me:

$ git checkout -b vilius-spatialfullconv vilius/feature/spatialFullConv 
Branch vilius-spatialfullconv set up to track remote branch feature/spatialFullConv from vilius.
Switched to a new branch 'vilius-spatialfullconv'
$ luarocks make rocks/clnn-scm-1.rockspec 
cmake -E make_directory build && cd build && cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_PREFIX_PATH="/home/ubuntu/torch-cl/install/bin/.." -DCMAKE_INSTALL_PREFIX="/home/username/torch-cl/install/lib/luarocks/rocks/clnn/scm-1" && make -j$(getconf _NPROCESSORS_ONLN) install

-- Found Torch7 in /home/ubuntu/torch-cl/install
-- Configuring done
-- Generating done
-- Build files have been written to: /home/username/torch-cl/opencl/clnn/build
Scanning dependencies of target THCLNN
[  0%] Built target cog_clnn
Scanning dependencies of target clnn
[  6%] Building CXX object CMakeFiles/clnn.dir/init.cpp.o
[ 13%] Building CXX object lib/THCLNN/CMakeFiles/THCLNN.dir/SpatialFullConvolution.cpp.o
[ 20%] Building CXX object lib/THCLNN/CMakeFiles/THCLNN.dir/im2col.cpp.o
[ 26%] Linking CXX shared module libclnn.so
[ 33%] Built target clnn
[ 40%] Linking CXX shared module libTHCLNN.so
[100%] Built target THCLNN
Install the project...

Do you mean, it fails at runtime?

ViliusT commented 7 years ago

Yes; Apologies - I explained it improperly;

Upon calling the SpatialFullConvolution as a module via LUA, say, such testing script:


local input = torch.zeros(3,12,12)
input = input:cl()
local gconv = nn.SpatialFullConvolution(3,1,3,3,2,2):cl()
print (gconv)
local out = gconv:forward(input)
print (out)

It fails with the following:

/home/vilius/torch-cl/install/bin/luajit: ...cl/install/share/lua/5.1/clnn/SpatialFullConvolution.lua:55: attempt to call field 'SpatialFullConvolution_updateOutput' (a nil value)
stack traceback:
    ...cl/install/share/lua/5.1/clnn/SpatialFullConvolution.lua:55: in function 'forward'
    TestQuick.lua:38: in main chunk
    [C]: in function 'dofile'
    ...s/torch-cl/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

The line in question is this call to function:

 inputTensor.THNN.SpatialFullConvolution_updateOutput(
    inputTensor:cdata(),
    self.output:cdata(),
    self.weight:cdata(),
    THNN.optionalTensor(self.bias),
    self.finput:cdata(),
    self.fgradInput:cdata(),
    self.nInputPlane,
    self.nOutputPlane,
    self.kW, self.kH,
    self.dW, self.dH,
    self.padW, self.padH,
    adjW, adjH
  )

I take it, this is because there is no binding in https://github.com/hughperkins/clnn/blob/a7453eb724cb2fbd8845da239f93ab0657f49dc2/THCLNN.lua for the function I wrote - and it is not created when I try to build; It is not there for SpatialFullConvolution after using the python port script either;

In short - I am not sure how to make my function callable from LUA. I assume I need to bind it somehow, as other functions are - but how to achieve that?

ViliusT commented 7 years ago

And if I add them manually (see https://github.com/ViliusT/clnn/commit/82321b7a10e778d90a443ca81145ca98f1170424 ), upon trying to require 'clnn' I get


libthclnn_searchpath    /home/vilius/torch-cl/install/lib/lua/5.1/libTHCLNN.so  
not found: THNN_ClSpatialFullConvolution_updateOutput/home/vilius/torch-cl/install/share/lua/5.1/nn/THNN.lua:804: /home/vilius/torch-cl/install/lib/lua/5.1/libTHCLNN.so: undefined symbol: THNN_ClSpatialFullConvolution_updateOutput  
not found: THNN_ClSpatialFullConvolution_updateGradInput/home/vilius/torch-cl/install/share/lua/5.1/nn/THNN.lua:804: /home/vilius/torch-cl/install/lib/lua/5.1/libTHCLNN.so: undefined symbol: THNN_ClSpatialFullConvolution_updateGradInput    
not found: THNN_ClSpatialFullConvolution_accGradParameters/home/vilius/torch-cl/install/share/lua/5.1/nn/THNN.lua:804: /home/vilius/torch-cl/install/lib/lua/5.1/libTHCLNN.so: undefined symbol: THNN_ClSpatialFullConvolution_accGradParameters
/home/vilius/torch-cl/install/bin/luajit: /home/vilius/torch-cl/install/share/lua/5.1/trepl/init.lua:384: /home/vilius/torch-cl/install/share/lua/5.1/clnn/test.lua:1: attempt to call field 'TestSuite' (a nil value)
stack traceback:
    [C]: in function 'error'
    /home/vilius/torch-cl/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
    justatest.lua:4: in main chunk
    [C]: in function 'dofile'
    ...s/torch-cl/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406670

hughperkins commented 7 years ago

Alright, so working through this ,and remembering at the same time, this is realtime :-P Or at least this is what I did, in no particular sequence, just what seemed an ok-ish guess at the time

# check does the library contain our exported symbol?
$ nm -C ~/torch-cl/install/lib/libTHCl.so | grep SpatialFullConvolution
# and... nothing :-P

Seems like you havent exported the symbol? This means it's probably not in the header file. check the headerfile:

$ grep SpatialFull lib/THCLNN/THCLNN.h 
TH_API void THNN_ClSpatialFullConvolution_updateOutput(
TH_API void THNN_ClSpatialFullConvolution_updateGradInput(
TH_API void THNN_ClSpatialFullConvolution_accGradParameters(

Ok. its in the headerfile. Is it in the CMakeLists.txt? :-P

$ grep SpatialFull lib/THCLNN/CMakeLists.txt  
set(src-cl Abs.cpp SpatialConvolutionMM.cpp SpatialFullConvolution.cpp ELU.cpp SpatialAveragePooling.cpp SpatialMaxPooling.cpp

Yes it is... hmmm... is it in SpatialFullConvolution.cpp?

$ grep ClSpatialFull lib/THCLNN/SpatialFullConvolution.cpp 
void THNN_ClSpatialFullConvolution_updateOutput(
void THNN_ClSpatialFullConvolution_updateGradInput(
void THNN_ClSpatialFullConvolution_accGradParameters(
std::string THNN_ClSpatialFullConvolution_getKernelTemplate() {

Yes, it is. Ummm... Check that other ... oh, I'm looking in the wrong .so :-P I'm looking in cltorch so, not in clnn .so. Check clnn .so:

$ nm -C ~/torch-cl/install/lib/lua/5.1/libTHCLNN.so | grep Convolution
00000000000034a0 t _GLOBAL__sub_I_SpatialConvolutionMM.cpp
00000000000034d0 t _GLOBAL__sub_I_SpatialFullConvolution.cpp
00000000000044b0 T THNN_ClSpatialConvolutionMM_accGradParameters
0000000000003ff0 T THNN_ClSpatialConvolutionMM_updateGradInput
00000000000039e0 T THNN_ClSpatialConvolutionMM_updateOutput
0000000000004a20 T THNN_ClSpatialFullConvolution_updateOutput(THClState*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, int, int, int, int, int, int, int, int, int, int)
0000000000004f00 T THNN_ClSpatialFullConvolution_updateGradInput(THClState*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, int, int, int, int, int, int, int, int, int, int)
00000000000052e0 T THNN_ClSpatialFullConvolution_accGradParameters(THClState*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, THClTensor*, int, int, int, int, int, int, int, int, int, int, float)

ok, there it is, its exported. Is it a C export or a C++ export? If it's C export (what we want), if I remove the -C from nm, ti should still show an unmangled symbol:

00000000000034a0 t _GLOBAL__sub_I_SpatialConvolutionMM.cpp
00000000000034d0 t _GLOBAL__sub_I_SpatialFullConvolution.cpp
00000000000044b0 T THNN_ClSpatialConvolutionMM_accGradParameters
0000000000003ff0 T THNN_ClSpatialConvolutionMM_updateGradInput
00000000000039e0 T THNN_ClSpatialConvolutionMM_updateOutput
0000000000004a20 T _Z42THNN_ClSpatialFullConvolution_updateOutputP9THClStateP10THClTensorS2_S2_S2_S2_S2_iiiiiiiiii
0000000000004f00 T _Z45THNN_ClSpatialFullConvolution_updateGradInputP9THClStateP10THClTensorS2_S2_S2_S2_iiiiiiiiii
00000000000052e0 T _Z47THNN_ClSpatialFullConvolution_accGradParametersP9THClStateP10THClTensorS2_S2_S2_S2_S2_iiiiiiiiiif

Looks mangled. Compare with SpatialConvolutionMM. Means when we compile the .cpp file, we are missing an extern "C" { on our symbols. Does the .cpp file import the appropriate header .h file?

$ grep include lib/THCLNN/SpatialFullConvolution.cpp
lib/THCLNN/SpatialFullConvolution.cpp:#include "utils.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "luaT.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THClTensor.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THClTensorMath.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THClBlas.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THClKernels.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "templates/TemplatedKernel.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "DeviceInfo.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "EasyCL.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "THCLNN.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include "im2col.h"
lib/THCLNN/SpatialFullConvolution.cpp:#include <iostream>
lib/THCLNN/SpatialFullConvolution.cpp:#include <string>

Does not. I reckon if you add an include for SpatialFullConvolution.h, and rebuild, it will magically get mildly further :-)

hughperkins commented 7 years ago

Oh wait, it does include it, it's the THCLNN.h file. Hmmm...

hughperkins commented 7 years ago

let's compare the declaration with the definition. Declaration is:

TH_API void THNN_ClSpatialFullConvolution_updateOutput(
          THClState *state,
          THClTensor *input,
          THClTensor *output,
          THClTensor *weight,
          THClTensor *bias,
          THClTensor *columns,
          THClTensor *ones,
          int kW, int kH,
          int dW, int dH,
          int padW, int padH,
          int adjW, int adjH);

Definition (.cpp file) is:

void THNN_ClSpatialFullConvolution_updateOutput(
  THClState *state,
  THClTensor *input,
  THClTensor *output,
  THClTensor *weight,
  THClTensor *bias,
  THClTensor *columns,
  THClTensor *ones,
  int nInputPlane,
  int nOutputPlane,
  int kW, int kH,
  int dW, int dH,
  int padW, int padH,
  int adjW, int adjH) {

Ah, one has nInputPlane, nOutputPlane, and one does not, and therefore when it builds the definition, it ignores the declaration, which is for a different function essentially.

I think that if you reconcile the declaration with the definition, it might get further.

hughperkins commented 7 years ago

(TH_API contains extern "C" by the way. in ~/torch-cl/pkg/torch/TH/THGeneral.h.in:

https://github.com/hughperkins/torch7/blob/master/lib/TH/THGeneral.h.in#L18-L32

#ifdef __cplusplus
# define TH_EXTERNC extern "C"
#else
# define TH_EXTERNC extern
#endif

#ifdef _WIN32
# ifdef TH_EXPORTS
#  define TH_API TH_EXTERNC __declspec(dllexport)
# else
#  define TH_API TH_EXTERNC __declspec(dllimport)
# endif
#else
# define TH_API TH_EXTERNC
#endif

)

ViliusT commented 7 years ago

Oh wow!

Before we dig further...

This commit - https://github.com/torch/nn/commit/b339dad44739267bd79a0eae0fb158bfcab5991b - it seems it is not in the torch-cl distribution?

Hence, no THNN_(SpatialFullConvolution_<...>) calls in THNN.lua in the nn package, and then, THNN_ClSpatialFullConvolution_<..> does't have anything to bind with, no?

hughperkins commented 7 years ago

This commit - torch/nn@b339dad - it seems it is not in the torch-cl distribution?

Yes, so, good observation. So, this commit is dated 31st january, but it was merged to master on 18th February: https://github.com/torch/nn/pull/601

Now, ... as a maintainer of an OpenCL port of torch, the THNN migration caused me a llloottt of pain:

Since I do have a full time job, I couldnt really keep up with these changes, and clnn basically spent half its time not working. So, as you can see in issue 39, I finally forked https://github.com/torch/distro to https://github.com/hughperkins/distro-cl , and rolled everything back to ~21 Februrary.

So, you might think that 18th Februrary should get in, but its "approximately" 21 february, and I guess it just slightly missed.

On the whole, in general, the path of least resistance is to pretend that any changes to torch from then on didnt happen, or occasionally cherry-pick anything really critical.

To what extent do you need any changes from after distro was forked to distro-cl, and to what extent can you get things working using the ~21 February version of distro?

ViliusT commented 7 years ago

I knew that distro-cl was forked at end of February - since I saw that that commit was from January, didn't double check and just assumed that it was in; My mistake.

I'll see what I can do and get back to you.

And a big thanks for maintaining the project!

hughperkins commented 7 years ago

k, sounds good :-)

Not saying we cant port stuff from upstream by the way. But my experience is that any single upstream commit contains a whole bunch of changes, and conversely a specific change you want might be spread over multiple commits, so bringing stuff from upstream is a bit icky :-P

Nevertheless can be done, with a bit of effort.

hughperkins / clnn

Porting Convolutions and im2col #42