szagoruyko / imagine-nn

IMAGINE torch neural network routines

gradient tests for inn.ROIPooling, nnf.ROIPooling; also comparison of their outputs #13

Closed 0wu closed 8 years ago

0wu commented 9 years ago

Here is my crude implementation of the Jacobian check for nnf.ROIPooling and inn.ROIPooling.

I made the following somewhat awkward choices, but I'm not sure whether there were better solutions.

  1. nnf.ROIPooling.lua is copied from the fast-rcnn branch of the object-detection.torch project, for comparison.
  2. In order to use nn.Jacobian for gradient testing, I had to subclass both ROIPooling modules so that the subclassed forward()/backward() take a single input tensor instead of {input, ROI} (see the wrapper sketch after the test output below).

Some observations from running test_jacobian.lua:

  1. SpatialSameResponseNormalization fails (the same failure occurs on the master branch as well):
tingfan@X:~/dnn/imagine-nn/test$ th ./test_jacobian.lua 
Running 3 tests 
_*_  ==> Done Completed 10 asserts in 3 tests with 2 errors 
--------------------------------------------------------------------------------
SpatialSameResponseNormalization
error on state 
 LT(<) violation   val=nan, condition=0.001
    ...tingfan/dnn/torch/install/share/lua/5.1/torch/Tester.lua:26: in function 'assertlt'
    ./test_jacobian.lua:112: in function <./test_jacobian.lua:103>
  2. The two ROIPooling methods give very close results (2-norm of the difference < 1e-7).
  3. The gradient tests pass most of the time (diff < 1e-3), but not always, and both methods fail on the same test case. Maybe you could help point out what is incorrect about the input?
tingfan@X:~/dnn/imagine-nn/test$ th ./test_jacobian.lua 
Running 6 tests 
___|__  ==> NnfROIPooling                    
Testing module nnf.ROIPooling   
test no 1   
err 4.6730041503906e-05 
test no 2   
err 4.6730041503906e-05 
test no 3   
err 0.025642871856689   
___**|  ==> InnROIPooling                   
Testing module inn.ROIPooling   
test no 1   
err 4.6730041503906e-05 
test no 2   
err 4.6730041503906e-05 
test no 3   
err 0.025642871856689   
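
For reference, here is a minimal sketch of the single-tensor wrapper idea from choice 2 above. It uses a thin wrapper module (composition) rather than the subclassing done in this PR, and the ROI layout ({batch_index, x1, y1, x2, y2} rows) and the usage shown in the comments are assumptions for illustration, not code from this repository:

    -- wrap an {input, rois} module behind a single-tensor interface so that
    -- nn.Jacobian can perturb the feature map while the ROIs stay fixed
    require 'nn'

    local SingleInputROIPooling, parent = torch.class('SingleInputROIPooling', 'nn.Module')

    function SingleInputROIPooling:__init(pooler, rois)
      parent.__init(self)
      self.pooler = pooler   -- e.g. an inn.ROIPooling or nnf.ROIPooling instance
      self.rois = rois       -- fixed ROI tensor; the Jacobian test never perturbs it
    end

    function SingleInputROIPooling:updateOutput(input)
      self.output = self.pooler:updateOutput({input, self.rois})
      return self.output
    end

    function SingleInputROIPooling:updateGradInput(input, gradOutput)
      local g = self.pooler:updateGradInput({input, self.rois}, gradOutput)
      -- keep only the gradient w.r.t. the feature map
      self.gradInput = (torch.type(g) == 'table') and g[1] or g
      return self.gradInput
    end

    -- hypothetical usage: one ROI covering part of an 8x8 feature map
    -- local rois = torch.Tensor{{1, 1, 1, 8, 8}}
    -- local m = SingleInputROIPooling(some_roipooling_module, rois)
    -- print(nn.Jacobian.testJacobian(m, torch.randn(1, 2, 8, 8)))
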
fmassa commented 9 years ago

Hi @0wu, Thank you very much for those gradient checks. It's very cool :)

Also, I liked your subclassing to use the standard Jacobian tests. The only small problem is that we can't evaluate the error w.r.t. the bboxes, but I think that's ok.

We are aware of the failing SpatialSameResponseNormalization test (https://github.com/szagoruyko/imagine-nn/issues/8); it's due to some change in nn or torch from a couple of months ago (the tests were passing with an old version), but I haven't had the time to pin down which commit is causing the problem.

It's great that both the nnf and inn versions of ROIPooling are giving the same results :) I'm curious, though, why the gradient checks don't pass all the time; the inn version is mainly an adaptation of the original Fast-RCNN implementation to Torch, and the kernels are the same. I'll have a better look later today.

fmassa commented 9 years ago

@0wu I've been a bit in a hurry lately; I'll review it by this weekend at the latest.

0wu commented 9 years ago

I figured out the failing gradient test above: it was because the "delta" added for the finite difference changed which element is the maximum in the max-pooling function. One quick fix is to generate the test input carefully, so that the elements in each pool differ enough that their order is preserved even with the added "delta". @fmassa, do you want me to re-submit a fixed pull request for this?
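
For illustration, a minimal sketch of one way to generate such an input (an example of the idea, not the fix from this PR): give every element a distinct value, separated by much more than the finite-difference step, so that adding or subtracting "delta" can never change which element is the maximum of a pooling bin.

    require 'torch'

    -- assign each element a distinct value, with any two values separated by `sep`;
    -- as long as sep is much larger than the perturbation used by nn.Jacobian,
    -- the argmax inside every pooling bin is preserved
    local function wellSeparatedInput(c, h, w, sep)
      return torch.randperm(c * h * w):mul(sep):resize(c, h, w)
    end

    local input = wellSeparatedInput(2, 8, 8, 1e-2)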

Also, maybe I should test an ROI that extends outside the image boundary, which is allowed in certain ROIPooling implementations.
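
Purely as an illustration (not part of this PR), such a test case might look like:

    -- an ROI whose far corner extends past an 8x8 feature map
    local input = torch.randn(1, 2, 8, 8)
    local rois  = torch.Tensor{{1, 5, 5, 12, 12}}  -- batch index, x1, y1, x2, y2
    -- feed {input, rois} through both nnf.ROIPooling and inn.ROIPooling and
    -- compare outputs and gradients, as in test_jacobian.lua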

fmassa commented 8 years ago

@0wu Sorry for the late reply, things were crazy here with the CVPR deadline.

Indeed, there is a difference between nnf and inn implementations of ROIPooling when the pooling region is outside the image. I'll eventually fix it in nnf (but it will involve a memory copy in this case).

Regarding resubmitting the pull request with the fixes, I think you can add the new commits here, no need to create a new PR for that.

Thanks again !

szagoruyko commented 8 years ago

v2 fixes the issue in the backprop, via https://github.com/szagoruyko/imagine-nn/pull/17