Open mks0601 opened 6 years ago
I also noticed this problem. There is a gap between the numerical and analytical gradients, but the outputs and gradients of the PyTorch version and the TensorFlow version are almost the same.
numerical:(
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.2012 0.0000 0.0000 0.0000
0.6258 0.1490 0.0000 0.0000
0.0000 0.6855 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0373 0.0000 0.1788 0.0000
0.1341 0.0298 0.5662 0.1192
0.0000 0.1490 0.0000 0.5960
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0596 0.0000
0.0000 0.0000 0.2086 0.0596
0.0000 0.0000 0.0000 0.2384
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
[torch.FloatTensor of size 25x4]
,)
analytical:(
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.2111 0.0000 0.0000 0.0000
0.6141 0.1408 0.0000 0.0000
0.0000 0.6844 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0447 0.0000 0.1893 0.0000
0.1300 0.0298 0.5507 0.1263
0.0000 0.1449 0.0000 0.6138
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0665 0.0000
0.0000 0.0000 0.1934 0.0444
0.0000 0.0000 0.0000 0.2156
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
[torch.FloatTensor of size 25x4]
,)
Thanks for checking. Did you achieve similar results with Mask R-CNN using your RoI Align module?
-- Gyeongsik Moon, Ph.D. Candidate, Department of ECE, SNU, Seoul, Korea, http://cv.snu.ac.kr/
I am not working on Mask RCNN.
BTW, I found that this layer can pass gradcheck if I set eps=1e-3. eps is the perturbation for finite differences.
gradcheck(roi_align, (image_torch, boxes, box_index), eps=1e-3)
output (max_val, min_error, max_error, mean_error):
('forward:', 0.87139809, 0.0, 7.0184469e-06, 5.5792748e-07)
('backward:', 1.0228419, 0.0, 1.7911196e-05, 9.7078487e-09)
test ok
How many times did you run the gradcheck? I ran it 10 times, but only 2 runs passed.
@mks0601 Try to modify the random input image:
# image_data = np.random.randn(batch_size, depth, im_height, im_width).astype(np.float32)
# =>
image_data = np.random.rand(batch_size, depth, im_height, im_width).astype(np.float32)
Sorry to say, but changing the input does not seem like a good fix. It suggests the implemented RoI Align layer only works for certain input forms (or at least that it fails on inputs such as randn). Can you tell me why that kind of error occurs?
I don't think this is a problem with the implementation; it's a problem with how we use gradcheck. Changing randn to rand actually decreases the maximum value of the inputs. The check always passes if eps > max(inputs)/500, whatever the input is.
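The eps-versus-max(inputs) behavior is consistent with plain float32 rounding error: the finite-difference numerator shrinks with eps while the rounding noise does not, so a tiny eps drowns the signal. A minimal stdlib-only sketch (the x*x stand-in and the helper names are mine, not from the repo):

```python
import struct

def f32(v):
    # round a Python float to float32 precision, mimicking the kernel's arithmetic
    return struct.unpack('f', struct.pack('f', v))[0]

def f(x):
    # a smooth stand-in for the layer's forward pass, evaluated in float32
    return f32(x * x)

def central_diff(x, eps):
    # the numerical gradient the way gradcheck estimates it
    eps = f32(eps)
    return f32((f(f32(x + eps)) - f(f32(x - eps))) / (2 * eps))

true_grad = 2.0  # d/dx x^2 at x = 1
err_tiny = abs(central_diff(1.0, 1e-6) - true_grad)   # eps far below float32 ulp scale
err_large = abs(central_diff(1.0, 1e-3) - true_grad)  # eps well above rounding noise
print(err_tiny, err_large)  # the tiny-eps estimate is dominated by rounding error
```

With eps=1e-6 the perturbed inputs round asymmetrically at float32 precision and the estimate is visibly off, while eps=1e-3 gives an error orders of magnitude smaller, matching the observation that only a larger eps passes the check.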
I don't know the real reason. You can check the gradcheck function and the source code if you want to figure out the reason for this problem.
Also, can you help me understand the output of your roi_align layer?
If I feed the input tensor
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
(1x1x7x7)
to the roi_align layer (crop_height=3, crop_width=3)
with argument (xs = [[0,2]], ys = [[0,2]], nbox=1, nbatch=1)
then I think the output should be
0 1 2
0 1 2
0 1 2
But the output of your implementation is different. Did I misunderstand RoI Align?
What you want is crop_and_resize.
import numpy as np
import torch
from torch.autograd import Variable
from roi_align.roi_align import RoIAlign
def to_varabile(arr, requires_grad=False, is_cuda=True):
    tensor = torch.from_numpy(arr)
    if is_cuda:
        tensor = tensor.cuda()
    var = Variable(tensor, requires_grad=requires_grad)
    return var
# inputs
is_cuda = False
image_data = np.tile(np.arange(7, dtype=np.float32), 7).reshape(7, 7)
image_data = image_data[np.newaxis, np.newaxis]
boxes_data = np.asarray([[0, 0, 2, 2]], dtype=np.float32)
box_index_data = np.asarray([0], dtype=np.int32)
image_torch = to_varabile(image_data, requires_grad=True, is_cuda=is_cuda)
boxes = to_varabile(boxes_data, requires_grad=False, is_cuda=is_cuda)
box_index = to_varabile(box_index_data, requires_grad=False, is_cuda=is_cuda)
# setting transform_fpcoor to False gives crop_and_resize behavior
roi_align = RoIAlign(3, 3, transform_fpcoor=False)
print(roi_align(image_torch, boxes, box_index))
output:
(0 ,0 ,.,.) =
0 1 2
0 1 2
0 1 2
[torch.cuda.FloatTensor of size 1x1x3x3 (GPU 0)]
If you use RoIAlign as in this implementation:
# input
...
boxes_data = np.asarray([[0, 0, 3, 3]], dtype=np.float32)
...
roi_align = RoIAlign(3, 3, transform_fpcoor=True)
print(roi_align(image_torch, boxes, box_index))
output:
Variable containing:
(0 ,0 ,.,.) =
0 1 2
0 1 2
0 1 2
[torch.FloatTensor of size 1x1x3x3]
You can read more about RoIAlign here: https://github.com/ppwwyyxx/tensorpack/blob/6d5ba6a970710eaaa14b89d24aace179eb8ee1af/examples/FasterRCNN/NOTES.md and https://github.com/ppwwyyxx/tensorpack/blob/6d5ba6a970710eaaa14b89d24aace179eb8ee1af/examples/FasterRCNN/model.py#L316
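For intuition, the crop_and_resize result above can be reproduced with a tiny pure-Python bilinear sampler (a sketch; bilinear_crop is my own helper, not part of the package):

```python
def bilinear_crop(img, y0, x0, y1, x1, crop_h, crop_w):
    # naive crop_and_resize: the box corners map onto the first/last samples,
    # and float coordinate (0.0, 0.0) is the center of pixel (0, 0)
    out = []
    for i in range(crop_h):
        y = y0 + i * (y1 - y0) / (crop_h - 1)
        row = []
        for j in range(crop_w):
            x = x0 + j * (x1 - x0) / (crop_w - 1)
            iy, ix = int(y), int(x)
            dy, dx = y - iy, x - ix
            iy1 = min(iy + 1, len(img) - 1)
            ix1 = min(ix + 1, len(img[0]) - 1)
            row.append((1 - dy) * (1 - dx) * img[iy][ix]
                       + (1 - dy) * dx * img[iy][ix1]
                       + dy * (1 - dx) * img[iy1][ix]
                       + dy * dx * img[iy1][ix1])
        out.append(row)
    return out

ramp = [[float(c) for c in range(7)] for _ in range(7)]  # each row is 0 1 2 3 4 5 6
print(bilinear_crop(ramp, 0, 0, 2, 2, 3, 3))  # three rows of [0.0, 1.0, 2.0]
```

With box corners (0, 0) and (2, 2), the 3x3 sample grid lands exactly on integer pixel centers, which is why crop_and_resize with bbox=[0, 0, 2, 2] returns the expected 0 1 2 rows.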
Oh, I misunderstood your code. Thanks for clarifying. However, the link you provided is hard for me to understand :(
Can you help me understand it? I read NOTES.md, but I cannot see why crop_and_resize differs from roi_align apart from the input form (normalized vs. unnormalized coordinates?). I also cannot understand the code.
If I just set boxes_data to [xmin, ymin, xmax+1, ymax+1] and set transform_fpcoor=True, it seems to work well so far. Can you confirm that setting boxes_data to [xmin, ymin, xmax+1, ymax+1] is correct? Does fpcoor stand for feature-plane coordinates?
crop_and_resize: bilinear sampling assumes that floating-point coordinate (0.0, 0.0) is the same as pixel (0, 0):
[image missing]
RoIAlign: first split the RoI into crop_size grids of the same size, then bilinear-sample a value for each grid:
[image missing]
To use crop_and_resize for RoIAlign, we shift the grids by -0.5:
[image missing]
In your case, the crop is
Variable containing:
(0 ,0 ,.,.) =
0.0000 0.0000 0.0000
0.0000 0.5000 1.1667
0.0000 0.5000 1.1667
[torch.FloatTensor of size 1x1x3x3]
when bbox=[0, 0, 2, 2].
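The grid splitting and the -0.5 shift can be made concrete with the 1-D sample-center formulas (a sketch based on my reading of the notes above; the function names are mine):

```python
def crop_and_resize_centers(x0, x1, crop_size):
    # crop_and_resize: the box endpoints map to the first/last samples
    return [x0 + i * (x1 - x0) / (crop_size - 1) for i in range(crop_size)]

def roi_align_centers(x0, x1, crop_size):
    # RoIAlign (transform_fpcoor=True): split [x0, x1] into crop_size equal
    # cells, sample at each cell center, then shift by -0.5 so that integer
    # coordinates land on pixel centers
    spacing = (x1 - x0) / crop_size
    return [x0 + spacing * (i + 0.5) - 0.5 for i in range(crop_size)]

print(crop_and_resize_centers(0, 2, 3))  # [0.0, 1.0, 2.0]
print(roi_align_centers(0, 3, 3))        # [0.0, 1.0, 2.0]
```

This is why the box [0, 0, 2, 2] with transform_fpcoor=False and the box [0, 0, 3, 3] with transform_fpcoor=True produce the same crop in the examples above: both mappings sample at coordinates 0, 1, 2.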
That helps a lot, thank you. So the difference arises from how the RoI is divided into grids. Then I think just using [xmin, ymin, xmax+1, ymax+1] should produce the desired values, where the values are float coordinates. Is that right?
Sorry, but I still cannot understand the code. What should be passed to your roi_align module? Say the bounding box of the RoI is (xmin, ymin, xmax, ymax); what exactly is the input then?
spacing_w is a function of x1-x0, not x1-x0+1, so I think xmax and ymax should be incremented by 1. Also, I cannot understand why we have to subtract 0.5.
What about just using
x0, y0, x1, y1 = tf.split(boxes, 4, axis=1)
nx0 = x0 / tf.to_float(image_shape[1] - 1)
ny0 = y0 / tf.to_float(image_shape[0] - 1)
nx1 = x1 / tf.to_float(image_shape[1] - 1)
ny1 = y1 / tf.to_float(image_shape[0] - 1)
return tf.concat([ny0, nx0, ny1, nx1], axis=1)
with transform_fpcoor=False?
Hi @mks0601, you can see how to use it for Mask R-CNN here: https://github.com/tensorboy/Pytorch_Mask_RCNN :)
@tensorboy I couldn't access the link. Could you share a working one?
The pictures in that explanation are missing... It would be great if you could re-upload them :)
Hi, thanks for sharing your implementation.
I want to use an RoIAlign layer in my PyTorch code and found yours. To verify it, I ran test.py, and the gradcheck failed. Did you check the code?