Open jbschlosser-zz opened 8 years ago
You can use this version of ROIPooling, which is deterministic but 50x slower in the backward pass.
Thanks! That version solves the non-determinism issues for me.
@jbschlosser what variance did you see in your results?
@jbschlosser Be aware that the old deterministic version has bugs inherited from the original implementation. v2 differs in the forward pass.
@szagoruyko That's good to know. I did notice differences in the output between the two versions. The outputs I see from the older deterministic version match more closely with those of the nnf.ROIPooling module I'm using from the fast-rcnn branch of the fmassa/object-detection.torch repo for CPU support. I assume both of those have the same bugs you mentioned (they're both v1)?
@ghostcow Here's a sample of an output coming out of multiple backward passes: 1.6036843061447, 1.6036829948425, 1.6036834716797, 1.6036829948425, 1.6036816835403
At a high level, these variations compounded to produce a half-percent difference in measured average precision on my validation set between back-to-back runs, with all else held constant.
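For readers wondering how values can differ only in the trailing digits like this: one common source (an assumption here, the thread does not confirm the cause) is that gradient contributions are accumulated in a different order from run to run, and floating-point addition is not associative. A minimal Python sketch with made-up contribution values, not taken from the module:

```python
import itertools

def accumulate(contributions):
    """Sum contributions left to right, as a sequential accumulator would."""
    total = 0.0
    for c in contributions:
        total += c
    return total

# Hypothetical gradient contributions landing on the same input element.
contributions = [0.1, 0.2, 0.3]

# Try every accumulation order and collect the distinct totals.
results = {accumulate(p) for p in itertools.permutations(contributions)}

for r in sorted(results):
    print(repr(r))
```

The same three numbers yield more than one total depending on order. If the backward pass accumulates overlapping ROI gradients in a scheduling-dependent order, the same effect would show up as run-to-run noise in the low digits.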
@jbschlosser nnf.ROIPooling is not equivalent to v1 either. It handles ROIs that are partially outside the image by clamping them to the image size, whereas inn.ROIPooling assumes that everything outside the image is zero.
@szagoruyko They are equivalent if the bounding boxes lie inside the image region (which is the case for selective search bounding boxes). But indeed, nnf's truncates the boxes, whereas inn's pads with zeros.
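To make the truncate-versus-pad distinction concrete, here is a simplified 1D sketch. It is not the code of either module: the function `roi_pool_1d`, its bin-boundary rule, and the sample values are all assumptions chosen for illustration, and it assumes the ROI overlaps the row.

```python
def roi_pool_1d(row, x1, x2, n_bins, clamp):
    """Max-pool positions x1..x2 (inclusive) of `row` into n_bins bins.

    clamp=True  : truncate the ROI to the row bounds first.
    clamp=False : keep the ROI as given and read positions outside the row as 0.
    """
    w = len(row)
    if clamp:
        x1, x2 = max(x1, 0), min(x2, w - 1)
    length = x2 - x1 + 1
    out = []
    for b in range(n_bins):
        lo = x1 + (b * length) // n_bins                     # inclusive, floor
        hi = x1 + ((b + 1) * length + n_bins - 1) // n_bins  # exclusive, ceil
        vals = [row[x] if 0 <= x < w else 0.0 for x in range(lo, hi)]
        out.append(max(vals))
    return out

row = [1.0, 2.0, 3.0, 4.0]

# ROI fully inside the row: both behaviours agree.
print(roi_pool_1d(row, 0, 3, 2, clamp=True))    # [2.0, 4.0]
print(roi_pool_1d(row, 0, 3, 2, clamp=False))   # [2.0, 4.0]

# ROI sticking out on the left: the bin grid covers different positions.
print(roi_pool_1d(row, -2, 1, 2, clamp=True))   # [1.0, 2.0]
print(roi_pool_1d(row, -2, 1, 2, clamp=False))  # [0.0, 2.0]
```

With truncation, the bins are laid over the part of the box that is inside the row; with zero padding, the bins are laid over the original box, so the first bin falls entirely on padding and outputs 0.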
I noticed that I wasn't getting repeatable training results and traced the problem to the backward pass of the ROIPooling module, which appears to be non-deterministic. Why is this? Is it expected? Is there anything that can be done to fix it?