ROIPooling module backwards pass is non-deterministic

szagoruyko / imagine-nn

IMAGINE torch neural network routines


ROIPooling module backwards pass is non-deterministic #27

Open jbschlosser-zz opened 8 years ago

jbschlosser-zz commented 8 years ago

I noticed that I wasn't getting repeatable training results and traced the problem to the backward pass of the ROIPooling module, which appears to be non-deterministic. Why is this? Is it expected? Is there anything that can be done to fix it?

fmassa commented 8 years ago

You can use this version of ROIPooling, which is deterministic but 50x slower for the backward pass.

jbschlosser-zz commented 8 years ago

Thanks! That version solves the non-determinism issues for me.

ghostcow commented 8 years ago

@jbschlosser what variance did you see in your results?

szagoruyko commented 8 years ago

@jbschlosser be aware that the old deterministic version has bugs inherited from the original implementation. v2 differs in the forward pass.

jbschlosser-zz commented 8 years ago

@szagoruyko That's good to know. I did notice differences in the output between the two versions. The outputs I see from the older deterministic version match more closely with those of the nnf.ROIPooling module I'm using from the fast-rcnn branch of the fmassa/object-detection.torch repo for CPU support. I assume both of those have the same bugs you mentioned (they're both v1)?

@ghostcow Here's a sample of an output coming out of multiple backward passes:

1.6036843061447
1.6036829948425
1.6036834716797
1.6036829948425
1.6036816835403

At a high level, these variations compounded to produce a half-percent difference in measured average precision on my validation set between back-to-back runs, all else remaining constant.
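The thread does not state the cause of the variation, but differences in the sixth or seventh significant digit are consistent with single-precision sums being accumulated in a different order on each run (for example, when parallel threads add gradients into the same location). The following Python sketch illustrates that effect only; it is not the imagine-nn code, and the helper names `to_f32` and `accumulate_f32` are hypothetical.

```python
import random
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate_f32(values):
    """Sum values left to right, rounding to float32 after every add."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


# Hypothetical gradient contributions that all land on one input location.
rng = random.Random(0)
grads = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

# Add the same numbers in several different orders, as a stand-in
# (assumption) for threads finishing in a different order on each run.
sums = set()
for trial in range(5):
    order = grads[:]
    random.Random(trial).shuffle(order)
    sums.add(accumulate_f32(order))

# Floating-point addition is not associative, so the set typically
# holds more than one value, differing only in the last few digits.
print(sorted(sums))
```

A fixed accumulation order always gives the same result, which is why a backward pass that sums contributions in a fixed order can be repeatable, usually at the cost of parallelism.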

szagoruyko commented 8 years ago

@jbschlosser nnf.ROIPooling is not equivalent to v1 either. It handles ROIs that are partially outside the image by clamping them to the image size, whereas inn.ROIPooling assumes that everything outside the image is zero.

fmassa commented 8 years ago

@szagoruyko they are equivalent if the bounding boxes lie inside the image region (which is the case for selective search bounding boxes). But indeed nnf truncates the boxes, whereas inn pads with zeros.
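The two boundary conventions discussed above can be contrasted with a small Python sketch. It is a simplification under stated assumptions, not the code of either module: the whole ROI is max-pooled into a single value, coordinates are inclusive pixel indices, and the function names `roi_max_clamped` and `roi_max_zero_padded` are hypothetical.

```python
def roi_max_clamped(image, x1, y1, x2, y2):
    """Max over the ROI after clamping its corners to the image bounds.

    Assumes the ROI overlaps the image at least partially.
    """
    h, w = len(image), len(image[0])
    x1, x2 = max(0, x1), min(w - 1, x2)
    y1, y2 = max(0, y1), min(h - 1, y2)
    return max(image[y][x]
               for y in range(y1, y2 + 1)
               for x in range(x1, x2 + 1))


def roi_max_zero_padded(image, x1, y1, x2, y2):
    """Max over the ROI, reading positions outside the image as 0."""
    h, w = len(image), len(image[0])

    def pixel(y, x):
        inside = 0 <= y < h and 0 <= x < w
        return image[y][x] if inside else 0.0

    return max(pixel(y, x)
               for y in range(y1, y2 + 1)
               for x in range(x1, x2 + 1))


# A 2x2 feature map with negative values, so padding zeros can win the max.
image = [[-3.0, -1.0],
         [-2.0, -4.0]]

# ROI fully inside the image: both conventions agree.
inside_clamped = roi_max_clamped(image, 0, 0, 1, 1)      # -1.0
inside_padded = roi_max_zero_padded(image, 0, 0, 1, 1)   # -1.0

# ROI sticking out past the top-left corner: the results differ.
outside_clamped = roi_max_clamped(image, -1, -1, 1, 1)     # -1.0
outside_padded = roi_max_zero_padded(image, -1, -1, 1, 1)  # 0.0
```

With a multi-bin output grid the conventions would presumably also differ in where the bin boundaries fall, because clamping changes the extent of the ROI that gets subdivided; the single-bin sketch does not show that.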