Approximate joint training method: is detaching the proposals necessary?

I am confused and could not find an answer anywhere in my research. RoI pooling involves non-differentiable operations such as indexing (quantizing a coordinate like 3.5 to the integer 3). So why do we detach the proposals? If we use the approximate joint training method and do not call detach() on the RPN outputs, how would gradients flow from the detector back into the RPN and the feature extraction network during backpropagation? I don't understand why detaching the proposals is needed when gradients cannot flow from the RoI pooling layer to the RPN head and are stopped automatically.

Moreover, unlike RoI Align, the output of RoI pooling is not directly related to the proposal coordinates. (Actually, I could not find a mathematical relationship between roi_output and the coordinate part of its inputs, i.e. between the RoI pooling outputs and {x1, y1, x2, y2}.) So again, detaching the proposals seems unnecessary when there is no relationship between the RoI pooling output and the coordinate inputs. I mean, if d(roi_pool_outputs)/d{x1,y1,x2,y2} does not even exist, why should we detach {x1,y1,x2,y2} to make them constants?

My point is that RPN_output.detach() seems unnecessary, because d(roi_pool_outputs)/d{x1,y1,x2,y2} does not even exist, so gradients automatically cannot propagate from the detector back through the RPN.

I am really confused.
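To illustrate what I mean, here is a toy 1-D sketch in plain Python. It is my own simplification and not code from this repository: `roi_max_pool_1d`, `linear_sample_1d`, and `finite_diff` are hypothetical helpers, and the floor/ceil quantization is only an assumption about how RoI pooling snaps coordinates to the grid. It compares how a pooled value and an interpolated value (in the spirit of RoI Align) react to a small change in the coordinate x1.

```python
import math

def roi_max_pool_1d(feature, x1, x2, bins=2):
    """Toy 1-D RoI max pooling: quantize the RoI, split it into bins, max per bin."""
    start = max(int(math.floor(x1)), 0)              # e.g. 3.5 -> 3
    end = min(int(math.ceil(x2)), len(feature))      # e.g. 6.2 -> 7
    end = max(end, start + 1)                        # keep at least one cell
    length = end - start
    out = []
    for b in range(bins):
        lo = start + int(math.floor(b * length / bins))
        hi = start + int(math.ceil((b + 1) * length / bins))
        hi = max(hi, lo + 1)
        out.append(max(feature[lo:hi]))
    return out

def linear_sample_1d(feature, x):
    """Toy 1-D analogue of bilinear sampling: interpolate between two neighbours."""
    x0 = int(math.floor(x))
    x1 = min(x0 + 1, len(feature) - 1)
    w = x - x0
    return (1.0 - w) * feature[x0] + w * feature[x1]

def finite_diff(fn, x, eps=1e-3):
    """Central finite difference as a stand-in for d(output)/d(coordinate)."""
    return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)

feature = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]

# Sensitivity of the first pooled bin to x1 (x2 is held fixed at 6.2).
d_pool = finite_diff(lambda x1: roi_max_pool_1d(feature, x1, 6.2)[0], 3.5)

# Sensitivity of an interpolated sample to its coordinate.
d_align = finite_diff(lambda x: linear_sample_1d(feature, x), 3.5)

print("pooled output vs x1:      ", d_pool)
print("interpolated output vs x: ", d_align)
```

In this toy setting the pooled output does not change at all when x1 moves slightly, so its finite-difference derivative is zero (away from the points where the quantized index jumps, where the derivative is undefined). The interpolated sample changes with a slope of about 7, the difference between the two neighbouring feature values. This is the behaviour I am describing above; whether the real RoI pooling layer used here behaves the same way with respect to the proposals is exactly what I am asking.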

jwyang / faster-rcnn.pytorch

approximate joint training method problem #911