vacancy / PreciseRoIPooling

Precise RoI Pooling with coordinate gradient support, proposed in the paper "Acquisition of Localization Confidence for Accurate Object Detection" (https://arxiv.org/abs/1807.11590).
MIT License

Coordinates reference frame? #17

Closed by CaptainDredge 5 years ago

CaptainDredge commented 5 years ago

I have two questions:

  1. What is the origin with respect to which coordinates are measured, e.g. the bottom-left corner of the image?
  2. What do the coordinates (x0, y0) and (x1, y1) represent? In other RoI implementations, (x0, y0) is generally the top-left corner and (x1, y1) the bottom-right, but I guess here they are the bottom-left and top-right?

Can anyone please help?

vacancy commented 5 years ago

@Prabhat-IIT

  1. The origin is the top-left corner of the image.
  2. (x0, y0) is the top-left corner, while (x1, y1) is the bottom-right corner. This is consistent with other RoI implementations.

Code references: https://github.com/vacancy/PreciseRoIPooling/blob/master/src/prroi_pooling_gpu_impl.cu#L171-L174

I also have a small snippet that demonstrates the forward propagation: https://github.com/vacancy/PreciseRoIPooling/blob/master/pytorch/tests/test_prroi_pooling2d.py#L21-L35 I hope it helps you confirm the coordinate system, as it's always confusing LOL.
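
To make the convention above concrete, here is a minimal pure-Python sketch. It is not the library's code: `average_in_box` is a hypothetical helper that averages whole pixels inside an integer-aligned box, whereas Precise RoI Pooling works on continuous coordinates. It only illustrates that the origin is the top-left corner, that y grows downward, and that (x0, y0) / (x1, y1) are the top-left / bottom-right corners.

```python
def average_in_box(feature, x0, y0, x1, y1):
    """Average feature[y][x] over x0 <= x < x1 and y0 <= y < y1.

    Assumed convention (illustration only): the origin is the top-left
    corner, x grows to the right, y grows downward, so (x0, y0) is the
    top-left corner of the box and (x1, y1) is the bottom-right corner.
    """
    if not (x0 < x1 and y0 < y1):
        raise ValueError("expected x0 < x1 and y0 < y1 (top-left, bottom-right)")
    # Rows are indexed by y (top to bottom), columns by x (left to right).
    values = [feature[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)


# A 4x4 feature map whose quadrants hold different constants.
feature = [
    [1, 1, 5, 5],  # y = 0 (top row)
    [1, 1, 5, 5],
    [9, 9, 7, 7],
    [9, 9, 7, 7],  # y = 3 (bottom row)
]

top_left = average_in_box(feature, 0, 0, 2, 2)      # picks the 1s
bottom_left = average_in_box(feature, 0, 2, 2, 4)   # picks the 9s
bottom_right = average_in_box(feature, 2, 2, 4, 4)  # picks the 7s
```

A box with small y values selects rows near the top of the map, which is a quick way to check which frame a dataset's annotations use.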

CaptainDredge commented 5 years ago

Thanks @vacancy :smile: . Yeah, coordinate system is pretty confusing and here I got marvelous results with my localization task by considering bottom left as origin and (x0,y0) being bottom left and (x1,y1) being top right. Deep learning sometimes feel really unpredictable :stuck_out_tongue:
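
If boxes are annotated in a bottom-left frame like the one described here, one way to map them into the top-left frame from the answer above is a vertical flip. The sketch below is an assumption-laden illustration, not part of the repository: `bottom_left_to_top_left` and `image_height` are hypothetical names, and coordinates are treated as continuous positions rather than pixel indices.

```python
def bottom_left_to_top_left(box, image_height):
    """Convert a box from a bottom-left-origin frame to a top-left-origin frame.

    Input:  (x0, y0, x1, y1) with (x0, y0) the bottom-left corner and
            (x1, y1) the top-right corner, y growing upward.
    Output: (x0, y0, x1, y1) with (x0, y0) the top-left corner and
            (x1, y1) the bottom-right corner, y growing downward.
    """
    x0, y0, x1, y1 = box
    # x is unchanged; a height h above the bottom edge sits at
    # image_height - h below the top edge.
    return (x0, image_height - y1, x1, image_height - y0)


# A box covering the bottom two units of a 4-unit-high image.
converted = bottom_left_to_top_left((1, 0, 3, 2), image_height=4)

# The same formula maps back, so applying it twice restores the input.
restored = bottom_left_to_top_left(converted, image_height=4)
```

Because the mapping is its own inverse, the same helper can be used in either direction.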