In `run_demo_server.py`, L100, there is `geometry`, which is a `(1, m, n, 5)` array of floats. According to the paper, I presume it should contain the 4 distances from the pixel location to the top, right, bottom, and left boundaries of the rectangle, followed by the rotation angle. However, when I visualize the geometry output, something looks different: the resulting figure shows noisy boxes that are not spread over the text regions, which suggests that my top, right, bottom, left theory is incorrect. Can you please clarify what `geometry` contains?
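To make the interpretation in question concrete, here is a minimal sketch of the decoding it implies. Everything in it is an assumption rather than something taken from the repository: the channel order (top, right, bottom, left, angle), the `stride` that maps an output cell back to input-image pixels, and the helper name `decode_axis_aligned_boxes`. It also ignores the rotation angle and only builds axis-aligned boxes.

```python
import numpy as np


def decode_axis_aligned_boxes(geometry, stride=4):
    """Turn a (1, m, n, 5) geometry array into one box per output cell.

    Assumed channel order: top, right, bottom, left, angle.
    `stride` is a hypothetical scale from output cells to image pixels.
    Returns (boxes, angles): boxes has shape (m, n, 4) as
    (x_min, y_min, x_max, y_max); angles has shape (m, n).
    """
    geo = np.asarray(geometry, dtype=np.float32)[0]  # drop the batch axis
    m, n, _ = geo.shape

    # Pixel coordinates of each output cell in the input image.
    ys, xs = np.mgrid[0:m, 0:n]
    px = xs * stride
    py = ys * stride

    top, right, bottom, left = (geo[..., i] for i in range(4))

    # Offsets are measured outward from the pixel to each box edge.
    boxes = np.stack([px - left, py - top, px + right, py + bottom], axis=-1)
    return boxes, geo[..., 4]


# Small synthetic check: one cell with known distances.
geometry = np.zeros((1, 2, 3, 5), dtype=np.float32)
geometry[0, 1, 2] = [1.0, 2.0, 3.0, 4.0, 0.1]
boxes, angles = decode_axis_aligned_boxes(geometry)
print(boxes[1, 2])
```

If this interpretation holds, a box is produced for every cell, including cells that are not on text, so drawing all of them without filtering could well look noisy on its own.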