Open mailcorahul opened 5 years ago
Hi, I have the same question. Did you figure it out? Thanks.
@yangyangyang127 No, I haven't figured it out yet. I even tried emailing the authors of the EAST paper, but didn't get a response from them either. Do you have any intuition as to how this might work?
Try reading icdar.py; you will find the answer there.
I was going through the EAST paper and I have a question about how exactly the bounding boxes are computed. After the input image passes through a stack of convolutional layers, a 1x1xD filter is applied to the final conv volume to produce a W x H x 4 output volume, where the 4 channels are the offsets to the top, left, bottom, and right boundaries. My question is: since each output value is computed from only one cell of the final feature map, how is it possible for the network to find the offsets to all four boundaries?
To make it clearer, let's say the 1x1 filter is looking at the top-left grid cell of the final feature map for a text box in the image.
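To make the setup concrete, here is a minimal sketch of how the four per-cell offsets could be turned back into a box. The function name `decode_box`, the `stride=4` downsampling factor, and the axis-aligned simplification (no rotation) are my assumptions for illustration, not something taken from the paper or the repo. The channel order follows the description above (top, left, bottom, right).

```python
import numpy as np

def decode_box(geo, row, col, stride=4):
    """Decode one cell of a (H, W, 4) offset map into an axis-aligned box.

    geo    : array of shape (H, W, 4); channels are the distances from the
             cell to the top, left, bottom, right box edges, in input pixels.
    stride : assumed ratio between input image size and feature map size.
    Returns (x_min, y_min, x_max, y_max) in input-image pixels.
    """
    top, left, bottom, right = (float(v) for v in geo[row, col])
    # Position of this cell in input-image coordinates (assumed mapping).
    x, y = col * stride, row * stride
    return (x - left, y - top, x + right, y + bottom)

# Toy offset map for one 60x20 text box whose top-left corner is at (0, 0).
geo = np.zeros((2, 3, 4))
geo[0, 0] = [0, 0, 20, 60]   # top-left cell: nothing above or to the left
geo[1, 2] = [4, 8, 16, 52]   # a cell further inside the same box

box_a = decode_box(geo, 0, 0)
box_b = decode_box(geo, 1, 2)
print(box_a, box_b)
```

In this toy case both cells decode to the same box, so every cell inside a text region is expected to predict its own distances to all four edges.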
Can anyone explain to me how this works? @argman
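One thing that may be relevant to the question: a single cell of the final feature map is not computed from a single input pixel, because every earlier convolution and pooling layer widens the region of the image it depends on. The sketch below computes that region (the receptive field) with the standard recurrence for a stack of layers. The layer list `toy_stack` is purely hypothetical and is not the EAST backbone; `receptive_field` is just an illustrative helper name.

```python
def receptive_field(layers):
    """Receptive field of one output cell for a stack of conv/pool layers.

    layers : list of (kernel_size, stride) tuples, ordered input to output.
    Returns (receptive_field_size, cumulative_stride), both in input pixels.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        # Each layer extends the field by (kernel - 1) steps of the
        # current spacing between neighbouring cells.
        rf += (kernel - 1) * jump
        jump *= stride
    return rf, jump

# Hypothetical toy stack: 3x3 conv, 2x2 pool, 3x3 conv, 2x2 pool,
# 3x3 conv, then the final 1x1 conv that produces the offsets.
toy_stack = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1), (1, 1)]

rf_size, total_stride = receptive_field(toy_stack)
print(rf_size, total_stride)
```

The final 1x1 layer adds nothing to the receptive field by itself, but the cell it reads already summarizes an 18x18 input window in this toy stack. A real, much deeper backbone would cover a far larger window, which is presumably how one cell can carry information about all four box edges.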