Closed by abinjoabraham 5 years ago
Yes, the output resolution is the resolution in which the predicted polygon points live. The points are later rescaled to the original image resolution for all downstream use.
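To make the rescaling concrete, here is a minimal sketch of mapping grid cells to pixel coordinates in the crop and back. The function names `grid_to_image` and `image_to_grid` are hypothetical, and placing each vertex at the centre of its cell is an assumption; the conversion functions in the repository may use a different convention.

```python
import numpy as np

def grid_to_image(vertices, grid_size=28, crop_size=224):
    """Map (x, y) grid-cell indices to pixel coordinates in the crop.

    Each cell covers crop_size / grid_size pixels. Using the cell centre
    as the pixel location is an assumption made for this sketch.
    """
    vertices = np.asarray(vertices, dtype=np.float64)
    scale = crop_size / grid_size
    return (vertices + 0.5) * scale

def image_to_grid(points, grid_size=28, crop_size=224):
    """Map pixel coordinates in the crop to the grid cell containing them."""
    points = np.asarray(points, dtype=np.float64)
    scale = crop_size / grid_size
    cells = np.floor(points / scale).astype(int)
    # Points on the far edge (e.g. x == crop_size) are clamped to the last cell.
    return np.clip(cells, 0, grid_size - 1)

corners = grid_to_image([[0, 0], [27, 27]])
print(corners)                 # pixel positions inside the 224x224 crop
print(image_to_grid(corners))  # back to the original grid cells
```

With a 28x28 grid over a 224x224 crop, each cell spans 8x8 pixels, so a predicted vertex is only accurate to within a cell. Mapping from the crop back to the full image would additionally need the crop offset and scale, which are not shown here.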
I'd still like to ask one more basic follow-up question :) @amlankar. I am considering the first version of polygonrnn, with an output resolution of 28x28. What does the one-hot encoding of a predicted vertex look like in this case? Is it a single 1D index between 0 and 783 (28*28 = 784 cells), or a pair of values each between 0 and 27? I have gone through a couple of blogs that explain one-hot encoding, but most of them cover converting text data to one-hot vectors. I couldn't find anything that explains the one-hot encoding of 2D data such as a vertex, as in this case.
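For illustration, one common way to one-hot encode a 2D grid position is to flatten the grid into a single index. The sketch below assumes row-major ordering (`index = y * 28 + x`) and one extra slot at the end for an end-of-sequence token, which is how I read the D x D + 1 output described in the first paper. The helper names are hypothetical and the actual layout in the repository may differ.

```python
import numpy as np

def vertex_to_one_hot(x, y, grid_size=28):
    """Encode a grid vertex as a flat one-hot vector.

    Assumes row-major flattening and a final extra entry reserved for
    an end-of-sequence token (both are assumptions of this sketch).
    """
    one_hot = np.zeros(grid_size * grid_size + 1, dtype=np.float32)
    one_hot[y * grid_size + x] = 1.0
    return one_hot

def one_hot_to_vertex(one_hot, grid_size=28):
    """Decode a flat one-hot vector back to (x, y), or None for end-of-sequence."""
    index = int(np.argmax(one_hot))
    if index == grid_size * grid_size:
        return None  # the extra slot marks the end of the polygon
    y, x = divmod(index, grid_size)
    return x, y

encoded = vertex_to_one_hot(3, 5)
print(encoded.shape)               # one entry per grid cell plus one
print(one_hot_to_vertex(encoded))  # recovers the (x, y) grid position
```

Under these assumptions the vertex is a single class index in the range 0 to 783 (plus 784 for end-of-sequence), so predicting a vertex becomes a classification over grid cells instead of a regression over two coordinates.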
I was going through the polygonrnn and polygonrnnpp papers once again, and a question came up about the output resolution: 28x28 in the first version and 128x128 in the second. Is that the size of the grid that is laid over the 224x224 cropped image? In other words, is the 224x224 crop downscaled to the grid size so that the grid points fit over it? I saw some functions that convert grid values to coordinate values, but I was a little confused about the concept.