I'm not sure whether the paper or the model here on GitHub is correct, but the two disagree. The code sets the pooled width and height to 14x14, whereas the paper states 28x28:
> We expect the RoI warping layer to produce a sufficiently fine resolution, which is set as W' × H' = 28 × 28 in this paper. A max pooling layer is then applied to produce a lower-resolution output, e.g., 7×7 for VGG-16.
Am I interpreting something wrong, or is there a reason for this?
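
For concreteness, here is a minimal sketch of the pooling arithmetic I have in mind. It assumes non-overlapping max pooling (kernel size equal to stride), which neither the paper excerpt nor the code confirms, and `max_pool` is a hypothetical helper, not a function from this repository. Under that assumption both resolutions can reach 7×7, just with different window sizes:

```python
import numpy as np

def max_pool(x, out_size):
    """Non-overlapping max pooling of a square map to out_size x out_size.

    Hypothetical helper for illustration; assumes kernel size == stride.
    """
    in_size = x.shape[0]
    assert x.shape == (in_size, in_size) and in_size % out_size == 0
    k = in_size // out_size  # window size implied by input and output sizes
    # Split each axis into (out_size, k) blocks and take the max per block.
    return x.reshape(out_size, k, out_size, k).max(axis=(1, 3))

warped_paper = np.random.rand(28, 28)  # resolution stated in the paper
warped_code = np.random.rand(14, 14)   # resolution set in the code

print(max_pool(warped_paper, 7).shape)  # 4x4 windows
print(max_pool(warped_code, 7).shape)   # 2x2 windows
```

If that assumption holds, the final 7×7 output would be the same size either way, and the difference would only be in how finely each RoI is sampled before pooling.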