Question about scale bins

facebookresearch / OrienterNet

Source Code for Paper "OrienterNet Visual Localization in 2D Public Maps with Neural Matching"


Question about scale bins #30

Open smj007 opened 9 months ago

smj007 commented 9 months ago

Hi,

Thank you for this piece of art, it’s so well thought out and I enjoyed going through your paper and the code. I just had a small question about the number of scale bins - I might have missed these details but I just wanted to clarify this anyway.

In the paper, it’s mentioned that 32 scale bins are used. The code however has an argument set to 33.

https://github.com/facebookresearch/OrienterNet/blob/47d67ab69a4b5c416237f7ef0976f6e6b586edbc/maploc/conf/orienternet.yaml#L13

Is this some sort of dummy parameter or does it have a significant meaning to it that I’m missing?

Thank you for your time!

sarlinpe commented 8 months ago

Hi, thank you for your question, and sorry for my late reply. We did experiment with an additional last dummy bin, but this didn't make a difference in the end. This is probably why we chose 33 bins in the first place. The number mentioned in the paper is incorrect, though using 32 or 33 shouldn't make any difference. I hope this helps.

martin-liao commented 4 months ago

Thanks for your fantastic work! I am confused about why the code pads the yaw dimension here before the loss calculation. Is this a dummy bin as well?

sarlinpe commented 4 months ago

No, this padding allows bilinear interpolation at angles $\theta \in [360\frac{R-1}{R}, 360]^\circ$ (for $R$ rotation steps), which combines the values at indices $R{-}1$ and $0$. Since grid_sample does not support circular padding (only zero padding), we explicitly pad its input.
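To make the wrap-around concrete, here is a minimal pure-Python sketch of the idea, not the OrienterNet code. It assumes that `scores[k]` holds the score at angle $360\frac{k}{R}$ degrees; the function names `interp_yaw_circular` and `interp_yaw_zero_padded` are hypothetical and only serve the illustration.

```python
def interp_yaw_circular(scores, theta_deg):
    """Linearly interpolate per-rotation scores at theta_deg (degrees).

    Assumption: scores[k] is the score at angle 360 * k / R, with R = len(scores).
    The input is explicitly padded with a copy of bin 0 so that angles in the
    last interval blend bins R-1 and 0.
    """
    R = len(scores)
    padded = list(scores) + [scores[0]]     # explicit circular padding
    pos = (theta_deg % 360.0) * R / 360.0   # continuous bin index in [0, R)
    lo = int(pos)                           # lower neighbor, at most R - 1
    w = pos - lo                            # weight of the upper neighbor
    return (1.0 - w) * padded[lo] + w * padded[lo + 1]


def interp_yaw_zero_padded(scores, theta_deg):
    """Same lookup, but the out-of-range bin R reads as zero (no wrap-around)."""
    R = len(scores)
    padded = list(scores) + [0.0]           # zero padding instead of circular
    pos = (theta_deg % 360.0) * R / 360.0
    lo = int(pos)
    w = pos - lo
    return (1.0 - w) * padded[lo] + w * padded[lo + 1]


scores = [1.0, 2.0, 3.0, 4.0]               # R = 4, one bin every 90 degrees
print(interp_yaw_circular(scores, 315.0))    # halfway between bins 3 and 0
print(interp_yaw_zero_padded(scores, 315.0)) # bin 0 is missing from the blend
```

With zero padding, the score in the last interval is pulled toward zero; with the extra copy of bin 0 it is a proper blend of the two neighboring rotations.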

martin-liao commented 4 months ago

Thanks for your quick reply! I have double-checked the code in the Template class here (and L38) and found that the sampling positions are $\{0, \frac{360}{R}, \dots, 360\frac{R-1}{R}\}$, so the last bin $[360\frac{R-1}{R}, 360]$ would be neglected without padding, right? One more question: the following line normalizes the GT value of xz by (size-1) instead of size. The note there is align_corners=True. I was wondering whether:

$[u,v] = s * [x,-z] - 0.5$ ($[u,v]$ in the BEV Cartesian coordinate system, $[x,z]$ in the canvas coordinate system);
each element of the score volume is located at the center of its grid cell (align_corners=False). So we discard the outermost half cell ($[min, min+\frac{g}{2}]$ and $[max-\frac{g}{2}, max]$) to align the scores to the corners?
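
For reference, the two normalization conventions being compared can be sketched in plain Python as follows. This only spells out the standard grid_sample mapping between a pixel index and a normalized coordinate in $[-1, 1]$; it does not settle how the OrienterNet code treats the outermost half cell, and the helper names `normalize_index` and `unnormalize_coord` are hypothetical.

```python
def normalize_index(i, size, align_corners):
    """Map a (possibly fractional) pixel index to a normalized coordinate.

    align_corners=True:  -1 and +1 are the centers of the first and last pixels,
                         so the index is divided by (size - 1).
    align_corners=False: -1 and +1 are the outer edges of the first and last
                         pixels, so the index is shifted by half a pixel and
                         divided by size.
    """
    if align_corners:
        return 2.0 * i / (size - 1) - 1.0
    return (2.0 * i + 1.0) / size - 1.0


def unnormalize_coord(c, size, align_corners):
    """Inverse mapping: normalized coordinate back to a pixel index."""
    if align_corners:
        return (c + 1.0) / 2.0 * (size - 1)
    return ((c + 1.0) * size - 1.0) / 2.0


size = 5
for align in (True, False):
    first = normalize_index(0, size, align)
    last = normalize_index(size - 1, size, align)
    print(align, first, last)
```

Under align_corners=True the first and last pixel centers land exactly on $-1$ and $+1$, whereas under align_corners=False the coordinate $-1$ corresponds to index $-0.5$, half a cell outside the first pixel center.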