tjiiv-cprg / EPro-PnP-v2

[TPAMI 2024] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
https://arxiv.org/abs/2303.12787
MIT License
125 stars 7 forks

Question about image shapes #9

Closed harsanyidani closed 2 months ago

harsanyidani commented 2 months ago

Hello! I have some questions about image shapes:

Thanks in advance for the help!

Lakonik commented 2 months ago

Hi! Thanks for your interest in our work.

harsanyidani commented 2 months ago

Thanks for the quick reply @Lakonik !

"CNNs are not restricted to certain resolutions, so the actual image size is 1600x672"

Yes, for some reason I thought it was restricted because of the pretrained version, but now that I think about it, I was wrong. If that is the case, though, what happens with the strides (for example, 1600 % 128 != 0)?

"For compatibility concerns, the images may be resized and zero-padded, so img_shapes, ori_shapes, batch_input_shape can be different (although for nuScenes they should be the same)."

Understandable, thanks!
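For anyone reading later, here is a minimal sketch of how zero-padding to a multiple of some divisor changes the shape. It is not taken from the EPro-PnP code: `padded_shape` is a hypothetical helper, and the divisors 32 and 128 are assumptions chosen only for illustration, since the thread does not state which value the pipeline uses.

```python
def padded_shape(height, width, divisor):
    """Return the smallest (height, width) that is >= the input
    and divisible by `divisor` in both dimensions."""
    # -(-a // b) is integer ceiling division, which avoids float rounding.
    pad_h = -(-height // divisor) * divisor
    pad_w = -(-width // divisor) * divisor
    return pad_h, pad_w


# Assumed divisors, for illustration only.
print(padded_shape(672, 1600, 32))   # (672, 1600): already divisible, no padding
print(padded_shape(672, 1600, 128))  # (768, 1664): padding would be required
```

If the divisor were 32, a 1600x672 image would need no padding at all, which would be consistent with the shapes being the same for nuScenes, but that is a guess on my part.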

Lakonik commented 2 months ago

My understanding is that a non-divisible shape is generally OK for CNNs at deep layers, where a spatial mismatch isn't that important.
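As a rough illustration of why a non-divisible size still works, the sketch below tracks the feature map size through repeated downsampling. It assumes every stage is a 3x3 convolution with stride 2 and padding 1, which is a common choice but not necessarily what the backbone in this repo uses; `conv_out` and `pyramid` are hypothetical helper names.

```python
def conv_out(n, kernel=3, stride=2, padding=1):
    """Standard convolution output-size formula along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def pyramid(n, levels):
    """Sizes after each of `levels` stride-2 stages (strides 2, 4, ..., 2**levels)."""
    sizes = []
    for _ in range(levels):
        n = conv_out(n)
        sizes.append(n)
    return sizes


print(1600 % 128)        # 64, so the width is not divisible by 128
print(pyramid(1600, 7))  # [800, 400, 200, 100, 50, 25, 13]
print(pyramid(672, 7))   # [336, 168, 84, 42, 21, 11, 6]
```

Under these assumptions the width at stride 128 comes out as 13 rather than 12.5, so the layer simply rounds up and the grid ends up slightly misaligned with the nominal stride instead of failing.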

harsanyidani commented 2 months ago

I see, that seems logical. I was just concerned about how the stride is involved in the algorithm, and about it not always being exact.