csrhddlam / axial-deeplab

This is a PyTorch re-implementation of Axial-DeepLab (ECCV 2020 Spotlight)
https://arxiv.org/abs/2003.07853
Apache License 2.0

Question about table 9 in paper #23

Closed CoinCheung closed 3 years ago

CoinCheung commented 3 years ago

Hi,

Thanks for the work. I noticed from Table 9 in the paper that the performance is relatively stable regardless of whether the output stride is 16 or 32, and regardless of whether the decoder is the axial decoder. Have you noticed this in practice? Does this mean we can simply use an output stride of 32 without the axial decoder, which would make the model much more lightweight?

csrhddlam commented 3 years ago

Output stride 16 vs. 32: Yes, we noticed it in practice. However, we also noticed that this conclusion does not generalize to COCO. If I remember correctly, output stride 16 is better than 32 on COCO.

Axial-Decoder vs. Conv-Decoder: They do perform similarly. Separable conv seems good at decoding (and is lightweight).
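The "lightweight" point can be made concrete with a parameter count. A minimal sketch (not the repo's code) comparing a standard conv layer against a depthwise-separable one, using the standard formulas k·k·C_in·C_out vs. k·k·C_in + C_in·C_out; the channel counts below are illustrative, not taken from the paper:

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def sep_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Parameters in a depthwise-separable conv:
    a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    return k * k * c_in + c_in * c_out

# Illustrative decoder layer: 3x3 conv, 256 -> 256 channels.
full = conv_params(3, 256, 256)      # 589824
sep = sep_conv_params(3, 256, 256)   # 67840
print(full, sep, round(full / sep, 1))  # roughly 8.7x fewer parameters
```

So even when the two decoders score similarly, the separable-conv decoder is much cheaper, which matches the comment above.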

CoinCheung commented 3 years ago

Thanks for telling me this. By the way, the code only supports an input size of 224 x 224; if I want inputs with other resolutions, I should modify the associated code, right?

csrhddlam commented 3 years ago

Right, you might need to pass the new resolution into the model as the kernel sizes, since the axial-attention spans (and their positional embeddings) are sized to the feature maps.
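To illustrate the idea: each stage's axial-attention span equals the feature-map side length at that stage, i.e. the input resolution divided by the stage's cumulative stride. The helper below is a hypothetical sketch (not from the repo); the stage strides are the usual ResNet-style values, with the last stage at either stride 16 (dilated) or 32, matching the output-stride discussion above:

```python
def axial_spans(input_size: int, output_stride: int = 16) -> list[int]:
    """Feature-map side lengths (= axial-attention kernel sizes) per stage,
    assuming ResNet-style cumulative strides of 4, 8, 16 for the first
    three stages; the last stage is dilated (stride 16) or strided (32)."""
    stage_strides = (4, 8, 16, 16 if output_stride == 16 else 32)
    assert all(input_size % s == 0 for s in stage_strides), \
        "input size should be divisible by every stage stride"
    return [input_size // s for s in stage_strides]

print(axial_spans(224, output_stride=16))  # [56, 28, 14, 14]
print(axial_spans(224, output_stride=32))  # [56, 28, 14, 7]
print(axial_spans(640, output_stride=16))  # [160, 80, 40, 40]
```

With a helper like this, changing the input resolution amounts to recomputing these spans and passing them to the model as the kernel sizes, rather than relying on the hard-coded 224 x 224 values.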