I measured inference time again, on an RTX 3090:
< python tools/benchmark.py configs/mobilenet_v2/fcn_m-v2-d8_512x1024_80k_cityscapes.py >
< python tools/benchmark.py configs/mobilenet_v2/pspnet_m-v2-d8_512x1024_80k_cityscapes.py >
< python tools/benchmark.py configs/mobilenet_v2/deeplabv3_m-v2-d8_512x1024_80k_cityscapes.py >
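For context on what these numbers mean: mmseg's `tools/benchmark.py` essentially runs a warmup phase and then a timed loop over the dataset. A minimal sketch of that timing structure (`infer_fn` is a placeholder for one forward pass; on a GPU you would also call `torch.cuda.synchronize()` before reading the clock):

```python
import time

def benchmark_fps(infer_fn, num_warmup=5, num_iters=20):
    """Measure average throughput (images/s) of `infer_fn`.

    Mirrors the structure of mmseg's tools/benchmark.py: warmup
    iterations are excluded, then a timed loop measures the rest.
    """
    for _ in range(num_warmup):
        infer_fn()  # warmup: cudnn autotune, memory allocator, etc.
    start = time.perf_counter()
    for _ in range(num_iters):
        infer_fn()  # one forward pass per iteration
    elapsed = time.perf_counter() - start
    return num_iters / elapsed

# toy usage: a fake "model" that sleeps 10 ms per image,
# so the measured throughput comes out a little under 100 fps
fps = benchmark_fps(lambda: time.sleep(0.01), num_warmup=1, num_iters=10)
print(f"{fps:.1f} fps")
```

The warmup matters: the first few GPU iterations include one-time costs (kernel selection, allocation), so including them would understate the fps.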
I am uploading the DeepLabv3 test image.
In the paper, DeepLabv3+ (MobileNetV2) runs at 8.4 fps, but when I measure inference time I get 15.24 fps.
I checked the image shape before the backbone forward pass: it is (3, 1024, 2048).
FCN on the 3090: 20.52 fps
PSPNet on the 3090: 18.12 fps
How do the FCN, PSPNet, etc. models measure inference time? Why is it so different?
I also checked the inference time of SegFormer-B0: it is 13.68 fps.
< python tools/benchmark.py local_configs/segformer/B0/segformer.b0.1024x1024.city.160k.py >
I checked the image shape before the backbone (Mix Transformer) forward pass: it is (3, 1024, 1024),
but the image coming out of the dataloader has shape (3, 1024, 2048).
Is the image cropped before inference in SegFormer?
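A likely explanation for the (3, 1024, 1024) shape: if the test config uses slide inference (the SegFormer Cityscapes configs set `mode='slide'`; crop 1024x1024 and stride 768x768 are my assumption from the config name), each 1024x2048 image is split into overlapping 1024x1024 crops, and the backbone sees one crop at a time. A sketch of the window arithmetic, following the grid computation used in mmseg's `slide_inference`:

```python
def slide_windows(h_img, w_img, h_crop, w_crop, h_stride, w_stride):
    """Number of crops slide inference runs over one image.

    Same grid arithmetic as EncoderDecoder.slide_inference in mmseg:
    windows are placed every `stride` pixels, rounded up to cover the
    full image.
    """
    h_grids = max(h_img - h_crop + h_stride - 1, 0) // h_stride + 1
    w_grids = max(w_img - w_crop + w_stride - 1, 0) // w_stride + 1
    return h_grids * w_grids

# assumed SegFormer-B0 Cityscapes test settings: crop 1024x1024, stride 768x768
print(slide_windows(1024, 2048, 1024, 1024, 768, 768))  # → 3
```

So under these assumptions each benchmark "image" actually costs three 1024x1024 forward passes, while the MobileNetV2 models run a single whole-image pass at 1024x2048, which makes the per-image fps numbers hard to compare directly.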
I know the V100 and the RTX 3090 have different performance, but this is strange:
FCN, PSPNet, etc. speed up on the 3090, while SegFormer-B0 slows down?
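To make the discrepancy concrete, the ratios of measured to reported throughput (all numbers taken from this thread) work out to roughly 1.4-1.8x faster for the CNN models, but only about 0.18x for SegFormer-B0:

```python
# Paper (V100) vs measured (RTX 3090) throughput, in fps
paper_fps = {"FCN": 14.2, "PSPNet": 11.2, "DeepLabv3+": 8.4, "SegFormer-B0": 76.2}
fps_3090 = {"FCN": 20.52, "PSPNet": 18.12, "DeepLabv3+": 15.24, "SegFormer-B0": 13.68}

for name in paper_fps:
    ratio = fps_3090[name] / paper_fps[name]
    print(f"{name}: {ratio:.2f}x vs paper")
```

A consistent hardware difference would shift all four models in the same direction, which is why the SegFormer-B0 result points to a measurement-setup difference (e.g. input resolution or inference mode) rather than the GPU.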
Hi, have you solved the problem? I ran into the same issue on a 3090: DeepLabv3+ (MobileNetV2) is faster than the numbers in the paper.
And I used mmseg in the same way.
Hello! First of all, thank you for the great paper. I have a question about model inference speed and input size.
In the paper, on Cityscapes only:
FCN (MobileNetV2 encoder): 14.2 fps
PSPNet (MobileNetV2 encoder): 11.2 fps
DeepLabv3+ (MobileNetV2 encoder): 8.4 fps
SegFormer-B0: 76.2 fps
I assume these were measured on a single V100 machine.