chenjun2hao / DDRNet.pytorch

This is the unofficial code for Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes, which achieves a state-of-the-art trade-off between accuracy and speed on Cityscapes and CamVid without inference acceleration or extra data.

About FPS #12

Closed: zhuzhuzhu2 closed this issue 3 years ago

zhuzhuzhu2 commented 3 years ago

Hello, I tested the FPS of DDRNet-23-slim on an RTX 2080S and it differs greatly from the number reported in the paper. Have you measured the FPS yourself? Also, the parameter file I got for DDRNet-23-slim is 23 MB, which is also far from the 5.7M in the paper. Do you know what is going on? Looking forward to your reply.

[image attachment]

ydhongHIT commented 3 years ago


What FPS did you get? Note that the model should be moved to the GPU and the BN layers should be excluded (just set the bias of the conv layers to True and remove the BN for the speed test). You can use the code at https://github.com/VITA-Group/FasterSeg/blob/master/tools/utils/darts_utils.py#L184. The 2080Ti should be a desktop version.
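For reference, a minimal sketch of such a GPU timing loop in the spirit of the linked FasterSeg compute_latency code (the exact helper there differs; the input size and iteration counts below are illustrative assumptions):

```python
import time
import torch

@torch.no_grad()
def measure_fps(model, input_size=(1, 3, 1024, 2048), warmup=10, iters=100):
    """Rough FPS measurement of a forward pass on the GPU."""
    model = model.cuda().eval()
    x = torch.randn(*input_size, device='cuda')

    # Warm-up so lazy CUDA initialization and cuDNN autotuning do not skew timing.
    for _ in range(warmup):
        model(x)
    torch.cuda.synchronize()

    start = time.time()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()

    return iters / (time.time() - start)  # frames per second
```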

ydhongHIT commented 3 years ago


A parameter file size of 23 MB is normal; 5.7M is the number of network parameters in the mathematical sense.
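Concretely, 5.7 million parameters stored as 32-bit floats take roughly 5.7e6 × 4 bytes ≈ 23 MB, so the two numbers are consistent. A quick sanity check, assuming a loaded PyTorch model object:

```python
def param_stats(model):
    """Count parameters and estimate the float32 checkpoint size in MB."""
    n_params = sum(p.numel() for p in model.parameters())
    size_mb = n_params * 4 / (1024 ** 2)  # 4 bytes per float32 parameter
    return n_params, size_mb

# For DDRNet-23-slim this should report roughly 5.7e6 parameters (about 23 MB).
```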

zhuzhuzhu2 commented 3 years ago


Thanks very much for your reply.
The FPS I mean is the prediction speed of DDRNet-23-slim. I tried to exclude the BN, but after removing it the prediction results became very bad. So do you mean I should remove all the BN layers in the model when I do the speed test?

ydhongHIT commented 3 years ago


If you want to keep the prediction results, you can refer to https://github.com/pytorch/pytorch/blob/master/torch/nn/utils/fusion.py. The 109 FPS was measured with an overclocked GPU, but you can get 102 FPS with a normal 2080Ti. I will update the FPS in the next version of the paper.
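A minimal sketch of fusing one Conv/BN pair with the helper from that file; walking a whole network this way depends on how its layers are named, so only the single-pair case is shown:

```python
import torch
import torch.nn as nn
from torch.nn.utils.fusion import fuse_conv_bn_eval

# Fold an eval-mode BatchNorm2d into the preceding Conv2d, producing a single
# Conv2d with bias whose outputs match the original Conv+BN pair.
conv = nn.Conv2d(32, 64, kernel_size=3, padding=1, bias=False).eval()
bn = nn.BatchNorm2d(64).eval()
fused = fuse_conv_bn_eval(conv, bn)

x = torch.randn(1, 32, 128, 128)
print(torch.allclose(bn(conv(x)), fused(x), atol=1e-5))  # expected: True
```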

ydhongHIT commented 3 years ago


Use the tools in https://github.com/NVIDIA-AI-IOT/torch2trt to convert the model to TensorRT; you can achieve 135 FPS with ddrnet_23_slim.
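A hedged sketch of that conversion with torch2trt; model loading is omitted, and the fp16_mode flag and 1024x2048 input are illustrative assumptions rather than settings confirmed in this thread:

```python
import torch
from torch2trt import torch2trt

def convert_to_trt(model, height=1024, width=2048, fp16=True):
    """Trace the model with a dummy input and build a TensorRT engine via torch2trt."""
    model = model.cuda().eval()
    x = torch.ones(1, 3, height, width).cuda()
    # fp16_mode typically gives a large speed-up on RTX-class GPUs.
    return torch2trt(model, [x], fp16_mode=fp16)

# Usage (loading DDRNet-23-slim and its weights is left out):
# model_trt = convert_to_trt(model)
# with torch.no_grad():
#     pred = model_trt(torch.ones(1, 3, 1024, 2048).cuda())
```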

zhuzhuzhu2 commented 3 years ago

Thanks very much. With your advice I got an FPS close to the one in the paper. DDRNet is really a great network. Thanks for your kind help.

subake commented 3 years ago

@ydhongHIT, @zhuzhuzhu2, could you please provide more detailed instructions, or maybe an example file showing how to deactivate BatchNorm? I managed to fuse Conv and BatchNorm everywhere except the DAPPM and SegHead modules. I got a maximum of 80 FPS with a tensor of size [1, 3, 1024, 2048].
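For what it's worth, a generic walker like the sketch below (built on the same fuse_conv_bn_eval helper) only catches Conv2d/BatchNorm2d pairs that are adjacent in a module's definition order; branches where the BN does not directly follow its Conv, which may be the case inside DAPPM and the segmentation head, would still need manual handling:

```python
import torch.nn as nn
from torch.nn.utils.fusion import fuse_conv_bn_eval

def fuse_adjacent_conv_bn(module):
    """Recursively fuse each Conv2d with a BatchNorm2d that directly follows it
    in the child order, replacing the BN with an Identity (eval mode only)."""
    prev_name, prev_child = None, None
    for name, child in list(module.named_children()):
        if isinstance(child, nn.BatchNorm2d) and isinstance(prev_child, nn.Conv2d):
            setattr(module, prev_name, fuse_conv_bn_eval(prev_child, child))
            setattr(module, name, nn.Identity())
        else:
            fuse_adjacent_conv_bn(child)
        prev_name, prev_child = name, child
    return module

# Usage: model = fuse_adjacent_conv_bn(model.eval())
```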

zhuzhuzhu2 commented 3 years ago

I am sorry, but I used a different GPU, so I don't think my result would help you. Actually, I found that fusing Conv and BatchNorm only improves the speed a little. Maybe my fusion code is also not good.

Bruce-Si commented 2 years ago


Thank you for your great work. I tried the code for the speed test. The latency is about 10 ms per frame on a V100 GPU, which is great. But when I call model(input) on a real video, the latency ranges from 10 ms to 200 ms in the same environment, which is confusing... Have you met the same problem?
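One possible explanation (not confirmed in this thread) is that CUDA kernels run asynchronously, so timing model(input) without synchronization mostly measures launch and queueing overhead rather than the actual compute, and frame loading or preprocessing in the video loop can also leak into the measurement. A hedged per-frame timing sketch, assuming the frame is already a preprocessed CUDA tensor:

```python
import time
import torch

@torch.no_grad()
def timed_forward(model, frame):
    """Time one forward pass, synchronizing so the GPU work is fully included."""
    torch.cuda.synchronize()
    start = time.time()
    out = model(frame)  # frame: preprocessed [1, 3, H, W] CUDA tensor
    torch.cuda.synchronize()
    return out, (time.time() - start) * 1000.0  # latency in milliseconds
```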