Closed BebDong closed 4 years ago
You may have misunderstood what I was trying to say. By "11.8M parameters" I mean the number of parameters, not the model's size on disk.
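To make the distinction concrete, here is a minimal sketch of how a parameter count relates to raw storage. It assumes every weight is stored as a 32-bit float with no compression or checkpoint metadata; the helper name `checkpoint_size_bytes` is made up for illustration.

```python
def checkpoint_size_bytes(num_params: int, bytes_per_param: int = 4) -> int:
    """Approximate raw weight storage, assuming float32 (4 bytes) per parameter."""
    return num_params * bytes_per_param


num_params = 11_800_000  # the parameter count discussed in this thread
size_mb = checkpoint_size_bytes(num_params) / 1e6  # decimal megabytes
print(f"{num_params / 1e6:.1f}M parameters ~ {size_mb:.1f} MB as float32")
```

So "11.8M" as a parameter count and "11.8 MB" as a file size would describe very different models.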
Sorry...
ResNet-18 has about 10.6M parameters. As for "we can see there are at least three extra 1x1 Convs and three 3x3 Convs, and even an SPP module": those layers have few channels, so they add few parameters.
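As a cross-check on the backbone figure, the count can be worked out by hand. The sketch below assumes the standard ResNet-18 layout (7x7 stem, four stages of two basic blocks, bias-free convolutions, batch norm with learnable scale and shift); the exact total depends on which layers are counted, for example whether the 1000-class ImageNet classifier is included. The 128-channel width in the last line is a hypothetical value chosen only to illustrate how cheap a narrow extra layer is; it is not taken from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weights of a k x k convolution without bias."""
    return c_in * c_out * k * k


def bn_params(c):
    """Learnable scale and shift of a batch-norm layer."""
    return 2 * c


def basic_block_params(c_in, c_out, downsample):
    """Two 3x3 convs with batch norm, plus an optional 1x1 projection shortcut."""
    n = conv_params(c_in, c_out, 3) + bn_params(c_out)
    n += conv_params(c_out, c_out, 3) + bn_params(c_out)
    if downsample:
        n += conv_params(c_in, c_out, 1) + bn_params(c_out)
    return n


def resnet18_params(num_classes=1000, include_fc=True):
    n = conv_params(3, 64, 7) + bn_params(64)  # stem
    c_in = 64
    for c_out in (64, 128, 256, 512):
        # First block of a stage changes width (except stage 1), second keeps it.
        n += basic_block_params(c_in, c_out, downsample=(c_in != c_out))
        n += basic_block_params(c_out, c_out, downsample=False)
        c_in = c_out
    if include_fc:
        n += 512 * num_classes + num_classes  # final linear layer with bias
    return n


print(resnet18_params())                  # 11689512 (with ImageNet classifier)
print(resnet18_params(include_fc=False))  # 11176512 (feature extractor only)
print(conv_params(128, 128, 3))           # 147456 (hypothetical narrow 3x3 conv)
```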
The biggest problem for me is that I don't know how to train the model. I can only reach 71.1% mIoU at 1024×2048 resolution.
@BebDong Code for counting the model's parameters is provided here. Feel free to double-check the reported number of parameters. Pseudocode for measuring model inference speed is provided in the conference paper (Fig. 5).
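For readers without the paper at hand, a generic timing loop looks roughly like the sketch below. This is not the authors' pseudocode from Fig. 5; it is a common pattern (warm-up, then timed runs, with device synchronization around the timed region). The function name `measure_fps` and its parameters are assumptions made for illustration, and the workload is a stand-in so the snippet runs without a GPU.

```python
import time


def measure_fps(run_once, n_warmup=10, n_runs=100, sync=None):
    """Average calls per second of run_once().

    sync: optional callable that blocks until pending device work is done
    (with PyTorch on a GPU this would be torch.cuda.synchronize). Without it,
    asynchronous GPU kernel launches make the measured time too short.
    """
    for _ in range(n_warmup):  # warm-up: exclude one-time setup costs
        run_once()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(n_runs):
        run_once()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


# Stand-in workload; replace with a forward pass on a fixed 2048x1024 input.
fps = measure_fps(lambda: sum(range(10_000)), n_warmup=2, n_runs=20)
print(f"{fps:.1f} FPS")
```

Reported FPS figures are only comparable when the input size, batch size, and hardware match, so those should be stated alongside any number obtained this way.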
@dxjundersky training code will be released before Dec 18th.
@orsic Thank you very much!
I am wondering how you obtained a SwiftNetRN-18 model with only 11.8M parameters. Even the most basic ResNet-18-based FCN32s for the Cityscapes dataset contains nearly 11.8M parameters. From the paper In Defense of Pre-trained ImageNet Architectures for Real-time Semantic Segmentation of Road-driving Images, we can see there are at least three extra 1x1 Convs and three 3x3 Convs, and even an SPP module.
Besides, with the U-shaped structure, it seems unlikely to reach over 39 FPS with 2048x1024 input on a single GTX 1080Ti.
Besides the code to reproduce the mIoU, could you please also share the techniques needed to reach the reported speed and model size?