birdwcp closed this issue 2 years ago
Yes, compared with Plain, the accuracy of Dense is only slightly higher, and it is slower. However, its parameter usage is more efficient.
The advantage of DSOD mainly comes from the backbone design, not the prediction layers.
@liuzhuang13 thanks. After submitting this issue I realized I had made a mistake. As you say, that is just the difference between the dense and plain prediction layers. I should compare these two lines instead:

SSD300S† 07+12 ✗ VGGNet Plain 46 26.3M 300×300 69.6

DSOD300 07+12 ✗ DS/64-192-48-1 Plain 20.6 18.2M 300×300 77.3
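To make the comparison concrete, here is a minimal sketch that computes the differences between the Table 4 rows quoted in this thread. The numbers are copied from those rows; the dictionary keys and the `compare` helper are hypothetical names introduced only for illustration, and reading the three numeric columns as speed in fps, parameter count, and mAP is an assumption based on how the rows are discussed here.

```python
# Rows copied from the Table 4 excerpts quoted in this thread.
# Assumed column meaning: speed (fps), parameters (millions), mAP (%).
ROWS = {
    "ssd_plain": {"fps": 46.0, "params_m": 26.3, "map": 69.6},
    "ssd_dense": {"fps": 37.0, "params_m": 26.0, "map": 70.4},
    "dsod_plain": {"fps": 20.6, "params_m": 18.2, "map": 77.3},
}


def compare(name, baseline="ssd_plain"):
    """Return the mAP gain, parameters saved, and speed ratio vs. the baseline row."""
    row, base = ROWS[name], ROWS[baseline]
    return {
        "map_gain": round(row["map"] - base["map"], 1),
        "params_saved_m": round(base["params_m"] - row["params_m"], 1),
        "speed_ratio": round(row["fps"] / base["fps"], 2),
    }


if __name__ == "__main__":
    for name in ("ssd_dense", "dsod_plain"):
        print(name, compare(name))
```

On these numbers, swapping only the prediction layer (Plain to Dense) changes mAP by under one point, while swapping the backbone accounts for most of the gap, which matches the point that the advantage comes mainly from the backbone design.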
In my experiment, SSD with VGG can also be trained to reach 73% without pretraining, whereas your paper reports only 69.6%.
@lzx1413, thanks for pointing this out. I checked our log file. The original SSD uses a batch size of 32, and we only doubled it to 64 for both SSD and SSD (dense) trained from scratch, whereas in another issue (https://github.com/szq0214/DSOD/issues/9) I recommend using 128 for SSD from scratch. I think this is a reasonable improvement when increasing the batch size to 128. I will clarify this in our revision.
SSD300S† 07+12 ✗ VGGNet Plain 46 26.3M 300×300 69.6

SSD300S† 07+12 ✗ VGGNet Dense 37 26.0M 300×300 70.4
In Table 4 of your paper, Dense SSD seems to have no advantage over VGG SSD: the precision is similar, but it is slower.