Closed by JUGGHM 1 year ago
Hi JUGGHM, the answer is no. Although a 3x3 conv can be faster than a 7x7 conv, multi-level fusion brings in much more memory cost. RevCol is also deeper and narrower than ConvNeXt/Swin/ViT, which can make it slow.
Here is a similar question from a reviewer, followed by our reply.
"Real throughput/latency needs to be measured to more accurately validate the model budget, not just FLOPs or params. The introduced connections seem to introduce larger latency on real hardware which is not so related to FLOPs numbers."
Model | #Blocks | Latency/ms | Δ |
---|---|---|---|
ConvNeXt-L | 3,3,27,3 | 78.3 | |
ConvNeXt-L (deep) | 8,16,48,16 | 100.5 | 0% |
RevCol-L | 8,16,48,16 | 119.9 | 19.89% |
RevCol-L - upsample | 8,16,48,16 | 111.8 | 11.79% |
RevCol-L - upsample - downsample | 8,16,48,16 | 103.8 | 3.79% |
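For anyone who wants to check latency on their own hardware, as the reviewer suggests, here is a minimal timing sketch. It is not the script used for the table above. The helper name `measure_latency_ms`, its parameters, and the stand-in workload are assumptions made for illustration. On a GPU you would pass something like `torch.cuda.synchronize` as `sync`, so that asynchronous kernels finish before the clock is read.

```python
import statistics
import time
from typing import Callable, Optional


def measure_latency_ms(
    fn: Callable[[], object],
    warmup: int = 10,
    iters: int = 50,
    batch_size: int = 1,
    sync: Optional[Callable[[], None]] = None,
) -> dict:
    """Time fn() and return latency statistics in milliseconds.

    Hypothetical helper: `sync` is an optional hook that is called before the
    timer is stopped (e.g. torch.cuda.synchronize for GPU workloads).
    """
    if iters < 1:
        raise ValueError("iters must be at least 1")

    # Warm-up runs are discarded so one-time costs (caches, lazy
    # initialisation, autotuning) do not distort the measurement.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for pending asynchronous work before stopping the timer
        samples.append((time.perf_counter() - start) * 1e3)

    median = statistics.median(samples)
    return {
        "median_ms": median,
        "mean_ms": statistics.fmean(samples),
        "min_ms": min(samples),
        # Samples processed per second, based on the median latency.
        "throughput_per_s": batch_size * 1e3 / median if median > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with e.g. `lambda: model(batch)`.
    stats = measure_latency_ms(lambda: sum(i * i for i in range(100_000)))
    print(f"median latency: {stats['median_ms']:.2f} ms")
```

Reporting the median rather than the mean makes the result less sensitive to occasional slow iterations. The batch size, input resolution, and precision must match across models for the comparison to be fair.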
Thank you for your detailed reply!
Thank you for your impressive work, Tsai! I am wondering whether there are any latency comparisons against other convnet/transformer models. Since the network is built from efficient 3x3 convolutions and linear operators, I would expect it to have better throughput.