twangnh closed 2 years ago
Hi @twangnh, the cost-performance trade-off curves in Figure 6 should answer most of your question, although not exactly :) We compared performance (Y-axis) at different downsampling resolutions, but as the measure of inference cost (X-axis) we used the fraction of sampled pixels, i.e. the low-resolution image size relative to the original high-resolution image size, rather than inference time. We didn't measure inference time, but it should be proportional to the low-resolution image size.
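To make the cost measure concrete, here is a minimal sketch of how the "fraction of sampled pixels" could be computed for a few low-resolution sizes. The original resolution of 1024x2048 and the function names are assumptions for illustration only and are not taken from the paper or the repository; substitute your own dataset's image size. The sketch counts pixels only, so it does not include any overhead from the saliency prediction or the sampling step itself.

```python
# Rough sketch: pixel-fraction cost proxy for different downsampling sizes.
# FULL_RES is an ASSUMED original image size (hypothetical); replace it
# with the actual resolution of your dataset.
FULL_RES = (1024, 2048)

# Example low-resolution sizes (height, width) to compare.
CANDIDATES = [(64, 128), (80, 160), (100, 200), (128, 256)]


def pixel_fraction(low_res, full_res):
    """Fraction of sampled pixels: low-res pixel count / full-res pixel count."""
    low_h, low_w = low_res
    full_h, full_w = full_res
    return (low_h * low_w) / (full_h * full_w)


def relative_costs(candidates, full_res, baseline):
    """Map each size to (pixel fraction, cost relative to the baseline size).

    The relative cost assumes inference cost is proportional to the number
    of low-resolution pixels, which is an approximation.
    """
    base = pixel_fraction(baseline, full_res)
    result = {}
    for res in candidates:
        frac = pixel_fraction(res, full_res)
        result[res] = (frac, frac / base)
    return result


if __name__ == "__main__":
    costs = relative_costs(CANDIDATES, FULL_RES, baseline=(64, 128))
    for (h, w), (frac, rel) in costs.items():
        print(f"{h}x{w}: fraction={frac:.5f}, relative cost={rel:.2f}x")
```

Under the proportionality assumption, a uniform baseline at 128x256 would cost about four times as much as any method operating at 64x128, so the fair comparison is between points at the same position on the X-axis of the trade-off curve.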
Hi Jin @lxasqjc, thanks for sharing the work and for the timely reply. I have a question about practical usage: have you compared the method to uniform sampling at a higher resolution? For example, the method downsamples the image to 64x128 and achieves performance a with an inference time of t1. Another baseline could uniformly downsample the image to a higher resolution (e.g., 80x160, 100x200, or 128x256) and achieve performance b with an inference time of t2. Would b be higher than a while t2 is less than or comparable to t1? I ask because the added saliency map prediction and the whole deformed sampling process also add some inference cost.
Thanks in advance and looking forward to your reply!