Closed — MeowZheng closed this issue 3 years ago
Hi,
That's a good question, and It's just a design choice. At each pyramid level, the scale of the optical flow is at the original image resolution, not that of the downscaled image. (please note that the loss function uses the downscaled optical flow map without scaling the optical flow value.) So it doesn't need to rescale the flow value when upsampling the resolution.
If you want to design a decoder that outputs the downscaled flow map (both values and shape), you can revise the upsampling functions (the ones you cited above) as well as the loss function (i.e., use a GT flow map whose values and shape are both properly downscaled).
I think I've tried both settings and observed only a marginal difference in the supervised learning setting.
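For illustration, here is a minimal sketch of the two conventions being contrasted. The helper names and the use of bilinear interpolation are my own assumptions, not the repo's actual code:

```python
import torch
import torch.nn.functional as F

def upsample_flow_keep_values(flow, size):
    # Convention described above: flow values are expressed at the
    # ORIGINAL image resolution, so upsampling only changes the spatial
    # shape, never the values.
    return F.interpolate(flow, size=size, mode="bilinear", align_corners=False)

def upsample_flow_rescale_values(flow, size):
    # Alternative convention: flow values live at the local (downscaled)
    # resolution, so they must also be multiplied by the scale factor.
    h, w = flow.shape[-2:]
    new_h, new_w = size
    up = F.interpolate(flow, size=size, mode="bilinear", align_corners=False)
    up[:, 0] *= new_w / w  # x-component scales with width
    up[:, 1] *= new_h / h  # y-component scales with height
    return up
```

With the second convention, the GT flow used in the loss would need its values downscaled accordingly, as noted above.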
Hopefully, this answers your question!
Best, Jun
Many thanks for your kind reply. Sorry, I'm still confused about warping.
Could you answer the simplest question: is the flow estimated at level N, which will be used at level N-1 for warping, at the spatial resolution of level N? If yes, I think both the shape and the values of the flow at level N must be rescaled before warping, since the coordinates are rescaled. If no, then like PWC-Net, which estimates flow at the original scale at each level, the flow values should still be rescaled per level (ref), but your implementation doesn't rescale the values. (Let's ignore the 0.05 factor in this discussion.)
Thanks again for your patience. It means a lot to me.
Best regards, Meow
Hi Meow, Yes, that's correct indeed. It doesn't rescale the output flow, but it does rescale before warping.
It normalizes the flow, mapping it to [-1, 1], in order to use the warping function torch.nn.functional.grid_sample.
Basically it needs to do these two steps:
output flow -> rescale to the local scale -> divide by the height and width of the downscaled image,
but implementation-wise this is the same as:
output flow -> divide by the height and width of the original image.
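To make the equivalence concrete, here is a minimal warping sketch. The `warp` helper, the (H, W) argument, and the use of `align_corners=True` are my own assumptions for illustration, not the repo's actual function (assumes PyTorch >= 1.10 for the `indexing` keyword):

```python
import torch
import torch.nn.functional as F

def warp(feature, flow, orig_size):
    """Backward-warp `feature` (B, C, h, w) by `flow` (B, 2, h, w), whose
    values are pixel displacements at the ORIGINAL image resolution."""
    b, _, h, w = feature.shape
    H, W = orig_size
    # Base sampling grid spanning [-1, 1], grid_sample's coordinate range.
    ys, xs = torch.meshgrid(
        torch.linspace(-1.0, 1.0, h),
        torch.linspace(-1.0, 1.0, w),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0).repeat(b, 1, 1, 1)
    # Dividing by the original W/H (times 2 for the [-1, 1] range) gives
    # the same normalized offsets as rescaling the flow to the local level
    # first and then dividing by the local w/h.
    norm_flow = torch.stack(
        (2.0 * flow[:, 0] / W, 2.0 * flow[:, 1] / H), dim=-1
    )
    return F.grid_sample(feature, grid + norm_flow, align_corners=True)
```

With zero flow this reduces to the identity sampling grid, so the feature map is returned unchanged.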
Best, Jun
OK, it's all clear now! I understand your design!
Thank you, Meow
Hi Junhwa,
Many thanks for sharing this nice work. I have a small question about feature warping in PWC-Net in this repo and in IRR-PWC.
Before warping features, people usually rescale both the shape and the values of the flow, but you only upsample the flow's shape, as in https://github.com/visinf/irr/blob/4d7f6aa46d6989d7dcf8aa1213fbc64f0058e038/models/pwcnet.py#L68-L69 and https://github.com/visinf/irr/blob/4d7f6aa46d6989d7dcf8aa1213fbc64f0058e038/models/IRR_PWC.py#L82-L88
Could you please explain a little about this?
Best regards, Meow