Closed — Etienne66 closed this issue 2 years ago
Don't forget that the input to backwarp isn't normalized images but features, and I also appended an additional channel: https://github.com/sniklaus/pytorch-pwc/blob/7023ad121644cd460fc72c8bb38a9fa988fe7398/run.py#L57
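For illustration, here is a minimal sketch of that mechanism (the function name `backwarp_with_mask` and the grid construction are my own simplification, not the repo's exact code): a channel of ones is concatenated to the features before warping, so the warped auxiliary channel records how much of each output pixel was sampled from inside the image.

```python
import torch
import torch.nn.functional as F

def backwarp_with_mask(features, flow):
    # Append a channel of ones so the warped auxiliary channel tells us
    # which output pixels were sampled entirely from inside the image.
    n, c, h, w = features.shape
    ones = features.new_ones(n, 1, h, w)
    inp = torch.cat([features, ones], dim=1)

    # Build the sampling grid from the flow, normalized to [-1, 1]
    # as grid_sample expects (with align_corners=True conventions).
    gy, gx = torch.meshgrid(
        torch.arange(h, dtype=features.dtype),
        torch.arange(w, dtype=features.dtype),
        indexing='ij')
    gx = gx[None] + flow[:, 0]
    gy = gy[None] + flow[:, 1]
    gx = 2.0 * gx / max(w - 1, 1) - 1.0
    gy = 2.0 * gy / max(h - 1, 1) - 1.0
    grid = torch.stack([gx, gy], dim=3)

    out = F.grid_sample(inp, grid, mode='bilinear',
                        padding_mode='zeros', align_corners=True)

    # The auxiliary channel stays ~1 only where all sampled pixels were
    # in-bounds; threshold it, zero the rest, and drop the extra channel.
    mask = (out[:, -1:] > 0.999).float()
    return out[:, :-1] * mask
```

With zero flow this reduces to the identity; with flow that points outside the image, the affected pixels come back as exact zeros rather than the attenuated values grid_sample would otherwise produce.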
Anyway, the reason I added this additional channel and mask the output based on it is that the goal of this repository is to mimic the original Caffe implementation in PyTorch. That includes being able to take the weights from the model trained in Caffe and execute them in PyTorch, and for that to work, all details need to match. The original Caffe implementation uses a custom backwarp that only warps a pixel if all four pixels being sampled reside inside the image: https://github.com/NVlabs/PWC-Net/blob/185b0e2beb45ad029bb66d818812f8dcc2aed9c6/Caffe/warping_code/warp_layer.cu#L54-L85
The auxiliary channel/mask that I introduced is used to zero out pixels that sample from the boundary where the original implementation yields zero but PyTorch's grid sampler yields something different. If you train a PWC-Net from scratch then you can safely remove this mechanism.
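The boundary discrepancy is easy to demonstrate (a toy setup I made up for illustration, not taken from either repo): with `padding_mode='zeros'`, grid_sample blends out-of-bounds samples with zeros, so a pixel sampled half outside the image comes back attenuated, whereas the Caffe kernel returns exactly zero unless all four neighbours are inside.

```python
import torch
import torch.nn.functional as F

img = torch.ones(1, 1, 1, 4)  # a 1x4 row of ones

# Sample half a pixel past the right edge: x = 3.5 in pixel coordinates.
x = 2.0 * 3.5 / (4 - 1) - 1.0  # normalized coordinate, slightly > 1
grid = torch.tensor([[[[x, 0.0]]]])

out = F.grid_sample(img, grid, mode='bilinear',
                    padding_mode='zeros', align_corners=True)
print(out.item())  # 0.5: half the sample came from outside, not the 0
                   # the Caffe warp layer would return here
```

The auxiliary mask converts that 0.5 back into the 0 the Caffe weights were trained against.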
Thanks @sniklaus, I'm training from scratch and I did notice a slight improvement in the losses when I removed that. Glad to know it is safe to remove for a fresh training.
If I'm reading this correctly, it looks at the final channel (which started as all ones), keeps only positions where the value is greater than 0.999, applies that mask to the remaining channels, and then drops the final channel so the original number of channels is returned.
It makes some sense if you assume the images are all in the range 0 to 1, but I have normalized my input to roughly -2 to 2, so I'm wondering if I should remove the mask.
Thanks in advance, Etienne66