visinf / irr

Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation (CVPR 2019)
Apache License 2.0

Changing Network Structure #25

AliKafaei closed this issue 4 years ago

AliKafaei commented 4 years ago

Hi, I really appreciate your work and believe it has a high impact on the field. I have high-frequency data and have used the original PWC-Net. Because of the two downsampling steps (the final output is 1/4 of the input resolution), the results were not accurate enough. In my previous paper, I replaced stride=2 with stride=1. The results improved significantly, but the network can no longer estimate large displacements. To solve this, I want 5 output levels instead of 4. I have seen where to change this, but changing only the number of outputs does not work. What else do I need to change? (I use PWC-Net+irr, which does not have occlusion detection or the upsampling layer.) Thanks in advance for your response.
Regards, Ali
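The trade-off described above (stride 1 keeps detail but loses large displacements) can be illustrated with rough arithmetic. This is only a sketch under assumptions that are not stated in the thread: a cost-volume search range of 4 feature pixels, six stride-2 stages in the standard extractor, and two of them switched to stride 1. The function name `coarsest_level_reach` is hypothetical, and the number is a rule of thumb for the coarsest level only, since finer levels add residual motion on top.

```python
from math import prod

def coarsest_level_reach(strides, search_range=4):
    """Rough displacement (in input pixels) covered by the coarsest cost volume.

    strides: stride of each downsampling stage of the feature extractor (assumed).
    search_range: cost-volume radius in feature pixels (4 is an assumption).
    """
    # One feature pixel at the coarsest level spans prod(strides) input pixels.
    return search_range * prod(strides)

# Assumed standard setup: six stride-2 stages.
standard = coarsest_level_reach([2, 2, 2, 2, 2, 2])
# Illustrative modification: the first two stages use stride 1 instead.
modified = coarsest_level_reach([1, 1, 2, 2, 2, 2])

print(f"standard reach: about {standard} px, modified reach: about {modified} px")
```

Under these assumptions each stride-2 stage that is removed halves the reach, which matches the observation that large displacements are lost and motivates adding a coarser pyramid level instead.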

hurjunhwa commented 4 years ago

Hi Ali, How about adding two more decoder levels so that the network estimates flow up to the original resolution? You can try this with the original PWC-Net reimplementation, https://github.com/visinf/irr/blob/f0eba07773806941853d7cb7f7ad28307a288a20/models/pwcnet.py#L17, by setting self.output_level=6.

If you want to use PWC-Net + irr, you additionally need to add a 1x1 convolution layer for each of the two extra levels: https://github.com/visinf/irr/blob/f0eba07773806941853d7cb7f7ad28307a288a20/models/pwcnet_irr.py#L31
Best, Jun
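To make the suggestion concrete, here is a small sketch of how output_level might map to output resolution. It relies on two data points from this thread (output_level=4 gives 1/4 of the input, output_level=6 gives the original resolution) and assumes a factor of 2 between neighboring levels. The helper `downsampling_factor` is hypothetical and not part of the repository.

```python
def downsampling_factor(output_level, full_resolution_level=6):
    """Downsampling factor of the final flow for a given output_level.

    Assumes level `full_resolution_level` is the input resolution and each
    lower level halves the resolution (an assumption, not repository code).
    """
    if not 0 <= output_level <= full_resolution_level:
        raise ValueError("output_level must lie between 0 and full_resolution_level")
    return 2 ** (full_resolution_level - output_level)

for level in (4, 5, 6):
    factor = downsampling_factor(level)
    print(f"output_level={level}: flow at 1/{factor} of the input resolution")
```

If this reading is right, output_level=5 would stop at 1/2 resolution, which requires one extra decoder level (and, for PWC-Net + irr, presumably one extra 1x1 convolution) rather than two.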

AliKafaei commented 4 years ago

Thanks for your reply. For robustness reasons, 1/2 of the input resolution sounds sufficient for now. I think I also need to add a weight to the loss function. The original PWC-Net (the authors' implementation) has 5 output levels, but your implementation appears to have 4, and I cannot understand why they differ. The loss function has 5 weights, self._weights = [0.005, 0.01, 0.02, 0.08, 0.32], but there are 4 output levels. Best, Ali

hurjunhwa commented 4 years ago

Well, sorry if the code confused you. self.output_level = 4 means that the for-loop iterates over pyramid levels [0, 1, 2, 3, 4], which gives the same 5 levels. I guess it would be much clearer if you try reading the code...
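The counting can be checked with a minimal sketch of such a loop. It assumes the loop breaks once the current level equals output_level and that the pyramid has 7 levels; `visited_levels` is a hypothetical helper and not the repository's code. The weights are the ones quoted earlier in this thread.

```python
def visited_levels(output_level, num_pyramid_levels=7):
    """Levels visited by a coarse-to-fine loop that stops at output_level.

    num_pyramid_levels=7 is an assumption made for illustration.
    """
    visited = []
    for level in range(num_pyramid_levels):
        visited.append(level)  # one flow estimate is produced per visited level
        if level == output_level:
            break
    return visited

weights = [0.005, 0.01, 0.02, 0.08, 0.32]  # loss weights quoted in this thread

levels = visited_levels(4)
print(levels, len(levels) == len(weights))

# Moving to output_level=5 yields six outputs, so five weights no longer suffice.
print(len(visited_levels(5)), "outputs vs", len(weights), "weights")
```

So output_level is the index of the last level and the number of outputs is output_level + 1. Raising it by one means the loss needs one more weight, whose value is a training choice that this thread does not specify.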