lliuz / ARFlow

The official PyTorch implementation of the paper "Learning by Analogy: Reliable Supervision from Transformations for Unsupervised Optical Flow Estimation".
MIT License

What is the recommended test_shape value? #34

Open NagabhushanSN95 opened 3 years ago

NagabhushanSN95 commented 3 years ago

I'm trying to apply ARFlow to the Sintel dataset, whose images have shape (436, 1024). What is the recommended test_shape value?

For the example given, the images have shape (375, 1242), but the test_shape used is (384, 640). I don't understand how you arrived at that value.

NagabhushanSN95 commented 3 years ago

My bad. You've mentioned 448x1024 for Sintel. Still, I would like to know if there is a heuristic behind choosing this size. For example, what test_shape would you choose for frames of shape (240, 320) or (1080, 1920)?

jeffbaena commented 3 years ago

Dear @NagabhushanSN95, thanks for pointing this out; it helped me get the network running on Sintel. My guess is that the input shape should be dyadic (divisible by 2) to make sure the filters span the entire frame.

NagabhushanSN95 commented 3 years ago

Not just 2. The shape should be divisible by 32. But I don't see a pattern in how test_shape is chosen.

jeffbaena commented 3 years ago

Yes, sorry, you are right, it should not be just two. To be honest I don't know the exact minimum divisor. Anyway, the number makes sense for Sintel: the original size (436, 1024) is not divisible by 32 without remainder, whereas (448, 1024) is.
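
For example, if the rule is simply "round each dimension up to the nearest multiple of 32" (my assumption, not something I found stated in the repo), a quick sketch to check candidate shapes could be:

```python
def round_up_shape(h, w, multiple=32):
    """Round each dimension up to the nearest multiple (32 here)."""
    new_h = ((h + multiple - 1) // multiple) * multiple
    new_w = ((w + multiple - 1) // multiple) * multiple
    return new_h, new_w

print(round_up_shape(436, 1024))   # (448, 1024), the Sintel test_shape
print(round_up_shape(240, 320))    # (256, 320)
print(round_up_shape(1080, 1920))  # (1088, 1920)
```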

If you look at other repositories, similar values have been used, e.g. https://github.com/princeton-vl/RAFT

I hope this helps

NagabhushanSN95 commented 3 years ago

Oh! Okay, thanks. I'm planning to use ARFlow on the UCF-101 dataset, whose resolution is 320x240 (frame shape (240, 320)). I'm wondering if 320x256, i.e. test_shape (256, 320), is a good value, or whether it should be something else.

jeffbaena commented 3 years ago

I am not the author of this paper, but in my view it should be okay. However, if you use the pretrained model you should be careful: if I am not mistaken, the pretrained models give a very high EPE (see https://github.com/lliuz/ARFlow/issues/35).

NagabhushanSN95 commented 3 years ago

Yes. With the pre-trained models, I got the best reconstruction error only when using test_shape (448, 1024). But when training, I don't see the point of blowing up (240, 320) frames to (448, 1024). On the other hand, I've read in some places that upscaling helps. So I wanted to know if the authors have some intuition or heuristic for selecting the test_shape.
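
For what it's worth, here is roughly how I handle the resolution mismatch at inference time (just a sketch, not the authors' code; `model` stands for whichever pretrained checkpoint is loaded, and I'm assuming it returns flow at the test_shape resolution): resize the frames to test_shape, run the network, then resize the flow back to the original resolution and rescale the displacement values.

```python
import torch
import torch.nn.functional as F

def estimate_flow(model, img1, img2, test_shape=(448, 1024)):
    """img1, img2: (1, 3, H, W) tensors; returns flow at the original (H, W)."""
    _, _, h, w = img1.shape
    th, tw = test_shape
    # Resize inputs to the network-friendly shape (divisible by 32).
    x1 = F.interpolate(img1, size=(th, tw), mode='bilinear', align_corners=False)
    x2 = F.interpolate(img2, size=(th, tw), mode='bilinear', align_corners=False)
    flow = model(x1, x2)  # assumed to be (1, 2, th, tw)
    # Resize the flow back and rescale the displacements to match the original resolution.
    flow = F.interpolate(flow, size=(h, w), mode='bilinear', align_corners=False)
    flow[:, 0] *= w / float(tw)  # u component scales with the width ratio
    flow[:, 1] *= h / float(th)  # v component scales with the height ratio
    return flow
```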