lmb-freiburg / flownet2

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
https://lmb.informatik.uni-freiburg.de/Publications/2017/IMKDB17/

Why multiply flow groundtruth data by 32 when creating LMDB? #35

Closed xmfbit closed 7 years ago

xmfbit commented 7 years ago

Sorry, this is a question about the code itself, not a build error report. I couldn't find a forum where researchers discuss FlowNet, so I am posting it here.

In the file tools/convert_imageset_and_flow.cpp, I found that the optical flow ground truth is multiplied by 32 when creating the LMDB. See https://github.com/lmb-freiburg/flownet2/blob/master/tools/convert_imageset_and_flow.cpp#L176 . I also found that layers/custom_data_layer.cpp divides by 32 again. OK, so the scaling is only for storage in the LMDB, no problem.

I understand that you convert the flow data to the int16 data type in order to store it in the LMDB file. It is the number 32 that confuses me. Why did you choose 32 as the scale factor? I printed some ground truth optical flow values:

E0619 16:04:31.712563 18609 convert_imageset_and_flow.cpp:148] transformed value data: 28
E0619 16:04:31.712574 18609 convert_imageset_and_flow.cpp:147] previous flow data: 0.905912
E0619 16:04:31.712585 18609 convert_imageset_and_flow.cpp:148] transformed value data: 28
E0619 16:04:31.712596 18609 convert_imageset_and_flow.cpp:147] previous flow data: 0.906365
E0619 16:04:31.712608 18609 convert_imageset_and_flow.cpp:148] transformed value data: 29
E0619 16:04:31.712642 18609 convert_imageset_and_flow.cpp:147] previous flow data: 0.906817
E0619 16:04:31.712658 18609 convert_imageset_and_flow.cpp:148] transformed value data: 29
E0619 16:04:31.712671 18609 convert_imageset_and_flow.cpp:147] previous flow data: 0.90727
E0619 16:04:31.712682 18609 convert_imageset_and_flow.cpp:148] transformed value data: 29
E0619 16:04:31.712694 18609 convert_imageset_and_flow.cpp:147] previous flow data: 0.907722
E0619 16:04:31.712707 18609 convert_imageset_and_flow.cpp:148] transformed value data: 29
E0619 16:04:31.712725 18609 convert_imageset_and_flow.cpp:147] previous flow data: 0.908175
E0619 16:04:31.712738 18609 convert_imageset_and_flow.cpp:148] transformed value data: 29
E0619 16:04:31.712750 18609 convert_imageset_and_flow.cpp:147] previous flow data: 0.908627
E0619 16:04:31.712764 18609 convert_imageset_and_flow.cpp:148] transformed value data: 29
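
For readers following along, here is a minimal sketch of the scale-and-store round trip that the log above appears to show. It is not the repository's code: `encode_flow` and `decode_flow` are hypothetical names, and truncation toward zero is an assumption. That assumption is chosen because, if each "previous flow data" line pairs with the "transformed value data" line that follows it, 0.905912 * 32 = 28.989 is stored as 28 rather than 29.

```python
FACTOR = 32  # scale factor discussed in this thread

INT16_MIN, INT16_MAX = -2**15, 2**15 - 1


def encode_flow(value, factor=FACTOR):
    """Scale a float flow value and truncate it to an int16-range integer.

    Truncation toward zero is an assumption inferred from the log output,
    not something confirmed from the converter source.
    """
    scaled = int(value * factor)  # int() truncates toward zero
    if not INT16_MIN <= scaled <= INT16_MAX:
        raise OverflowError(f"{value} * {factor} does not fit in int16")
    return scaled


def decode_flow(stored, factor=FACTOR):
    """Undo the scaling, as the data layer is described as doing."""
    return stored / factor


# Values copied from the log above.
for value in (0.905912, 0.906365, 0.906817, 0.90727):
    stored = encode_flow(value)
    restored = decode_flow(stored)
    print(f"{value:.6f} -> {stored} -> {restored:.6f} (error {value - restored:.6f})")
```

Under this assumption the restored value is always within 1/32 = 0.03125 of the original, which is the loss the question below is about.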

Considering that the range of int16 is -2^15 to 2^15-1, why not multiply by a larger number to reduce the rounding error in the conversion? Is there a special reason for that?

nikolausmayer commented 7 years ago

A good question. I think there was no particular reason for the exact value "32". However, keep in mind that 2^15 is only 32768, so with a factor of 32 we are already limited to a flow maximum of 1024. You are right that larger factors would preserve more details, but any larger factor would be too much of a constraint (such large flows are rare, but not implausible).
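
The trade-off described above can be tabulated with a few lines of arithmetic. This is only an illustrative sketch: `flow_limits` is a hypothetical helper, the factors other than 32 and 1024 are arbitrary examples, and it assumes flow is measured in pixels and stored in a signed 16-bit integer.

```python
INT16_MAX = 2**15 - 1  # 32767, the largest value a signed 16-bit integer holds


def flow_limits(factor):
    """Return (largest representable flow magnitude, quantization step)."""
    max_flow = INT16_MAX / factor  # larger flows would overflow int16
    step = 1.0 / factor            # smallest difference that survives storage
    return max_flow, step


for factor in (32, 64, 256, 1024):
    max_flow, step = flow_limits(factor)
    print(f"factor={factor:5d}  max |flow| = {max_flow:9.3f} px  step = {step:.6f} px")
```

With a factor of 32 the limit is just under 1024 px at a step of 0.03125 px; with a factor of 1024 the step shrinks to about 0.001 px, but any flow of 32 px or more no longer fits.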

xmfbit commented 7 years ago

@nikolausmayer Thanks for your response. In fact, I observed that performance decreased slightly when I used 1024 as the multiplication factor instead of 32, with the same experiment configuration.