gooners1886 closed this issue 7 years ago
Hi,
The FlowNet2 evaluation script re-evaluates the network if the result contains "NaN" values, and each re-evaluation costs another forward pass. We currently do not know why the network sometimes produces NaNs, but we wanted to ensure that the scripts produce good results. You can edit the Python scripts to remove the checking loop (near the end).
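For illustration, the re-evaluation idea can be sketched roughly like this. It is not the code from the repository scripts: `forward` is a hypothetical stand-in for one network evaluation, and the retry cap `max_tries` is an assumption added here so the loop always terminates.

```python
import math


def run_until_finite(forward, max_tries=3):
    """Call forward() until its output contains no NaN, at most max_tries times.

    `forward` is a hypothetical zero-argument callable that returns a flat
    iterable of floats (one network evaluation). Returns (values, tries_used).
    If every try produced a NaN, the last (still NaN-containing) result is
    returned so the caller can decide what to do with it.
    """
    values = []
    for attempt in range(1, max_tries + 1):
        values = list(forward())
        # Accept the result as soon as no value is NaN.
        if not any(math.isnan(v) for v in values):
            return values, attempt
    return values, max_tries
```

With NumPy arrays the check would usually be `np.isnan(arr).any()` instead of the generator expression. Setting `max_tries=1` corresponds to removing the checking loop: one forward pass, NaNs or not.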
Are you running the script for a single image pair or the one for lists of pairs? The former is significantly slower.
Your GPU has the Kepler architecture. It's possible that many optimizations (especially in CUDA and cuDNN) are not available for this architecture and the whole thing runs slower.
(We did our time measurements on a GTX 1080, which is roughly 3x as fast as the K40 in terms of FLOPs, but since your images are ~60% smaller, this should not make a huge difference.)
All in all, I would not expect a slowdown by such a large factor if you are using the lists-of-images script, but the scripts we provide are also not optimized for speed.
@nikolausmayer Thank you for your advice. I am using run-flownet-many.py, which is the option for lists of pairs. I found that Caffe and the net are loaded inside this for loop: for ent in ops: (without any if condition).
Does that mean the Caffe model is loaded once for each image pair?
I am testing a change that moves the loading of Caffe and the net into this if statement:
if width != input_data[0].shape[3] or height != input_data[0].shape[2]:
With this change, Caffe and the net are loaded only once for the whole image list instead of once per pair, and the running time per pair drops to about 1.3 seconds (except for the first pair, which still has to load Caffe and the net). Is this a good way to speed things up, or will this change cause other errors in the future?
It should not be a problem as long as you take care that all input images are the same size. Removing the NaN-checker loop, well, that might produce NaNs sometimes. If whatever you use the flowfields for can deal with an occasional NaN, that's fine!
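The "load once, rebuild only when the size changes" idea discussed above can be sketched as below. This is only a sketch under assumptions: `NetCache` and `build_net` are hypothetical names, and `build_net` stands in for whatever the script does to create the Caffe net for a given input size. No real Caffe calls are shown.

```python
class NetCache:
    """Keep one network around and rebuild it only when the input size changes."""

    def __init__(self, build_net):
        # build_net: hypothetical factory, (width, height) -> network object
        self._build_net = build_net
        self._size = None
        self._net = None
        self.builds = 0  # how many times the network was actually created

    def get(self, width, height):
        # Rebuild on first use or when the requested size differs from the cached one.
        if self._net is None or self._size != (width, height):
            self._net = self._build_net(width, height)
            self._size = (width, height)
            self.builds += 1
        return self._net
```

Inside the loop over image pairs one would then call `cache.get(width, height)` instead of creating the net each time. If all images have the same size, the expensive setup happens once; mixed sizes still work but trigger a rebuild at every size change.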
It takes about 5 seconds for a 300x100 image on a GTX 1080 Ti, which is not fast.
@my1347 have you tried what https://github.com/lmb-freiburg/flownet2/issues/105 suggests? A 1080Ti can be much faster than 5 seconds, but not if you tear down and recreate the network for each image pair. Note that our scripts are not optimized for speed.
I run the code on a Tesla K40 GPU under CentOS. When I use the FlowNet2 model, it takes about 7-8 seconds per image pair, but it is said that it should take 100-200 ms per image pair. My test images are all 640x360. Why does it run that slowly?