lmb-freiburg / flownet2

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
https://lmb.informatik.uni-freiburg.de/Publications/2017/IMKDB17/

run-flownet-many.py cannot find caffemodel.h5 #120

Closed agilebean closed 6 years ago

agilebean commented 6 years ago

I have Caffe for flownet2 installed on Ubuntu 16.04. I call flownet2's run-flownet-many.py from this Torch implementation with:

bash stylizeVideo_flownet.sh zoo1.mp4 ./models/checkpoint-mosaic-video.t7

The run stops because the caffemodel is not found in the flownet2 folder:

Starting optical flow computation...
Traceback (most recent call last):
  File "/home/rstudio/flownet2/scripts/run-flownet-many.py", line 21, in <module>
    if(not os.path.exists(args.caffemodel)): raise BaseException('caffemodel does not exist: '+args.caffemodel)
BaseException: caffemodel does not exist: /home/rstudio/flownet2/FlowNet2/FlowNet2_weights.caffemodel.h5/

However - and this makes it really hard to understand - the file does exist at that path:

rstudio@demo3:~/fast-artistic-videos$ cd /home/rstudio/flownet2/FlowNet2
rstudio@demo3:~/flownet2/FlowNet2$ ls -al
total 638688
drwxr-xr-x  2 rstudio rstudio      4096 Apr 25  2017 .
drwxrwxr-x 33 rstudio rstudio      4096 Apr  5 11:23 ..
-rw-r--r--  1 rstudio rstudio     62798 Apr 25  2017 FlowNet2_deploy.prototxt.template
-rw-r--r--  1 rstudio rstudio     69448 Apr 25  2017 FlowNet2_train.prototxt.template
-rw-r--r--  1 rstudio rstudio 653868648 Apr 25  2017 FlowNet2_weights.caffemodel.h5

Why is that? I guess the question is how args.caffemodel gets set by the fast-artistic-videos script that calls it. Do I need to set certain environment variables so that flownet2 can find the model, even though the path is correct?

agilebean commented 6 years ago

I just found the answer: flownet2 didn't find the model because of the trailing slash "/", which made it look for a directory instead of a file. After correcting that, I hit the next problem: flownet2 complains that it cannot find the prototxt file:

Traceback (most recent call last):
  File "/home/rstudio/flownet2/scripts/run-flownet-many.py", line 22, in <module>
    if(not os.path.exists(args.deployproto)): raise BaseException('deploy-proto does not exist: '+args.deployproto)
BaseException: deploy-proto does not exist: /home/rstudio/flownet2/Flownet2/FlowNet2_deploy.prototxt
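
The trailing-slash behaviour described above can be reproduced in isolation. The following is a minimal sketch, assuming a Linux filesystem like the one in this thread; the helper names exists_report and strip_trailing_sep are hypothetical and not part of flownet2, and the temporary file is only a stand-in for the weights file.

```python
import os
import tempfile


def exists_report(path):
    """Return (exists, isfile, isdir) for the path exactly as given."""
    return os.path.exists(path), os.path.isfile(path), os.path.isdir(path)


def strip_trailing_sep(path):
    """Drop trailing separators so a file path is not treated as a directory."""
    stripped = path.rstrip(os.sep)
    return stripped or os.sep  # keep the root directory itself intact


if __name__ == "__main__":
    # Hypothetical stand-in for FlowNet2_weights.caffemodel.h5.
    with tempfile.NamedTemporaryFile(suffix=".caffemodel.h5") as tmp:
        print(exists_report(tmp.name))                    # plain file path
        print(exists_report(tmp.name + os.sep))           # same path with a trailing slash
        print(exists_report(strip_trailing_sep(tmp.name + os.sep)))
```

Normalising the argument this way before the existence check is one possible workaround in a calling script; whether it belongs there or in the caller's configuration is a design choice.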

agilebean commented 6 years ago

I fixed the calling script so that it now looks for the correct file, "/FlowNet2_deploy.prototxt.template". Nevertheless, the issue remains, just with a different file name:

File "/home/rstudio/flownet2/scripts/run-flownet-many.py", line 22, in <module>
    if(not os.path.exists(args.deployproto)): raise BaseException('deploy-proto does not exist: '+args.deployproto)
BaseException: deploy-proto does not exist: /home/rstudio/flownet2/Flownet2/FlowNet2_deploy.prototxt.template

Again, the file does exist:

rstudio@demo3:~/flownet2/FlowNet2$ ls -al
total 638820
drwxr-xr-x  2 rstudio rstudio      4096 Apr  5 14:05 .
drwxrwxr-x 33 rstudio rstudio      4096 Apr  5 14:05 ..
-rw-r--r--  1 rstudio rstudio     62798 Apr 25  2017 FlowNet2_deploy.prototxt.template
-rw-rw-r--  1 rstudio rstudio     69448 Apr  5 14:05 FlowNet2_train.prototxt
-rw-r--r--  1 rstudio rstudio     69448 Apr 25  2017 FlowNet2_train.prototxt.template
-rw-r--r--  1 rstudio rstudio 653868648 Apr 25  2017 FlowNet2_weights.caffemodel.h5

nikolausmayer commented 6 years ago

Your issue sounds a lot like https://github.com/lmb-freiburg/flownet2/issues/116, but I don't know what causes this problem. Perhaps os.path.exists does not play well with some file system setups, but I've never encountered that problem and I use networked storage and symlinks all the time.

What happens if you just disable these checks? They aren't technically necessary...

agilebean commented 6 years ago

Thanks @nikolausmayer, I disabled the check for the file. Thanks also for the hint about networked storage. Maybe I should mention that I work on a Google Cloud VM. It has shown normal Ubuntu behavior so far, but maybe os.path.exists is the root cause of this problem. Can you check this?

However, the script now fails at the next place where args.deployproto is used:

Starting optical flow computation...
Processing tuple: ['./zoo1/frame_00001.ppm', './zoo1/frame_00002.ppm', './zoo1/flow_default/forward_1_2.flo']
Traceback (most recent call last):
  File "/home/rstudio/flownet2/scripts/run-flownet-many.py", line 67, in <module>
    proto = open(args.deployproto).readlines()
IOError: [Errno 2] No such file or directory: '/home/rstudio/flownet2/Flownet2/FlowNet2_deploy.prototxt.template'

The script at this location does:

        proto = open(args.deployproto).readlines()
        for line in proto:
            for key, value in vars.items():
                tag = "$%s$" % key
                line = line.replace(tag, str(value))

            tmp.write(line)

So it really seems to need the input from the template - can you please investigate how args.deployproto is set?
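
For readers who want to see what the quoted loop does, here is a self-contained sketch of the same `$KEY$` substitution. The function name render_template and the WIDTH and HEIGHT tags are hypothetical, chosen only for illustration; the real template and the values the script fills in are not shown in this thread.

```python
import os
import tempfile


def render_template(template_path, values, out_path):
    """Copy template_path to out_path, replacing each $KEY$ tag with values[KEY]."""
    with open(template_path) as src, open(out_path, "w") as dst:
        for line in src:
            for key, value in values.items():
                line = line.replace("$%s$" % key, str(value))
            dst.write(line)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    template = os.path.join(workdir, "demo.prototxt.template")
    rendered = os.path.join(workdir, "demo.prototxt")
    with open(template, "w") as handle:
        handle.write("width: $WIDTH$\nheight: $HEIGHT$\n")  # hypothetical tags
    render_template(template, {"WIDTH": 640, "HEIGHT": 480}, rendered)
    with open(rendered) as handle:
        print(handle.read())
```

This is why the script needs the .template file to be readable: the deploy prototxt is generated from it for each input.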

nikolausmayer commented 6 years ago

These parameters are just read directly from the command line. I have no experience with cloud storage and no access to Google's cloud, so I cannot test this setup. It looks like there is a Python module for Google Cloud Storage, could you try using that?

agilebean commented 6 years ago

Using a different storage backend would be overkill, as I have a standard Ubuntu 16.04 that comes with 30 GB of file storage. Everything else works fine. Even the caffemodel .h5 files in the same folders are read, and that's the point! So my only suspicion is that this is linked to the calling script of fast-artistic-videos, which I reference here. As this issue spans two repositories, I really hope you could talk to each other. Aren't you from the same lab?

nikolausmayer commented 6 years ago

We don't use any exotic methods, it's all standard Python, but it can fail if you have an exotic setup.

os.path.exists calls os.stat which may be restricted even if a file could actually be read.

What do these commands print?

@manuelruder what do you think?
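
As a side note on the os.stat remark above: os.path.exists returns False for any OSError raised by os.stat, so on its own it cannot tell a missing file from a permission problem. The following is a minimal sketch of a check that reports the reason instead; the helper name explain_stat is hypothetical and not part of flownet2.

```python
import errno
import os
import tempfile


def explain_stat(path):
    """Return a short string saying why os.stat succeeds or fails for path."""
    try:
        os.stat(path)
    except OSError as exc:
        if exc.errno == errno.ENOENT:
            return "missing: some component of the path does not exist"
        if exc.errno == errno.ENOTDIR:
            return "not a directory: a component used as a directory is a file"
        if exc.errno == errno.EACCES:
            return "permission denied while resolving the path"
        return "stat failed: %s" % os.strerror(exc.errno)
    return "ok"


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        for candidate in (tmp.name, tmp.name + os.sep, tmp.name + ".missing"):
            print("%s -> %s" % (candidate, explain_stat(candidate)))
```

Running explain_stat on the two paths from the tracebacks would show which of these cases applies on the affected machine.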

agilebean commented 6 years ago

Here's the output - I guess you won't like it:

rstudio@demo3:~$ python -c "import os; print os.path.exists('/home/rstudio/flownet2/Flownet2/FlowNet2_deploy.prototxt.template')"
False
rstudio@demo3:~$ python -c "import os; print os.path.exists('/home/rstudio/flownet2/Flownet2/FlowNet2_weights.caffemodel.h5')"
False
rstudio@demo3:~$ python -c "import os; print os.stat('/home/rstudio/flownet2/Flownet2/FlowNet2_deploy.prototxt.template')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
OSError: [Errno 2] No such file or directory: '/home/rstudio/flownet2/Flownet2/FlowNet2_deploy.prototxt.template'
rstudio@demo3:~$ python -c "import os; print os.stat('/home/rstudio/flownet2/Flownet2/FlowNet2_weights.caffemodel.h5')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
OSError: [Errno 2] No such file or directory: '/home/rstudio/flownet2/Flownet2/FlowNet2_weights.caffemodel.h5'
rstudio@demo3:~$ stat /home/rstudio/flownet2/Flownet2/FlowNet2_deploy.prototxt.template
stat: cannot stat '/home/rstudio/flownet2/Flownet2/FlowNet2_deploy.prototxt.template': No such file or directory
rstudio@demo3:~$ stat /home/rstudio/flownet2/Flownet2/FlowNet2_weights.caffemodel.h5
stat: cannot stat '/home/rstudio/flownet2/Flownet2/FlowNet2_weights.caffemodel.h5': No such file or directory
rstudio@demo3:~$ cd /home/rstudio/flownet2/Flownet2/
/home/rstudio/flownet2/FlowNet2/
rstudio@demo3:~/flownet2/FlowNet2$ ls -al
total 638820
drwxr-xr-x  2 rstudio rstudio      4096 Apr 25  2017 .
drwxrwxr-x 33 rstudio rstudio      4096 Apr  6 07:53 ..
-rw-rw-r--  1 rstudio rstudio     62798 Apr  5 13:58 FlowNet2_deploy.prototxt
-rw-r--r--  1 rstudio rstudio     62798 Apr 25  2017 FlowNet2_deploy.prototxt.template
-rw-rw-r--  1 rstudio rstudio     69448 Apr  5 14:05 FlowNet2_train.prototxt
-rw-r--r--  1 rstudio rstudio     69448 Apr 25  2017 FlowNet2_train.prototxt.template
-rw-r--r--  1 rstudio rstudio 653868648 Apr 25  2017 FlowNet2_weights.caffemodel.h5

nikolausmayer commented 6 years ago

I don't like that you are experiencing such problems—but I think this proves that the problem is not a result of our code (and I'm selfishly relieved to know that).

stat: cannot stat '/home/rstudio/flownet2/Flownet2/FlowNet2_deploy.prototxt.template': No such file or directory

To me, this is proof that the issue is indeed with whatever filesystem setup your VM uses. The file cannot be stat-ed, so Python's os.stat, and consequently os.path.exists, fail as well.

Unfortunately I have no clue what could cause this. The files belong to your user and have good permissions. My wild guess is that

Apart from hoping that Google's documentation provides more information, I can't really do anything. Please understand that I do not have time to reproduce your setup.
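
One check that can be run from Python in a situation like this is whether every component of the failing path exists with exactly that spelling. The paths in the error messages above spell the directory Flownet2, while the shell prompts in the ls listings show FlowNet2; whether that difference is the actual cause here is not confirmed anywhere in this thread. The following is a minimal sketch, with a hypothetical helper find_case_mismatch that is not part of flownet2.

```python
import os
import tempfile


def find_case_mismatch(base, relpath):
    """Walk relpath below base, one component at a time.

    Returns None if every component exists as spelled. Otherwise returns
    (parent_dir, wanted_name, candidates), where candidates are entries of
    parent_dir that differ from wanted_name only by letter case. The list
    is empty if nothing similar exists or parent_dir cannot be listed.
    """
    current = base
    for name in [part for part in relpath.split(os.sep) if part]:
        try:
            entries = os.listdir(current)
        except OSError:
            return current, name, []
        if name not in entries:
            candidates = [e for e in entries if e.lower() == name.lower()]
            return current, name, candidates
        current = os.path.join(current, name)
    return None


if __name__ == "__main__":
    # Throwaway directory tree that mimics the spelling difference.
    base = tempfile.mkdtemp()
    os.mkdir(os.path.join(base, "FlowNet2"))
    open(os.path.join(base, "FlowNet2", "deploy.template"), "w").close()
    print(find_case_mismatch(base, os.path.join("Flownet2", "deploy.template")))
    print(find_case_mismatch(base, os.path.join("FlowNet2", "deploy.template")))
```

On the affected machine, the same helper could be called with "/" as base and the path from the traceback, without its leading slash, as relpath.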

agilebean commented 6 years ago

Fair enough. I will try again by reinstalling flownet2. That seems to be the only way, apart from setting up the VM again. Thanks for your test commands; they helped to clarify that the problem is not related to your code.

nikolausmayer commented 6 years ago

Hopefully that will work. If you succeed, please feel free to reply with your experiences (if nothing else, it'll serve as documentation for others)!

nikolausmayer commented 6 years ago

(closed due to inactivity)