torch / torch7

http://torch.ch

Models that Used to Work now Reports nil Value for Evaluate Error #1141

Open 3DTOPO opened 6 years ago

3DTOPO commented 6 years ago

I am using fast-neural-style with torch7. Models that used to work no longer work for me (I had to reinstall Torch7 on a new system). If I run evaluate() on models that have not been changed and used to work, I now get the error: attempt to call method 'evaluate' (a nil value).

Since this model used to work and has not been changed, it seems to me that the issue must have been introduced in a newer version of torch7, or of one of its dependencies, than the one I was running before.

To reproduce this issue:

(1) Install nn if it is not installed.

(2) Download fast-neural-style:
    git clone https://github.com/jcjohnson/fast-neural-style.git ~/fast-neural-style

(3) Download the test model (requires curl):
    cd fast-neural-style; curl http://3dtopo.com/candy.t7.tgz > models/instance_norm/candy.t7.tgz; tar zxvf models/instance_norm/candy.t7.tgz

(4) cd fast-neural-style and start torch with th.

(5) Load a test model and run evaluate():
    th> require 'nn'
    th> require 'fast_neural_style.ShaveImage'
    th> require 'fast_neural_style.TotalVariation'
    th> require 'fast_neural_style.InstanceNormalization'
    th> model=torch.load("models/instance_norm/candy.t7")
    th> model:evaluate()
    [string "_RESULT={model:evaluate()}"]:1: attempt to call method 'evaluate' (a nil value)
    stack traceback:
    [string "_RESULT={model:evaluate()}"]:1: in main chunk
    [C]: in function 'xpcall'
    /home/jeshua/torch/install/share/lua/5.1/trepl/init.lua:661: in function 'repl'
    ...shua/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
    [C]: at 0x00405d50

(6) Note that evaluate works with a model defined in a session:
    th> net = nn.Sequential()
    th> net:evaluate()
    No Errors

3DTOPO commented 6 years ago

I tried using an AWS image configured for Torch, and everything worked as expected. So I archived the torch directory, copied it to my machine, and ran the install script; it now works as expected on my machine.

This leads me to believe that the current repository of Torch is not compatible with fast-neural-style. It's unfortunate that there is no versioning for Torch, so that one could install a version known to be compatible.

It still gets the same error as my post above however, so that was a red herring for me.

tastyminerals commented 6 years ago

Module:evaluate function is still there: https://github.com/torch/nn/blob/master/Module.lua

You should retrain fast-neural-style model with newer Torch version.

3DTOPO commented 6 years ago

> Module:evaluate function is still there:

Like I said, that was a red herring; apparently I did not know what I was doing with that test. My apologies.

> You should retrain fast-neural-style model with newer Torch version.

I haven't tried that yet, so I don't even know if training will now work. However, I have hundreds of models that took thousands of hours of compute time. I would much prefer getting the current version of torch backward compatible with the older models.

Any idea what might have changed?

tastyminerals commented 6 years ago

fast-neural-style classes should work with current nn

ckp = torch.load("models/instance_norm/candy.t7")
ckp.model:evaluate()

After you load a model, always print it, to see its structure. You are calling a table which of course does not have evaluate() method but the table contains your model which has.
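The check described above can be sketched in plain Lua. This is only an illustration under stated assumptions: unwrap_model is a hypothetical helper (it is not part of torch or fast-neural-style), FakeModule and fake_checkpoint are stand-ins so the snippet runs without Torch installed, and the checkpoint is assumed to store the network under a model key, as in the two lines of code above.

```lua
-- Hypothetical helper, not part of torch or fast-neural-style:
-- returns the object that actually has an evaluate() method.
local function unwrap_model(loaded)
  if type(loaded) ~= 'table' then
    error('expected a table or module, got ' .. type(loaded))
  end
  -- The loaded value is already a module (methods resolve via its metatable).
  if type(loaded.evaluate) == 'function' then
    return loaded
  end
  -- The loaded value is a checkpoint table that wraps the module (assumed key: model).
  if type(loaded.model) == 'table' and type(loaded.model.evaluate) == 'function' then
    return loaded.model
  end
  error('no module with an evaluate() method found in the loaded value')
end

-- Stand-in objects so the sketch runs without Torch; a real nn module
-- would take the place of FakeModule.
local FakeModule = {}
FakeModule.__index = FakeModule
function FakeModule:evaluate()
  self.train = false
  return self
end

local fake_checkpoint = {
  model = setmetatable({ train = true }, FakeModule),
  opt = {},
}

-- Calling fake_checkpoint:evaluate() would fail with "a nil value",
-- because the plain table has no such method.
local model = unwrap_model(fake_checkpoint)
model:evaluate()
print(model.train)
```

With Torch installed, the same helper would presumably be applied to the result of torch.load("models/instance_norm/candy.t7") in place of fake_checkpoint.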

3DTOPO commented 6 years ago

> fast-neural-style classes should work with current nn

Correct. That does work.

> After you load a model, always print it, to see its structure. You are calling a table which of course does not have evaluate() method but the table contains your model which has.

Thanks for the tip. I was trying to troubleshoot fast-neural-style, and I had found that commenting out the evaluate() line in the fast-neural-style code gave the same unexpected results, so I figured it must not be doing anything. Then, in my code above to reproduce the issue, I neglected to use the model property of the checkpoint. However, fast-neural-style does correctly use the model property of the checkpoint:

local model = checkpoint.model
model:evaluate()

So, I don't have any idea why older versions of Torch work but the current version does not.

tastyminerals commented 6 years ago

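The evaluate() method switches the model to evaluation mode: it sets the train flag to false on the module and its children, which changes the behavior of layers such as Dropout and BatchNormalization. It does not by itself disable backpropagation. It is useful when you only do model testing or validation during training.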