Open f0k opened 8 years ago
The example also seems to run with `optimizer='None'` or `optimizer='fast_compile'`, so that is probably an issue in an optimization.
I tried with DebugMode, but unfortunately it crashes before it can tell me which optimization is to blame.
This may have been fixed by #5775. @Thrandis will have a look.
@lamblin @f0k The problem is still here, even with the latest version of Theano. I'm going to investigate this in more detail!
As @lamblin said before, it works with `optimizer=None` and `optimizer=fast_compile`. The test values are also fine.
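To compare optimizer settings without editing the reproduction script, it can be rerun once per setting through `THEANO_FLAGS`. The following is only a minimal sketch: the helper names `run_with_theano_flags` and `compare_optimizers` are made up for illustration, the script path is whatever file holds the failing example, and `fast_run` is assumed to be the default optimizer that triggers the error.

```python
import os
import subprocess
import sys


def run_with_theano_flags(args, flags):
    """Run `python <args>` in a child process with THEANO_FLAGS set to `flags`.

    Any THEANO_FLAGS value inherited from the parent environment is replaced.
    Returns the exit code of the child process.
    """
    env = dict(os.environ, THEANO_FLAGS=flags)
    return subprocess.run([sys.executable] + list(args), env=env).returncode


def compare_optimizers(script, optimizers=("fast_run", "fast_compile", "None")):
    """Return {optimizer: exit code} after running `script` once per optimizer.

    Hypothetical helper: a non-zero exit code only says the script failed,
    not why, so the output of the failing run still has to be read.
    """
    return {
        opt: run_with_theano_flags([script], "optimizer=" + opt)
        for opt in optimizers
    }
```

Calling `compare_optimizers("repro.py")` (with `repro.py` standing in for the failing example) should then show a non-zero exit code only for the settings under which the error appears.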
The faulty optimization is `PushOutScanOutput`. I'll work on a fix. In the meantime, you can use the following flag to disable it: `THEANO_FLAGS=optimizer_excluding=scanOp_pushout_output`
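If setting the variable on the command line is inconvenient, the same flag can probably be set from Python, as long as that happens before `import theano`, since Theano reads `THEANO_FLAGS` at import time. This is a sketch under that assumption; `add_theano_flag` is a hypothetical helper, not part of Theano.

```python
import os


def add_theano_flag(flag, env=os.environ):
    """Append `flag` to the comma-separated THEANO_FLAGS value in `env`.

    Existing flags are kept and the flag is not added twice.
    Returns the resulting THEANO_FLAGS string.
    """
    current = [f for f in env.get("THEANO_FLAGS", "").split(",") if f]
    if flag not in current:
        current.append(flag)
    env["THEANO_FLAGS"] = ",".join(current)
    return env["THEANO_FLAGS"]


# Must run before the first `import theano` in the process.
add_theano_flag("optimizer_excluding=scanOp_pushout_output")
```

Appending to the existing value instead of overwriting it keeps other settings such as `device` or `floatX` intact.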
@slefrancois as said previously, the faulty optimization is `PushOutScanOutput`. I wrote a fix, but this fix breaks `scanOp_save_mem`.
It seems that `PushOutScanOutput` is also the cause of other issues:
https://github.com/Theano/Theano/issues/5994
https://github.com/Theano/Theano/issues/5249
So I think this op needs some upgrade in general!
https://github.com/Theano/Theano/issues/5994 isn't the same problem.
https://github.com/Theano/Theano/issues/5249 may be the same problem, but I don't know for sure whether it uses truncated gradient. To find out, can you check whether the code in this issue has the same inner-function inputs at each iteration?
@Thrandis can you test whether the code in this issue works without the truncated gradient? If so, would it make sense to disable this optimization for the grad op when truncated gradient is used? Or can we repair the optimization for that case?
> can you test whether the code in this issue works without the truncated gradient?
As indicated in the original post, the error disappears when setting `truncate_gradient=-1`.
A Lasagne user found a bug related to the gradient of theano.scan: https://groups.google.com/d/msg/lasagne-users/Dzkp4szn0Vk/3S-pYDIsBAAJ
I've simplified his example to use pure Theano:
When running this (no matter if on CPU or GPU), I get:
The error disappears with any one of the following changes (indicated in the code above):
- setting `truncate_gradient=-1`
- moving the `w_inhid` computation into the scan loop instead of performing it outside
- using `w_inhid` instead of `w_hidhid` (well, this is kind of obvious, since it's outside scan)