farizrahman4u / recurrentshop

Framework for building complex recurrent neural networks with Keras
MIT License

Bug in handling of initial readout input #86

Open djpetti opened 6 years ago

djpetti commented 6 years ago

Following the instructions for readout in the docs folder, I was able to successfully create a network that employed readout. So far, so good.
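For reference, my readout cell looks roughly like the example in the docs (the layer sizes here are made up, and my real cell is a bit more involved):

from keras.layers import Input, Dense, Activation, add
from recurrentshop import RecurrentModel

# Cell inputs: the current input, the previous hidden state, and the readout
# (the cell's own output from the previous time step).
x_t = Input((5,))
h_tm1 = Input((10,))
readout = Input((10,))

# Feed the readout back into the recurrence alongside the input and state.
h_t = add([Dense(10)(x_t), Dense(10, use_bias=False)(h_tm1), Dense(10)(readout)])
h_t = Activation('tanh')(h_t)

rnn = RecurrentModel(input=x_t, initial_states=[h_tm1], output=h_t,
                     final_states=[h_t], readout_input=readout,
                     return_sequences=True)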

The problem arose when I tried to specify an initial value for the readout. My setup involved creating a separate, dedicated Keras Input layer and passing it as the initial_readout parameter of my RecurrentModel. However, whenever I tried to use the resulting model, I consistently got the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "quadrotor/controllers/simple_rnn.py", line 19, in __init__
    self.__make_graph()
  File "quadrotor/controllers/simple_rnn.py", line 68, in __make_graph
    self.__model = Model(inputs=self.__goal_input, outputs=seq_output)
  File "build/bdist.linux-x86_64/egg/keras/legacy/interfaces.py", line 87, in wrapper
  File "build/bdist.linux-x86_64/egg/keras/engine/topology.py", line 1717, in __init__
  File "build/bdist.linux-x86_64/egg/keras/engine/topology.py", line 1707, in build_map_of_graph
  File "build/bdist.linux-x86_64/egg/keras/engine/topology.py", line 1678, in build_map_of_graph
AttributeError: 'Tensor' object has no attribute '_keras_history'
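For context, the part of my code that triggers this boils down to roughly the following (paraphrased; the real shapes and wiring differ a bit, and the variable names are just for illustration):

from keras.layers import Input
from keras.models import Model

# The sequence fed to the RNN, plus a dedicated Input meant to seed the readout.
goal_input = Input((None, 5))
initial_readout = Input((10,))

# Apply the readout RNN, passing the extra Input as the initial readout value.
seq_output = rnn(goal_input, initial_readout=initial_readout)

# This is the line that blows up with the _keras_history error above.
model = Model(inputs=goal_input, outputs=seq_output)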

Digging a little deeper, I zeroed in on this particular bit of code in recurrentshop's engine.py:

        shapes = []
        for x in model_input:
            if hasattr(x, '_keras_shape'):
                shapes.append(x._keras_shape)
                del x._keras_shape  # Else keras internals will get messed up.
        model_output = _to_list(self.model.call(model_input))
        for x, s in zip(model_input, shapes):
            setattr(x, '_keras_shape', s)

I can see that this is supposed to save the _keras_shape attribute of the input tensors, remove it before calling the inner model, and then restore it afterwards. However, in my case one input was a Keras tensor and the others weren't. That's where this code breaks down: the shapes list only holds entries for the inputs that had a _keras_shape attribute, but the final zip pairs those shapes positionally with all of model_input, so the saved shapes can get restored to the wrong tensors, or not restored at all. That is exactly what was happening for me: my Input layer never got its _keras_shape attribute back, and Keras wasn't pleased about that.
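The mismatch goes away if the saved shapes are keyed to the tensors they were actually stripped from, rather than being zipped back positionally. Something along these lines (just a sketch of the idea, not necessarily the exact patch):

        shapes = []
        for x in model_input:
            if hasattr(x, '_keras_shape'):
                # Remember which tensor this shape came from, not just the shape.
                shapes.append((x, x._keras_shape))
                del x._keras_shape  # Else keras internals will get messed up.
        model_output = _to_list(self.model.call(model_input))
        for x, s in shapes:
            setattr(x, '_keras_shape', s)

That way an input that never had a _keras_shape can't end up with a shape that belongs to a different tensor.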

I've fixed this issue locally, and it works like a charm. I'm happy to submit a PR, but I wanted to open this issue first to make sure I'm not simply using the library the wrong way. After all, the documentation for this feature is somewhat... sparse.