lmjohns3 / theanets

Neural network toolkit for Python
http://theanets.rtfd.org
MIT License

Error in build_graph (The updates parameter must be an OrderedDict/dict) #95

Closed qyouurcs closed 8 years ago

qyouurcs commented 8 years ago

Hello all,

I created a new layer on top of the LSTM layer. The "updates" value returned by the scan function in the layer's transform method has the following type:

(Pdb) updates
OrderedUpdates([(<CudaNdarrayType(float32, vector)>, Subtensor{int64}.0), (<CudaNdarrayType(float32, vector)>, Subtensor{int64}.0)])

Then, in the build_graph function in graph.py, the following code collects the "updates" from all the layers:

for layer in self.layers:
    out, upd = layer.connect(outputs, noise, dropout)
    outputs.update(out)
    updates.extend(upd)

"updates" is initially declared as a list, whereas upd here is an OrderedUpdates object. Because extending a list with a dict iterates over the dict's keys, updates.extend(upd) keeps only the keys of upd and drops the update expressions. This leads to an error when the function is compiled in downhill's base.py. Here are the relevant lines from the exception stack:
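The key-only behaviour described above can be reproduced without Theano. This is a minimal sketch that assumes OrderedUpdates behaves like a plain OrderedDict when iterated; the string keys and values are made-up placeholders for the shared variables and symbolic expressions.

```python
from collections import OrderedDict

# Placeholder for the OrderedUpdates returned by scan (assumption: it
# iterates like an ordinary OrderedDict). Strings stand in for the
# shared variables (keys) and their new symbolic values.
upd = OrderedDict([("h_state", "h_next"), ("c_state", "c_next")])

updates = []
updates.extend(upd)  # iterating a dict yields its keys only

# The new values are gone, so no entry is a (variable, expression) pair.
has_pairs = all(isinstance(item, tuple) and len(item) == 2 for item in updates)
print(updates)
print(has_pairs)
```

The resulting list holds bare keys rather than 2-element pairs, which matches the shape the ValueError complains about.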

  File "...downhill/downhill/base.py", line 118, in _compile
    name='evaluation')
  File "...Theano/theano/compile/function.py", line 300, in function
    output_keys=output_keys)
  File "...Theano/theano/compile/pfunc.py", line 443, in pfunc
    "The updates parameter must be an OrderedDict/dict or a list of "
ValueError: The updates parameter must be an OrderedDict/dict or a list of lists/tuples with 2 elements

I also checked the original LSTM layer, and it does not trigger this error because its updates are empty, so updates.extend(upd) adds nothing to updates.

This is fairly easy to fix: we only need to change updates.extend(upd) to updates.extend(upd.items()).
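A rough illustration of that one-line change, again using a plain OrderedDict with placeholder strings instead of real Theano objects (both are assumptions made so the snippet runs on its own):

```python
from collections import OrderedDict

# Placeholder for a non-empty OrderedUpdates (assumption for illustration).
upd = OrderedDict([("h_state", "h_next"), ("c_state", "c_next")])
# Placeholder for a layer with no updates, like the stock LSTM case.
empty_upd = OrderedDict()

updates = []
updates.extend(upd.items())        # appends (key, value) tuples
updates.extend(empty_upd.items())  # nothing to append, list is unchanged

# Every entry is now a 2-element pair, in the original order.
all_pairs = all(isinstance(item, tuple) and len(item) == 2 for item in updates)
print(updates)
```

With .items(), each entry keeps its new value, and empty updates remain harmless.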

Has anyone else run into a similar issue?

lmjohns3 commented 8 years ago

In your layer subclass, can you just return updates.items()?

qyouurcs commented 8 years ago

Yes, that's exactly what I am doing to prevent the error. By the way, I am using theanets 0.6.

lmjohns3 commented 8 years ago

Ok, it seems like that's the correct solution. The transform method is supposed to return a dictionary of layer outputs, and a list (not a dict) of updates to apply.
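For anyone hitting the same thing, one way a layer subclass might enforce that contract is to coerce whatever scan returned into a list of pairs before returning it. The helper name as_update_list below is hypothetical and not part of theanets; it is only a sketch of the idea.

```python
from collections import OrderedDict


def as_update_list(updates):
    """Hypothetical helper: coerce updates into a list of 2-tuples.

    Accepts None, a dict-like object (anything with .items()), or an
    iterable of 2-element pairs.
    """
    if updates is None:
        return []
    if hasattr(updates, "items"):
        return list(updates.items())
    return [tuple(pair) for pair in updates]


# Placeholder strings stand in for shared variables and expressions.
from_dict = as_update_list(OrderedDict([("h_state", "h_next")]))
from_list = as_update_list([["c_state", "c_next"]])
from_none = as_update_list(None)
print(from_dict, from_list, from_none)
```

A transform method could then end with something like `return outputs, as_update_list(updates)`, so the caller's list.extend always receives pairs.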