farizrahman4u / seq2seq

Sequence to Sequence Learning with Keras
GNU General Public License v2.0

Initializing Keras embeddings with word2vec #90

Open nabihach opened 8 years ago

nabihach commented 8 years ago

Hi @farizrahman4u

In the debug_seq2seq example by @nicolas-ivanov, I want to add a Keras Embedding layer before the Seq2seq layer and initialize it with word2vec vectors. But I get a weird error related to out-of-range indices.

My model is:

    seq2seq = Seq2seq(
        batch_input_shape=(SAMPLES_BATCH_SIZE, INPUT_SEQUENCE_LENGTH, TOKEN_REPRESENTATION_SIZE),
        hidden_dim=HIDDEN_LAYER_DIMENSION,
        output_length=ANSWER_MAX_TOKEN_LENGTH,
        output_dim=token_dict_size,
        depth=2,
        dropout=0.25,
        peek=True
    )

    embedding_weights = numpy.zeros((len(token_to_index), TOKEN_REPRESENTATION_SIZE))
    for word, index in token_to_index.items():
        embedding_weights[index, :] = numpy.array(w2v_model[word])

    model.add(Embedding(
        len(token_to_index),
        TOKEN_REPRESENTATION_SIZE,
        input_length=INPUT_SEQUENCE_LENGTH,
        batch_input_shape=(SAMPLES_BATCH_SIZE, INPUT_SEQUENCE_LENGTH),
        weights=[embedding_weights],
        mask_zero=False,
        init='orthogonal'
    ))

    model.add(seq2seq)

But when I run this, I get the following error:

Traceback (most recent call last):
  File "bin/train.py", line 50, in <module>
    learn()
  File "bin/train.py", line 41, in learn
    nn_model = get_nn_model(len(index_to_token), index_to_token, w2v_model)
  File "/home/nasghar/Desktop/debug_seq2seq/trunk/lib/nn_model/model.py", line 66, in get_nn_model
    model.add(seq2seq)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/models.py", line 307, in add
    output_tensor = layer(self.outputs[0])
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/engine/topology.py", line 511, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/engine/topology.py", line 569, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/engine/topology.py", line 150, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/models.py", line 345, in call
    return self.model.call(x, mask)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/engine/topology.py", line 2017, in call
    output_tensors, output_masks, output_shapes = self.run_internal_graph(inputs, masks)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/engine/topology.py", line 2159, in run_internal_graph
    output_tensors = to_list(layer.call(computed_tensor, computed_mask))
  File "/var/lib/try_seq2seq/seq2seq/seq2seq/layers/state_transfer_rnn.py", line 76, in call
    input_length=input_shape[1])
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/backend/theano_backend.py", line 827, in rnn
    go_backwards=go_backwards)
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan.py", line 745, in scan
    condition, outputs, updates = scan_utils.get_updates_and_outputs(fn(*args))
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/backend/theano_backend.py", line 819, in _step
    output, new_states = step_function(input, states)
  File "/usr/local/lib/python2.7/dist-packages/Keras-1.0.7-py2.7.egg/keras/layers/recurrent.py", line 790, in step
    B_U = states[2]
IndexError: tuple index out of range

What exactly is wrong with my embedding layer? Please help.
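
A side note on the weight-matrix construction itself, separate from the IndexError: w2v_model[word] raises a KeyError for any token missing from the word2vec vocabulary. A guarded build could look like this (an untested sketch reusing the names from the snippet above; the random fallback is a placeholder choice, and the membership test assumes a gensim-style model):

    embedding_weights = numpy.zeros((len(token_to_index), TOKEN_REPRESENTATION_SIZE))
    for word, index in token_to_index.items():
        if word in w2v_model:  # gensim models support membership tests
            embedding_weights[index, :] = numpy.array(w2v_model[word])
        else:
            # Placeholder: small random vector for out-of-vocabulary tokens
            embedding_weights[index, :] = 0.01 * numpy.random.randn(TOKEN_REPRESENTATION_SIZE)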

alex-j-j commented 8 years ago

I have the same problem. The initial states of the LSTMDecoder are not set properly, but I can't find out where this happens, and therefore I don't know why. Did you resolve the issue?
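
For context on where that IndexError comes from: in Keras 1.0.7 the LSTM step function indexes past the hidden and cell states to reach the dropout constants that get_constants() normally appends to the state list. A paraphrased, self-contained illustration (not the verbatim Keras source):

    # Paraphrase of the state layout LSTM.step expects in Keras 1.0.7
    def step(x, states):
        h_tm1 = states[0]  # previous hidden state
        c_tm1 = states[1]  # previous cell state
        B_U = states[2]    # dropout masks, normally appended by get_constants()
        return h_tm1, [h_tm1, c_tm1]

    # If a layer hands the step function only (h, c) without the constants:
    step(None, (0.0, 0.0))  # IndexError: tuple index out of range

That matches the traceback above, so a plausible reading is that the state-transfer path passes the step function a state tuple without the appended constants.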

nabihach commented 8 years ago

No, I wasn't able to resolve the issue. I remember trying a bunch of different things, but nothing worked. I also thought for a while that the initial states weren't set properly, but I couldn't find a fix.

SeanTater commented 8 years ago

I removed parts until the error no longer presented itself; this is the minimal case I reached. The following fails:

from keras.models import Sequential
# Import paths follow the seq2seq layout from the traceback; they may differ by version
from seq2seq.layers.encoders import LSTMEncoder
from seq2seq.layers.decoders import LSTMDecoder2

class S2s(Sequential):
    def __init__(self, output_dim, hidden_dim, output_length, depth=1,
                 broadcast_state=True, inner_broadcast_state=True,
                 peek=False, dropout=0.1, **kwargs):
        super(S2s, self).__init__()
        depth = (depth, depth)
        shape = kwargs.pop('batch_input_shape')

        self.add(LSTMEncoder(batch_input_shape=shape, output_dim=hidden_dim,
                             state_input=False, return_sequences=depth[0] > 1,
                             **kwargs))
        self.add(LSTMDecoder2(hidden_dim=hidden_dim, output_length=output_length,
                              state_input=broadcast_state, **kwargs))
        self.layers[0].broadcast_state(self.layers[1])
        [self.encoder, self.decoder] = self.layers

but it succeeds if you comment out the broadcast_state() call.

Hopefully that's helpful.
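
A hypothetical driver for the class above, in case anyone wants to reproduce it directly (every value is an arbitrary placeholder):

    # Arbitrary placeholder shapes; construction alone should hit the IndexError
    model = S2s(output_dim=32, hidden_dim=64, output_length=10,
                batch_input_shape=(16, 10, 32))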

nabihach commented 8 years ago

Thanks, I'll try this out. I really don't want to comment out broadcast_state() though; it is critical for most models (at least the ones I'm using!).

SeanTater commented 8 years ago

Oh, no, I don't suggest commenting it out! I just haven't figured out what about broadcast_state is causing the problem, and thought my partial results would be helpful anyway. I'll keep working on it, too.


nabihach commented 8 years ago

Thanks! It definitely is helpful :) Keep us posted! I'll try playing around with it too.

sallamander commented 8 years ago

@nabihach Have you tried altering the source code to add the embedding layer right into the Seq2seq class?
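
Something along these lines, I mean (an untested sketch; the class name and all hyperparameters are placeholders):

    from keras.layers import Embedding
    from keras.models import Sequential
    from seq2seq.models import Seq2seq

    # Untested sketch: put the Embedding inside the container class itself
    # rather than stacking it in front of a finished Seq2seq model.
    class Seq2seqWithEmbedding(Sequential):
        def __init__(self, vocab_size, embed_dim, seq_len, batch_size,
                     embedding_weights, **seq2seq_kwargs):
            super(Seq2seqWithEmbedding, self).__init__()
            self.add(Embedding(vocab_size, embed_dim,
                               input_length=seq_len,
                               batch_input_shape=(batch_size, seq_len),
                               weights=[embedding_weights]))
            self.add(Seq2seq(batch_input_shape=(batch_size, seq_len, embed_dim),
                             **seq2seq_kwargs))

Whether building it inside the class actually sidesteps the broadcast_state problem is exactly what would need testing.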

nabihach commented 8 years ago

No, I never tried that, but it's a good idea. I'll try it and post the results here.