ufal / neuralmonkey

An open-source tool for sequence learning in NLP built on TensorFlow.
BSD 3-Clause "New" or "Revised" License

multilayer encoding (decoding) #605

Open varisd opened 6 years ago

varisd commented 6 years ago

Here's an idea that has been on my mind for some time:

Currently NMonkey supports encoding-decoding of sequences of tokens [batch_size, max_seq_len], which is then represented in embedding space as [batch_size, max_seq_len, emb_size]. Let's call these 1D sequences.

However, it would be nice to be able to represent "multidimensional" sequences.

Example: I want to encode a sentence where each word embedding is created from its embedded characters using a separate encoder (the word embedding is the final state of that encoder's output).

The input in this case is a matrix of size [batch_size, max_seq_len, max_word_len] (max_seq_len is the length of the sentence measured in number of words). We then create a representation in two steps:

  1. (using the char embedding matrix) [batch_size, max_seq_len, max_word_len, char_emb_size]
  2. (using the encoder) [batch_size, max_seq_len, word_emb_size]

The input sentences are represented as strings; we define the word/character-level boundaries by specifying separators (in this case " " for words and the empty string for the character level).

Let's call this a 2D sequence. This can be of course expanded to nD sequences.
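The two-step shape flow above can be sketched as follows. This is only a shape-level illustration in NumPy, not Neural Monkey code: all sizes are made up, and a mean-over-characters plus projection stands in for the per-word recurrent encoder whose final state would be used in practice.

```python
import numpy as np

# Hypothetical sizes -- none of these names come from Neural Monkey itself.
batch_size, max_seq_len, max_word_len = 2, 5, 8
char_vocab, char_emb_size, word_emb_size = 40, 16, 32

rng = np.random.default_rng(0)

# The 2D sequence: character ids, shape [batch_size, max_seq_len, max_word_len].
char_ids = rng.integers(0, char_vocab, size=(batch_size, max_seq_len, max_word_len))

# Step 1: character embedding lookup
# -> [batch_size, max_seq_len, max_word_len, char_emb_size].
char_emb_matrix = rng.standard_normal((char_vocab, char_emb_size))
char_embs = char_emb_matrix[char_ids]

# Step 2: encode each word from its characters. Here a mean over the
# character axis followed by a projection stands in for an RNN's final
# state -- a real implementation would run a recurrent encoder instead.
# -> [batch_size, max_seq_len, word_emb_size].
projection = rng.standard_normal((char_emb_size, word_emb_size))
word_embs = char_embs.mean(axis=2) @ projection

print(word_embs.shape)  # (2, 5, 32)
```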

Technically, this can also be extended to decoding (first decode the word representations, then decode sequences of characters conditioned on them). However, I think that should be left for a separate issue.

What I would like to discuss here is a reasonable approach towards implementing this. My ideas so far:

I am currently not sure whether to perform the "recursive" encoding inside the sequence itself or rather to define a separate encoder class for it. I am also aware that some changes to the sentence reader will be necessary.

Before I start working on this, I would like to have a discussion on the topic to get some insights. Feel free to contribute any ideas.

jlibovicky commented 6 years ago

I think it is possible with the current implementation. A recurrent encoder can have multiple input sequences (called factors in our code), which must be sequences of the same length. They are concatenated and fed into an RNN. Is that what you need? Factored decoding is not implemented.
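The factor mechanism described here amounts to concatenating several temporally aligned sequences along the embedding axis before the RNN. A shape-level sketch (NumPy only, hypothetical sizes; the second factor could be a character-level word representation computed by a separate layer):

```python
import numpy as np

# Hypothetical sizes -- a sketch of the "factors" idea, not Neural Monkey code.
batch_size, max_seq_len = 2, 5
word_emb_size, factor_emb_size = 32, 16

rng = np.random.default_rng(0)

# Two factors with the same temporal length: ordinary word embeddings and,
# say, character-derived word representations from another layer.
word_factor = rng.standard_normal((batch_size, max_seq_len, word_emb_size))
char_factor = rng.standard_normal((batch_size, max_seq_len, factor_emb_size))

# Factors are concatenated along the embedding axis; the result is what
# gets fed into the recurrent encoder at each time step.
rnn_input = np.concatenate([word_factor, char_factor], axis=-1)
print(rnn_input.shape)  # (2, 5, 48)
```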

You could implement the character-level representation as a layer inheriting from TemporalStateful and plug it into the encoder as an additional factor. I recently did something similar for a different project, so I can help you with that.