farizrahman4u / seq2seq

Sequence to Sequence Learning with Keras
GNU General Public License v2.0

Seq2seq multiple input features (Passing multiple word/word tokens as input) #205

Open iamsiva11 opened 7 years ago

iamsiva11 commented 7 years ago

Is there a way to pass extra feature tokens along with the existing word token (training features/source vocabulary) and feed them to the encoder RNN of seq2seq? Currently it accepts only one word token from the sentence at a time.

Let me put this more concretely. Consider the example of machine translation/NMT: say I have 2 more feature columns for the corresponding source vocabulary set (Feature1 here). For example, consider the table below:

```
+----------+----------+----------+
| Feature1 | Feature2 | Feature3 |
+----------+----------+----------+
| word1    | x        | a        |
| word2    | y        | b        |
| word3    | y        | c        |
+----------+----------+----------+
```

To summarise: currently the seq2seq dataset is a parallel corpus with a one-to-one mapping between the source feature (the vocabulary, i.e. Feature1 alone) and the target (label/vocabulary). I'm looking for a way to map more than one feature (i.e. Feature1, Feature2, Feature3) to the target (label/vocabulary).
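To make the shape of the input I'm after concrete, here is a rough Keras sketch of the kind of encoder input I imagine: each source position carries the word plus two extra token-level features, each feature gets its own Embedding, and the embeddings are concatenated per timestep before the encoder RNN. All layer sizes and names (MAX_LEN, NUM_WORDS, NUM_F2, NUM_F3) are placeholders, not anything from this library:

```python
from keras.layers import Input, Embedding, LSTM, concatenate
from keras.models import Model

MAX_LEN = 20          # source sequence length (placeholder)
NUM_WORDS = 10000     # size of the word vocabulary (Feature1)
NUM_F2 = 50           # size of the Feature2 vocabulary
NUM_F3 = 50           # size of the Feature3 vocabulary

# one integer sequence per feature column
word_in = Input(shape=(MAX_LEN,), name="feature1_words")
f2_in = Input(shape=(MAX_LEN,), name="feature2")
f3_in = Input(shape=(MAX_LEN,), name="feature3")

# separate embeddings, concatenated per timestep
word_emb = Embedding(NUM_WORDS, 128)(word_in)
f2_emb = Embedding(NUM_F2, 16)(f2_in)
f3_emb = Embedding(NUM_F3, 16)(f3_in)
merged = concatenate([word_emb, f2_emb, f3_emb])  # shape: (batch, MAX_LEN, 160)

# the merged sequence is what the encoder RNN would then consume
encoded = LSTM(256)(merged)
```

The question is how to get this kind of multi-feature encoder input working with the seq2seq models in this repo, which as far as I can tell take a single token sequence.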

Moreover, I believe this is glossed over in the seq2seq-pytorch tutorial (https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation.ipynb), as quoted below:

> When using a single RNN, there is a one-to-one relationship between inputs and outputs. We would quickly run into problems with different sequence orders and lengths that are common during translation… With the seq2seq model, by encoding many inputs into one vector, and decoding from one vector into many outputs, we are freed from the constraints of sequence order and length. The encoded sequence is represented by a single vector, a single point in some N dimensional space of sequences. In an ideal case, this point can be considered the "meaning" of the sequence.

Furthermore, I tried TensorFlow; it took me a lot of time to debug and make appropriate changes, and I got nowhere. I have also heard from colleagues that PyTorch would have the flexibility to do this and would be worth checking out.

Please share your thoughts on how to achieve this. It would be great if anyone could explain how to practically implement/get this done. Thanks in advance.