Hello,
In the implementation, the output sequence always has the same length as the input sequence. Perhaps I have misunderstood, but is this a requirement? In the bAbI task, for example, the target is built by replacing the "-" symbols with the correct words and leaving every other word unchanged, so the input and target lengths are identical.
How could one predict an output sequence whose length differs from the input's (many-to-many), or just a single output timestep (many-to-one)? For the latter case, would you simply take the last timestep of the output?
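To make the many-to-one case concrete, here is a minimal sketch of what I mean by "take the last timestep". It is not taken from the implementation: the function name `last_timestep`, the `lengths` argument, and the nested-list layout `outputs[b][t]` are all assumptions I made up for illustration.

```python
from typing import List, Optional, Sequence

Vector = List[float]


def last_timestep(
    outputs: Sequence[Sequence[Vector]],
    lengths: Optional[Sequence[int]] = None,
) -> List[Vector]:
    """Reduce per-timestep outputs to one vector per sequence (many-to-one).

    outputs: batch of sequences; outputs[b][t] is the output vector at step t
             (assumed layout, not the implementation's actual tensor format).
    lengths: hypothetical true (unpadded) length of each sequence; if omitted,
             every sequence is assumed to use all of its timesteps.
    """
    if lengths is None:
        lengths = [len(seq) for seq in outputs]
    if len(lengths) != len(outputs):
        raise ValueError("need one length per sequence")

    picked = []
    for seq, n in zip(outputs, lengths):
        if not 1 <= n <= len(seq):
            raise ValueError("length must be between 1 and the padded length")
        # Take the last *valid* step, so padding does not leak into the result.
        picked.append(list(seq[n - 1]))
    return picked


# Toy batch: 2 sequences, 3 timesteps, 2 output features.
# The second sequence is padded after its second step.
batch = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.0, 1.1], [1.2, 1.3], [0.0, 0.0]],
]
final = last_timestep(batch, lengths=[3, 2])
```

With this, the loss would be computed on `final` only, rather than on every timestep. Is that the intended way to do it, or is there a better approach?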
Thanks!