lucidrains / routing-transformer

Fully featured implementation of Routing Transformer
MIT License

Issue about input shape #19

Closed. Henrykwokkk closed this issue 3 years ago.

Henrykwokkk commented 3 years ago

I want to use this model for a translation task. Do I need to pad each input sequence to the same length outside the model? For example, if I set max_seq_len to 2048 and batch_size to 1, but each sequence has a different length, the input size would be (1, sequence length). Do I need to use rnn_utils.pad_sequence to pad every sequence up to 2048? I ask because the sample examples are all of equal length. Thanks!
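For reference, here is a minimal sketch of the padding step the question describes, using torch.nn.utils.rnn.pad_sequence. The token ids and lengths are made up for illustration; note that pad_sequence only pads up to the longest sequence in the batch, not to max_seq_len:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# two sequences of different lengths (illustrative token ids, 0 reserved for padding)
seqs = [torch.randint(1, 20000, (37,)), torch.randint(1, 20000, (512,))]

# pads to the longest sequence in the batch (512), not to max_seq_len = 2048
batch = pad_sequence(seqs, batch_first=True, padding_value=0)  # shape (2, 512)
mask = batch != 0                                              # True on real tokens
```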

lucidrains commented 3 years ago

@guohanyang1994 hey Guo! The encoder and decoder can have different max_seq_len, and you don't need to pad to anything beyond the longest sequence in your batch. The framework takes care of the remaining padding and adjusts the mask so that it works.
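A minimal sketch of how this might look end to end, based on the encoder-decoder example in the repo's README. The keyword arguments (enc_input_mask, dec_input_mask, return_loss) follow that example; if they differ in your installed version, check the README:

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from routing_transformer import RoutingTransformerEncDec

# encoder and decoder can use different max_seq_len
model = RoutingTransformerEncDec(
    dim = 512,
    enc_num_tokens = 20000, enc_depth = 4, enc_max_seq_len = 2048, enc_window_size = 128,
    dec_num_tokens = 20000, dec_depth = 4, dec_max_seq_len = 1024, dec_window_size = 128,
)

# pad only to the longest sequence in the batch and pass masks marking real tokens
src = pad_sequence([torch.randint(1, 20000, (300,)), torch.randint(1, 20000, (450,))],
                   batch_first = True, padding_value = 0)
tgt = pad_sequence([torch.randint(1, 20000, (280,)), torch.randint(1, 20000, (420,))],
                   batch_first = True, padding_value = 0)
src_mask = src != 0
tgt_mask = tgt != 0

loss, _ = model(src, tgt,
                enc_input_mask = src_mask,
                dec_input_mask = tgt_mask,
                return_loss = True)
loss.backward()
```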