tatp22 / linformer-pytorch

My take on a practical implementation of Linformer for PyTorch.
https://arxiv.org/pdf/2006.04768.pdf
MIT License

input seq length #2

Closed: xinqipony closed this 4 years ago

xinqipony commented 4 years ago

Great work! I noticed the Linformer input is (batch_size, seq_len, channels). Can seq_len be variable, or should the attention be masked if the sequence is padded? Why is seq_len a fixed length?

tatp22 commented 4 years ago

Hi! The reason why seq_len is a fixed length is that, in my implementation, the learned E and F projection matrices are nn.Linear layers, which have to know the sequence length beforehand. To work around that, however, I created the Padder class, which automatically pads your input sequence if it happens to be of a length other than input_size. You can use it in your code like this:
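A minimal usage sketch, following the Padder pattern in this repo's README (the Linformer constructor arguments below are illustrative; check the README for the full parameter list):

```python
import torch
from linformer_pytorch import Linformer, Padder

# Build a Linformer that expects a fixed sequence length of input_size=512.
# The learned E and F projections are nn.Linear layers sized to this length.
model = Linformer(
    input_size=512,  # fixed sequence length the model is built for
    channels=16,     # feature dimension of the input
    dim_k=16,        # projected dimension k from the paper
    dim_ff=32,       # feed-forward dimension
    nhead=4,         # number of attention heads
    depth=2,         # number of stacked layers
)

# Wrap the model so inputs of other lengths are padded up to input_size.
model = Padder(model)

x = torch.randn(1, 500, 16)  # seq_len=500 does not match input_size=512
y = model(x)                 # Padder pads to 512 internally
print(y.shape)               # torch.Size([1, 500, 16]) per the README example
```

Because E and F are fixed-size linear layers over the sequence dimension, padding up to input_size (rather than a fully dynamic seq_len) is what keeps the learned projections valid.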