jihunchoi / recurrent-batch-normalization-pytorch

PyTorch implementation of recurrent batch normalization

Why do you NOT use packed padding? #14

Open brando90 opened 5 years ago

brando90 commented 5 years ago

Why do you NOT use packed padding, but instead use masks?

jihunchoi commented 5 years ago

Hi, to my understanding, the most common use of a packed sequence is as input to the pre-defined RNN modules (e.g. torch.nn.LSTM, torch.nn.GRU, ...). However, a batch-normalized RNN requires modifying the computation of the recurrent components, so the time-step loop has to be written by hand anyway, and there is no advantage to using packed sequences over masks. Also, at the time of implementation, pack_padded_sequence could not accept unsorted sequences (i.e. the input had to be sorted by length), which would have introduced another complication.
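For illustration, here is a minimal sketch of the mask-based approach in a hand-written time-step loop. The helper name `run_rnn_with_mask` and the use of `torch.nn.LSTMCell` as a stand-in for a batch-normalized cell are assumptions for the example, not code from this repository; any cell with the signature `cell(x_t, (h, c)) -> (h, c)` would work the same way.

```python
import torch

def run_rnn_with_mask(cell, inputs, lengths, h0, c0):
    """Unroll `cell` over time, masking out padded steps.

    inputs:  (seq_len, batch, input_size)
    lengths: (batch,) true length of each sequence
    cell:    any recurrent cell with signature cell(x_t, (h, c)) -> (h, c)
    """
    seq_len, batch, _ = inputs.size()
    h, c = h0, c0
    outputs = []
    for t in range(seq_len):
        h_next, c_next = cell(inputs[t], (h, c))
        # mask[b] is 1 while step t is inside sequence b, 0 on padding
        mask = (t < lengths).float().unsqueeze(1)
        # carry the previous state forward unchanged for finished sequences
        h = h_next * mask + h * (1 - mask)
        c = c_next * mask + c * (1 - mask)
        outputs.append(h)
    return torch.stack(outputs, 0), (h, c)

# Usage: a plain LSTMCell stands in for a custom batch-normalized cell.
cell = torch.nn.LSTMCell(10, 20)
inputs = torch.randn(5, 3, 10)          # seq_len=5, batch=3
lengths = torch.tensor([5, 3, 2])       # unsorted lengths are fine here
h0 = torch.zeros(3, 20)
c0 = torch.zeros(3, 20)
out, (h, c) = run_rnn_with_mask(cell, inputs, lengths, h0, c0)
```

Note that later PyTorch releases do accept unsorted input via `pack_padded_sequence(..., enforce_sorted=False)`, but packing still only pays off with the built-in RNN modules; with a custom cell, the mask approach stays simpler.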