k2-fsa / snowfall

Moved to https://github.com/k2-fsa/icefall

Bug in computing encoder padding mask #240

Open · csukuangfj opened this issue 2 years ago

csukuangfj commented 2 years ago

The bug happens only when --concatenate-cuts=True.

See the problematic code below (line 692): https://github.com/k2-fsa/snowfall/blob/350253144af04c295f560cdb976f817dc13b2993/snowfall/models/transformer.py#L687-L692

When --concatenate-cuts=True, several utterances may be concatenated into one sequence, so lengths[sequence_idx] can correspond to multiple utterances. If the sequence at sequence_idx contains at least two utterances, later utterances OVERWRITE the value of lengths[sequence_idx] set by earlier ones (see the sketch below).
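
To make the overwrite concrete, here is a minimal sketch of the length computation from the linked lines, paraphrased into a standalone function (the name sequence_lengths_from_supervisions and the free-standing signature are mine, not snowfall's; the loop body follows the linked code):

```python
import torch


def sequence_lengths_from_supervisions(supervision_segments: torch.Tensor):
    """Each row of supervision_segments is
    (sequence_idx, start_frame, num_frames)."""
    num_sequences = int(supervision_segments[:, 0].max().item()) + 1
    lengths = [0] * num_sequences
    for idx in range(supervision_segments.size(0)):
        sequence_idx = int(supervision_segments[idx, 0].item())
        start_frame = int(supervision_segments[idx, 1].item())
        num_frames = int(supervision_segments[idx, 2].item())
        # BUG: plain assignment. When a sequence is the concatenation of
        # several utterances, every later row for that sequence overwrites
        # the end frame computed from the earlier rows, so the final value
        # only reflects whichever row happens to be processed last.
        lengths[sequence_idx] = start_frame + num_frames
    return lengths
```

One possible fix (untested) would be to keep the maximum end frame seen so far, e.g. lengths[sequence_idx] = max(lengths[sequence_idx], start_frame + num_frames), which is correct regardless of row order.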

csukuangfj commented 2 years ago

I found this bug while writing tests for encoder_padding_mask. Liyong and I disabled --concatenate-cuts during training, so it is not a problem for us.
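
A tiny test along these lines exposes the bug (hypothetical data; it assumes the sequence_lengths_from_supervisions sketch above, and a row ordering in which the latest-ending utterance is not the last row processed, which nothing in the loop guarantees):

```python
import torch

# Sequence 0 is the concatenation of two utterances:
#   utterance A: frames 0..30   -> row (0, 0, 30)
#   utterance B: frames 30..90  -> row (0, 30, 60)
# With B's row processed before A's, the buggy loop ends with 0 + 30 = 30
# instead of the true sequence length 30 + 60 = 90.
segments = torch.tensor([
    [0, 30, 60],
    [0,  0, 30],
])

print(sequence_lengths_from_supervisions(segments))  # prints [30]; expected [90]
```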