perm is unnecessary in training step

Open · czy97 opened 5 years ago

Hi. I don't think the perm and unperm actions in your code make any difference: the perm action is along the batch dimension, and in the forward pass the samples along the batch dimension are independent of one another. Or have I misunderstood your purpose?

@czy97 Before the perm action is applied there is a torch.reshape(). This changes each batch from (N, M, sequence_length, features) to (N*M, sequence_length, features) so that each speaker's utterances appear one after another (e.g. U11, U12, U13, ..., U1M, U21, U22, ..., U2M, ..., UN1, UN2, ..., UNM). The perm action then shuffles this batch, but in a way that is reversible.
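For illustration, here is a minimal sketch of the reshape-then-reversible-shuffle pattern described above. The sizes and variable names (N, M, T, F, perm, unperm) are assumptions for the example, not taken from the repository:

```python
import torch

# Illustrative sizes (assumed): N speakers, M utterances per speaker,
# T frames per utterance, F features per frame.
N, M, T, F = 4, 5, 160, 40
batch = torch.randn(N, M, T, F)

# Reshape so each speaker's utterances sit one after another:
# row order becomes U11, ..., U1M, U21, ..., U2M, ..., UN1, ..., UNM.
flat = torch.reshape(batch, (N * M, T, F))

# A reversible shuffle along the batch dimension:
# `perm` scrambles the rows, `unperm` is its inverse permutation.
perm = torch.randperm(N * M)
unperm = torch.empty_like(perm)
unperm[perm] = torch.arange(N * M)

shuffled = flat[perm]        # mix utterances across speakers
restored = shuffled[unperm]  # undo the shuffle after the forward pass

assert torch.equal(restored, flat)
```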