kaituoxu / Speech-Transformer

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
768 stars 195 forks

A question about data.py #30

Closed xiaozhi1571 closed 4 years ago

xiaozhi1571 commented 4 years ago

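Hello, in data.py the return value includes ilens. As I understand it, this should hold the lengths of xs before padding? But what I get are lengths that are already all identical.

I tested with batch_size=4: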

    # get batch of lengths of input sequences
    ilens = np.array([x.shape[0] for x in xs])
    ilens = torch.from_numpy(ilens)
    # [235, 235, 235, 235] [250, 250, 250, 250]...

It looks like xs has already been padded earlier, at this point?

    batch = load_inputs_and_targets(batch[0], LFR_m=LFR_m, LFR_n=LFR_n)
    xs, ys = batch

If ilens is wrong in this way, the enc_dec_attn_mask computed later ends up all False...
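To make the link between ilens and the mask concrete, here is a minimal pure-Python sketch of how a padding mask can be derived from a list of lengths. The function name `padding_mask` and the convention that True marks a padded frame are assumptions for illustration, not the repository's actual code, which works on tensors.

```python
def padding_mask(lengths, max_len=None):
    """Return one list of booleans per utterance; True marks a padded frame.

    Hypothetical helper: the name and the True-means-padding convention
    are assumptions, not taken from the repository.
    """
    if max_len is None:
        max_len = max(lengths)
    return [[t >= n for t in range(max_len)] for n in lengths]


# All utterances share one length, so nothing is padded.
equal = padding_mask([235, 235, 235, 235])
# Lengths differ, so the shorter utterance has padded frames at the end.
mixed = padding_mask([5, 3])

print(any(any(row) for row in equal))  # False
print(mixed[1])  # [False, False, False, True, True]
```

Under these assumptions, a mask that is False everywhere is what you would expect whenever every length in the batch equals the padded length, so it does not by itself show that the lengths were taken after padding.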

xiaozhi1571 commented 4 years ago

Ah, I see now. The data is sorted by shape beforehand... so in most batches all the utterances have the same shape.
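The effect described here can be reproduced with a small sketch: sort utterances by length, then cut the sorted order into consecutive batches. The helper `make_batches` and the frame counts are made up for illustration, and the descending sort order is an assumption; the repository's own batching code may differ.

```python
def make_batches(lengths, batch_size):
    """Sort utterance indices by length (longest first), then cut
    consecutive chunks of batch_size.

    Hypothetical helper for illustration only.
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


# Made-up frame counts for eight utterances.
lengths = [250, 235, 250, 235, 250, 235, 250, 235]
batches = make_batches(lengths, batch_size=4)
for batch in batches:
    print([lengths[i] for i in batch])
# [250, 250, 250, 250]
# [235, 235, 235, 235]
```

Because neighbours in the sorted order tend to have equal or nearly equal lengths, each batch needs little or no padding, which would explain identical values in ilens even when they are measured before padding.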