Open julyanghar opened 5 days ago
According to the original paper, it seems that line 367 in TDANet should be changed as follows:
    # expanded = self.last_layer[i](x_fused[i], x_fused[i - 1])
    expanded = self.last_layer[i](x_fused[i], x_fused[i + 1])
This is because the first embedding in the decoder should be produced from the top global feature and the one upsampled by a factor of 2.
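To make the indexing argument concrete, here is a toy sketch in plain Python. It does not reproduce the TDANet module. It assumes the features are ordered from fine to coarse, so that level `i + 1` is half the length of level `i`, and it uses hypothetical helpers `upsample2` and `fuse` as stand-ins for the 2x upsampling and for `last_layer[i]`. The starting index `depth - 2` is also an assumption made for illustration.

```python
def upsample2(x):
    """Nearest-neighbour upsampling by a factor of 2 (hypothetical helper)."""
    return [v for v in x for _ in range(2)]


def fuse(local, global_):
    """Hypothetical stand-in for last_layer[i]: add the 2x-upsampled
    global feature to the local feature, element by element."""
    up = upsample2(global_)
    if len(up) != len(local):
        raise ValueError(
            f"length mismatch: local={len(local)}, upsampled global={len(up)}"
        )
    return [a + b for a, b in zip(local, up)]


# Toy multi-scale features (assumption): level i has length 16 // 2**i,
# so the last level is the coarsest ("top") one.
depth = 4
x_fused = [[float(i)] * (16 // 2**i) for i in range(depth)]

i = depth - 2  # assumed first decoder step, just below the top level

# Pairing level i with the coarser level i + 1: the lengths agree after 2x upsampling.
expanded = fuse(x_fused[i], x_fused[i + 1])
```

In this toy setup, `fuse(x_fused[i], x_fused[i - 1])` raises a `ValueError`, because level `i - 1` is finer than level `i` and upsampling it makes the lengths diverge further.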
Yes, you are right. In TDANet v2, we have changed this code: https://github.com/JusperLee/TDANet/blob/565af18692e18bf695e5bb0ca54ba466c4a86a2a/look2hear/models/TDANet-v2.py#L377
Good to know!