Hi,
Thank you for the awesome work!
I'm reading your paper and the code, and I noticed what may be an inconsistency. The paper says:
T-V encoder contains a few residual convolution blocks, but we employ Layer Normalization (LN) instead of IN to preserve temporal relationships in each instance
but the code linked below doesn't seem to contain such a normalization layer: https://github.com/winddori2002/DEX-TTS/blob/main/DEX-TTS/model/ref_encoder.py#L131
Should I add Layer Normalization to the code, or is it okay to leave it without LN?
Thank you!
Hi, thanks for your interest.
For the T-V encoder, replacing BN in the TVEncoderBlock with LN worked better. You can check the TVEncoderBlock and BasicConv classes.
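In case it helps, here is a minimal sketch of what that swap could look like, assuming a BasicConv-style block with 2D convolutions over (mel bins, frames). The class name, arguments, and shapes are illustrative, not the repository's actual code; `nn.GroupNorm` with a single group is used here as one common way to get a LayerNorm-style normalization for conv features.

```python
import torch
import torch.nn as nn


class BasicConvLN(nn.Module):
    """Sketch of a BasicConv-style block with the BatchNorm layer swapped for LN."""

    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding, bias=False)
        # GroupNorm with a single group normalizes over (C, H, W) per sample,
        # i.e. a LayerNorm-style normalization that keeps statistics within each
        # instance instead of mixing them across the batch (BN) or per channel (IN).
        self.norm = nn.GroupNorm(num_groups=1, num_channels=out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):  # x: (batch, channels, mel bins, frames)
        return self.act(self.norm(self.conv(x)))


if __name__ == "__main__":
    x = torch.randn(2, 1, 80, 120)       # dummy reference mel-spectrogram batch
    print(BasicConvLN(1, 32)(x).shape)   # torch.Size([2, 32, 80, 120])
```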
Thank you for the reply! OK. I will check.