Closed: Shijie-Liu007 closed this issue 3 years ago
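@Shijie-Liu007 I don't think the mel spectrogram needs to be split. Previously, one frame corresponded to 256 audio samples; with 4 bands, one frame only needs to correspond to 64 samples per band. That's my understanding: the mel spectrogram is the same input as before, and only the upsampling factor becomes 1/4 of what it was.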
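To make the frame arithmetic in this reply concrete, here is a minimal sketch. It assumes each sub-band is downsampled by the number of bands (a critically decimated filter bank), which is not stated in the thread. The function name `subband_layout` and the dictionary keys are hypothetical and are not part of the fatchord code; the hop length of 256 and the 4 bands come from the reply above.

```python
def subband_layout(num_frames: int, hop_length: int = 256, num_bands: int = 4) -> dict:
    """Describe how unsplit mel frames line up with sub-band audio.

    Assumption: every sub-band is decimated by `num_bands`, so each band
    has 1/num_bands as many samples as the full-band waveform.
    """
    if hop_length % num_bands != 0:
        raise ValueError("hop_length must be divisible by num_bands")

    # Each mel frame now covers hop_length / num_bands samples in every band.
    samples_per_frame = hop_length // num_bands

    return {
        # The mel spectrogram is used as-is: same number of frames as before.
        "mel_frames": num_frames,
        "fullband_samples": num_frames * hop_length,
        # This is also the upsampling factor the conditioning network needs.
        "samples_per_frame_per_band": samples_per_frame,
        # Training target: one row of samples per sub-band.
        "target_shape": (num_bands, num_frames * samples_per_frame),
    }


layout = subband_layout(num_frames=100)
print(layout["samples_per_frame_per_band"])
print(layout["target_shape"])
```

Under these assumptions the conditioning input keeps its original shape, and only the upsampling factor between mel frames and output samples changes from 256 to 64.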
Thank you very much for the reply! I'll try it the way you suggested first. Thanks again!
Hey all!
I want to implement multi-band WaveRNN based on the fatchord version. For training, I split the audio samples into 4 subbands using an analysis filter, but how should I split the mel spectrogram so that it corresponds to the audio subbands? Can I simply divide it into four parts in order? Opinions and ideas about multi-band WaveRNN would be greatly appreciated!
Reference: DurIAN: Duration Informed Attention Network For Multimodal Synthesis https://arxiv.org/abs/1909.01700