ZhangXInFD / SpeechTokenizer

This is the code for the SpeechTokenizer presented in the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are available at
https://0nutation.github.io/SpeechTokenizer.github.io/
Apache License 2.0

Zero grad issue in encodec? #10

Open yuzuda283 opened 4 months ago

yuzuda283 commented 4 months ago

https://github.com/facebookresearch/encodec/issues/25 Will this issue have any impact on the training of SpeechTokenizer?

UkiTenzai commented 3 weeks ago

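Good question. After all, during distillation the gradient of the feature output by the encoder cannot be backpropagated to the preceding convolutional layers and other structures.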

UkiTenzai commented 3 weeks ago

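I figured it out. The first codebook layer also uses the straight-through estimator during training, so the feature still carries the computation graph of the preceding encoder.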
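
For anyone else reading this thread, here is a minimal scalar sketch of why the straight-through estimator keeps the encoder trainable. It is a toy, not code from SpeechTokenizer or encodec: the names `quantize`, `loss_and_grads`, `w`, `x`, and `target` are hypothetical, the "encoder" is a single multiplication, and `target` stands in for a teacher feature in a distillation loss. The backward pass is written out by hand so the two gradient rules can be compared side by side.

```python
# Toy scalar illustration of the straight-through estimator (STE).
# All names are hypothetical; nothing here is taken from SpeechTokenizer or encodec.

def quantize(z, codebook):
    """Return the codebook entry nearest to z."""
    return min(codebook, key=lambda c: abs(c - z))


def loss_and_grads(w, x, target, codebook):
    """Compute the loss and d(loss)/dw with and without STE."""
    z = w * x                        # "encoder" output
    q = quantize(z, codebook)        # quantized feature (piecewise constant in z)
    loss = (q - target) ** 2         # stand-in for a distillation loss on q

    dloss_dq = 2.0 * (q - target)
    dz_dw = x

    # Exact derivative: dq/dz is 0 almost everywhere, so nothing reaches w.
    grad_true = dloss_dq * 0.0 * dz_dw
    # STE: the backward pass pretends dq/dz is 1, so the gradient reaches w.
    grad_ste = dloss_dq * 1.0 * dz_dw
    return loss, grad_true, grad_ste


codebook = [-1.0, 0.0, 1.0]
loss, grad_true, grad_ste = loss_and_grads(w=0.4, x=2.0, target=0.5, codebook=codebook)
print(loss, grad_true, grad_ste)
```

In PyTorch the same trick is commonly written as `q = z + (q - z).detach()`, which leaves the forward value equal to the quantized one while making the backward pass treat the quantizer as the identity. Whether a given training setup hits the zero-gradient problem depends on where the loss is attached and whether that line is active, so it is worth checking in the actual quantizer code.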