yuzuda283 opened this issue 4 months ago (status: Open)
Will the issue described in https://github.com/facebookresearch/encodec/issues/25 have any impact on the training of SpeechTokenizer?
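Good question. After all, during distillation the gradient of the encoder's output feature would not be back-propagated to the preceding convolutional layers and other structures.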
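I see now. The first-layer codebook also uses the straight-through estimator during training, so the feature still carries the computation graph of the preceding encoder.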
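
For anyone else reading, here is a minimal PyTorch sketch of why the straight-through estimator keeps the encoder in the graph. It is not taken from the SpeechTokenizer or encodec code: `encoder`, `codebook`, and `teacher` are hypothetical stand-ins (a single linear layer, a random 8-entry codebook, and a random target), and the MSE loss is only a placeholder for a distillation loss.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins, not the real models.
encoder = torch.nn.Linear(4, 3)   # plays the role of the conv encoder
codebook = torch.randn(8, 3)      # 8 code vectors of dimension 3

x = torch.randn(2, 4)
feature = encoder(x)              # shape (2, 3), attached to the encoder graph

# Nearest-neighbour lookup. argmin is not differentiable, so the
# looked-up vectors have no path back to the encoder.
dists = ((feature[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
indices = dists.argmin(dim=1)
quantized_raw = codebook[indices]

# Straight-through estimator: the forward value equals the code vector,
# while the backward pass treats quantization as identity w.r.t. `feature`.
quantized_ste = feature + (quantized_raw - feature).detach()

# Placeholder distillation target and loss (assumptions for illustration).
teacher = torch.randn(2, 3)
loss = torch.nn.functional.mse_loss(quantized_ste, teacher)
loss.backward()

raw_has_graph = quantized_raw.requires_grad
ste_has_graph = quantized_ste.requires_grad
encoder_grad_norm = encoder.weight.grad.norm().item()
print(raw_has_graph, ste_has_graph, encoder_grad_norm > 0)
```

In this sketch the plain lookup result is detached from the encoder, while the straight-through version passes the loss gradient to `encoder.weight`, which matches the explanation above.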