[Help]: Questions about FACodec's Parameter

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

https://openhlt.github.io/amphion/

MIT License

[Help]: Questions about FACodec's Parameter #194

Closed Pydataman closed 2 months ago

Pydataman commented 5 months ago

if self.use_gr_x_timbre:
    self.x_timbre_predictor = nn.Sequential(
        GradientReversal(alpha=1),
        CNNLSTM(in_channels, 245200, 1, global_pred=True),
    )

Why is there a parameter 245200 in the code?

HeCheng0625 commented 5 months ago

This part of the parameters is used to predict the speaker id during the training process and is not used during inference. Please ignore it.
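For readers unfamiliar with this pattern, here is a minimal sketch of a training-only prediction head placed behind a gradient reversal layer, assuming PyTorch. `GradReverse` follows a common formulation of gradient reversal, and `ToySpeakerHead` with its linear classifier is a hypothetical stand-in, not Amphion's `GradientReversal` or `CNNLSTM`. The sizes (4 channels, 5 classes) are arbitrary.

```python
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -alpha in the backward pass."""

    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # The second return value is for alpha, which needs no gradient.
        return -ctx.alpha * grad_output, None


class ToySpeakerHead(nn.Module):
    """Hypothetical stand-in for a global (per-utterance) prediction head."""

    def __init__(self, in_channels, num_classes, alpha=1.0):
        super().__init__()
        self.alpha = alpha
        self.classifier = nn.Linear(in_channels, num_classes)

    def forward(self, x):
        # x: (batch, in_channels, time). Averaging over time gives one
        # prediction per utterance.
        pooled = x.mean(dim=-1)
        return self.classifier(GradReverse.apply(pooled, self.alpha))


def reversed_grad_demo():
    """Compare gradients w.r.t. the input with and without the reversal layer."""
    x = torch.ones(2, 4, 3, requires_grad=True)
    head = ToySpeakerHead(in_channels=4, num_classes=5)
    logits = head(x)                          # path with gradient reversal
    plain = head.classifier(x.mean(dim=-1))   # same classifier, no reversal
    (g_rev,) = torch.autograd.grad(logits.sum(), x)
    (g_plain,) = torch.autograd.grad(plain.sum(), x)
    return logits.shape, g_rev, g_plain
```

In this sketch the head only contributes a training loss. At inference the head is simply never called, so its output size has no effect on the generated audio.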

Pydataman commented 5 months ago

> This part of the parameters is used to predict the speaker id during the training process and is not used during inference. Please ignore it.

@HeCheng0625 Does that mean there were 245200 speakers during training?

jiaqili3 commented 2 months ago

Hi, we're closing this thread because the question about the number of speakers in FACodec training is not directly related to this repo. For more details on training, we recommend looking at the latest FACodec reproduction in the latest Amphion release (currently in PR). Thanks!