-
Hello,
Since the best ASR models now mostly use pretrained large models as input features and are then fine-tuned on a specific domain, it would be very convenient to have the large model feat…
-
>python train_finetune.py --config_path ./Configs/config_ft.yml
Some weights of the model checkpoint at microsoft/wavlm-base-plus were not used when initializing WavLMModel: ['encoder.pos_conv_embed.…
-
# Problems in reproducing the same numbers with default downstream configs.
Are the default downstream configs different from the ones used in SUPERB?
I tried to access it via https://github.co…
-
I want to train the Chinese model. Do you support mixed input in Chinese and English?
-
Hi, thanks for this amazing TTS system. The inference quality is the best I have heard from an open-source system, and it works well and very fast under Windows. However, the fine-tune script does not appear …
-
### System Info
**Relevant Libraries**
transformers==4.26.1
torchaudio==2.0.2
torch==2.0.1
OS: Ubuntu 20.04
### Who can help?
@sanchit-gandhi
### Information
- [ ] The official e…
-
Since I found those issues with the 24 kHz preprocessing, I figured I might as well run a 48 kHz model instead of the 24 kHz one, since I have to start from the 900k model regardless.
I have already updated the…
-
First, thanks for the paper and the code, this is very interesting!
Did you happen to do any testing with other versions of WavLM, such as Base or Base+? I was wondering if it would be possible to ma…
-
In your paper, you say:
> Recent work confirms that later layers give poorer predictions of pitch, prosody, and speaker identity. Based on these observations, we found that using a layer with high …
-
1. Does SLMGAN use R1 regularization like StarGANv2-VC? If so, how should it be applied? If not, should I use any other techniques to stabilize GAN training?
2. After epoch 20, the SLM-based discriminato…