open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
https://openhlt.github.io/amphion/
MIT License
4.28k stars 364 forks

[Help]: How to output 48 kHz sample rate audio in SVC #230

Closed. ILG2021 closed this issue 1 week ago

ILG2021 commented 2 weeks ago

The default output is low quality: it is only 24 kHz, which cannot be used in production. Is there any way to improve this?

ILG2021 commented 1 week ago

Can anyone help? I am using MultipleContentsSVC.

Adorable-Qin commented 1 week ago

Hi @ILG2021 !

If you want to output audio at a sample rate of 48kHz, follow these steps:

  1. Modify your config file, e.g. set `sample_rate` to the appropriate value.
  2. Use a 48 kHz vocoder in the inference stage. You can use any pre-trained model shared by others, or use a vocoder recipe to train one yourself.
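
Step 1 can be sketched in Python as follows. This is only an illustration: it assumes the config has already been loaded into a dictionary with a "preprocess" section holding a `sample_rate` key, as described later in this thread. The helper name `set_sample_rate` and the example dictionary are hypothetical and not part of Amphion.

```python
import copy


def set_sample_rate(config: dict, sample_rate: int) -> dict:
    """Return a copy of `config` with preprocess.sample_rate set to `sample_rate`.

    Hypothetical helper: assumes the sample rate lives under a "preprocess" key.
    """
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    updated.setdefault("preprocess", {})["sample_rate"] = sample_rate
    return updated


# Hypothetical stand-in for a loaded exp_config.json
example_config = {"preprocess": {"sample_rate": 24000}}
new_config = set_sample_rate(example_config, 48000)
print(new_config["preprocess"]["sample_rate"])
```

Changing this value alone is not sufficient: the vocoder used at inference must also have been trained at the same sample rate (step 2).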

Feel free to contact me if you have any further questions.

ILG2021 commented 1 week ago

Thank you for your reply. I changed the configs as follows:

  1. In egs/svc/MultipleContentsSVC/exp_config.json, I changed sample_rate to 48000 in the "preprocess" field.
  2. In pretrained/bigvgan/args.json, I changed sample_rate to 48000 in the "preprocess" field. I am using the Amphion Singing BigVGAN.

Is this right? Should I change anything else?

Adorable-Qin commented 1 week ago

> Thank you for your reply. I changed the configs as follows:
>
>   1. In egs/svc/MultipleContentsSVC/exp_config.json, I changed sample_rate to 48000 in the "preprocess" field.
>   2. In pretrained/bigvgan/args.json, I changed sample_rate to 48000 in the "preprocess" field. I am using the Amphion Singing BigVGAN.
>
> Is this right? Should I change anything else?

Please note that the provided pretrained BigVGAN was trained at a 24 kHz sampling rate, so changing sample_rate in args.json does not make it usable at 48 kHz. You will need to find an alternative 48 kHz checkpoint online or train your own vocoder on 48 kHz data.
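
A simplified way to see why relabelling does not help: a neural vocoder emits a fixed number of samples per input frame, so the declared sample rate only changes how those samples are interpreted on playback. The sketch below is plain arithmetic, not Amphion code, and the frame count and hop size are hypothetical values chosen for illustration.

```python
def playback_duration(num_frames: int, hop_size: int, declared_sr: int) -> float:
    """Seconds of audio when num_frames * hop_size samples are played at declared_sr."""
    return num_frames * hop_size / declared_sr


# Hypothetical values: 1000 frames, 256 output samples per frame.
num_frames, hop_size = 1000, 256

d24 = playback_duration(num_frames, hop_size, 24000)  # rate the model was trained at
d48 = playback_duration(num_frames, hop_size, 48000)  # same samples, relabelled as 48 kHz

# The same samples play back in half the time (and sound one octave higher).
print(d24 / d48)
```

Under these assumptions the relabelled output is not higher-fidelity audio, just the same samples played twice as fast, which is why a vocoder actually trained on 48 kHz data is needed.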

ILG2021 commented 1 week ago

OK, I will try. Another problem: Amphion SVC requires preprocessed data for inference. Can this be improved?

Adorable-Qin commented 1 week ago

> OK, I will try. Another problem: Amphion SVC requires preprocessed data for inference. Can this be improved?

Yes, we are currently developing an on-the-fly extraction version, which will be made available in the near future.

ILG2021 commented 1 week ago

How can I use the nsfhifigan vocoder? https://github.com/openvpi/vocoders/releases If I create a folder, put the checkpoint file in it, move egs/vocoder/gan/nsfhifigan/exp_config.json into that folder, and rename it to args.json, will it work?