The current audio quality is relatively poor. I tried using my own voice to generate the voice acting for a game character but found the sampling rate to be quite low.
I found the SAMPLE_RATE = 24000 option in the project and changed it to 48000, but then the output sounded very strange.
So, is there a better way to output higher quality audio?
Also, is it possible for this project to use an XML-based audio markup format to better control things like tone of voice? (Or are there other functions I could use for that?)
Thank you in advance for your help!
The audio I recorded is at a 48000Hz sampling rate, in WAV format, and under 10 seconds.
This is my demo code:
# Build the voice prompt from my recording and its transcript
from utils.prompt_making import make_prompt
make_prompt(
    name="hailun",
    audio_prompt_path="test_input/hailun.wav",
    transcript="Watch out for the enemy over there!",
)
# Clone the voice
from utils.generation import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
# Download and load the models
preload_models()
text_prompt = """
Report!
The tower ahead has been breached!
"""
audio_array = generate_audio(text_prompt, prompt="hailun")
write_wav("./test_output/hailun_clone.wav", SAMPLE_RATE, audio_array)
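To illustrate the "strange" output, here is a minimal sketch. It assumes the model always generates samples at 24 kHz, so that SAMPLE_RATE only tells write_wav which rate to put in the file header. Under that assumption, writing the same samples with 48000 makes players run through them twice as fast, which shortens the clip and raises the pitch. Resampling the array before writing keeps duration and pitch, although it cannot add detail above the original 12 kHz bandwidth. The names MODEL_RATE, TARGET_RATE, playback_seconds and upsample are made up for this example and are not part of the project, and the sine tone is only a stand-in for the generate_audio output.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly

MODEL_RATE = 24000   # assumed native rate of the generated audio
TARGET_RATE = 48000  # rate wanted for the game asset


def playback_seconds(num_samples: int, header_rate: int) -> float:
    """Duration a player reports for num_samples written with header_rate."""
    return num_samples / header_rate


def upsample(audio: np.ndarray, src_rate: int, dst_rate: int) -> np.ndarray:
    """Resample with a polyphase filter so duration and pitch are preserved."""
    divisor = gcd(src_rate, dst_rate)
    return resample_poly(audio, dst_rate // divisor, src_rate // divisor)


# Stand-in for the generate_audio output: one second of a 440 Hz tone at 24 kHz.
t = np.arange(MODEL_RATE) / MODEL_RATE
audio_24k = np.sin(2 * np.pi * 440 * t).astype(np.float32)

# Only changing the rate passed to write_wav halves the playback duration.
wrong_duration = playback_seconds(len(audio_24k), TARGET_RATE)

# Resampling first doubles the sample count, so the duration stays one second.
audio_48k = upsample(audio_24k, MODEL_RATE, TARGET_RATE)
right_duration = playback_seconds(len(audio_48k), TARGET_RATE)

print(wrong_duration, right_duration)
```

With the demo code above, this would mean passing upsample(audio_array, SAMPLE_RATE, 48000) and 48000 to write_wav while leaving SAMPLE_RATE at its original value.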