Closed Haurrus closed 3 months ago
The recent updates to the text-to-speech (TTS) API server make file handling more flexible and extend audio processing. The request classes gain optional save_path and speaker_wav parameters, the streaming endpoint now takes a TTSStreamRequest object, and the server's default port changes from 8002 to 8020.
| Files | Change Summary |
|---|---|
| server.py | Added typing.Optional import; modified request classes to include optional save_path and speaker_wav parameters; updated running port from 8002 to 8020; updated endpoint to use TTSStreamRequest object |
| tts_funcs.py | Imported torchaudio.transforms; updated switch_model to assign model_name; added create_latents_for_all method; enhanced audio processing and saving in stream_generation and local_generation; improved file naming in process_tts_to_file |
server.py:
Modified the request classes to add optional arguments, such as a path for the output audio on the tts_to_audio endpoint. speaker_wav is also optional, because I added a failover mechanism in tts_funcs.py that falls back to the reference.wav of fine-tuned models. This requires a new import: from typing import Optional. I also created a TTSStreamRequest class, because the streaming endpoint was the only one without a request class.
Also modified this line to match the default in main.py: MODEL_FOLDER = os.getenv('MODEL', 'xtts_models')
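As a rough illustration of the optional request fields described for server.py, here is a minimal sketch. It uses standard-library dataclasses to stay dependency-free; the actual server may define these classes differently. Only save_path, speaker_wav and the TTSStreamRequest name come from the description; TTSFileRequest, text and language are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TTSFileRequest:
    """Hypothetical request body for the file-generating endpoint."""
    text: str
    language: str = "en"  # assumed field, not named in the description
    # Optional output location; None means "let the server pick a default".
    save_path: Optional[str] = None
    # Optional reference voice; None leaves the choice to the failover logic.
    speaker_wav: Optional[str] = None


@dataclass
class TTSStreamRequest:
    """Hypothetical request body for the streaming endpoint."""
    text: str
    language: str = "en"  # assumed field, not named in the description
    speaker_wav: Optional[str] = None


# Callers may omit the optional fields entirely...
minimal = TTSFileRequest(text="Hello")
# ...or set them explicitly.
full = TTSFileRequest(text="Hello", save_path="out/hello.wav",
                      speaker_wav="voices/sample.wav")
```

Giving each optional field an explicit None default keeps existing clients working, since requests that never send save_path or speaker_wav still validate.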
tts_funcs.py:
Added this import at the top: import torchaudio.transforms as T. I use it to convert the audio output to a more commonly used encoding, so it can be used widely without degradation. The local_generation function has been heavily modified because of this (maybe I should modify stream_generation in the same way).
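The description does not say which encoding was chosen. As a sketch of the general idea only, the snippet below writes float samples as a 16-bit PCM WAV file, a format most players and tools accept. It deliberately uses the standard library rather than torchaudio, and the write_pcm16_wav helper, the mono layout and the 24000 Hz default are all assumptions made for illustration.

```python
import struct
import wave
from typing import Sequence


def write_pcm16_wav(samples: Sequence[float], path: str,
                    sample_rate: int = 24000) -> None:
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file.

    Hypothetical helper; the sample rate default is an assumption.
    """
    # Clamp first so out-of-range values cannot overflow the 16-bit range.
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(ints)}h", *ints))
```

Clamping before scaling is the important step: float output from a model can slightly exceed the nominal range, and an unclamped conversion would fail or distort.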
In process_tts_to_file: added a failover mechanism that uses the reference.wav from the loaded model if it exists.
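The failover described for process_tts_to_file could look roughly like this. It is a sketch under assumptions, not the code from the PR: resolve_speaker_wav and its model_dir parameter are hypothetical names, and the real function may locate reference.wav differently.

```python
from pathlib import Path
from typing import Optional


def resolve_speaker_wav(speaker_wav: Optional[str], model_dir: str) -> Path:
    """Return the requested speaker file, else the model's reference.wav.

    Hypothetical helper illustrating the failover idea.
    """
    # Prefer the caller's file when one was given and it actually exists.
    if speaker_wav:
        candidate = Path(speaker_wav)
        if candidate.is_file():
            return candidate
    # Otherwise fall back to the reference.wav shipped with the loaded model.
    fallback = Path(model_dir) / "reference.wav"
    if fallback.is_file():
        return fallback
    raise FileNotFoundError(
        f"No usable speaker file: {speaker_wav!r} not found and "
        f"{fallback} does not exist"
    )
```

Raising when neither file exists keeps the failure visible at request time instead of passing a missing path further into generation.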