erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but it supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, a narrator, model finetuning, custom models, and wav file maintenance. It can also be used with third-party software via JSON calls.
GNU Affero General Public License v3.0

[BUG/Feat]: Voices support mp3's but it only lists wav files #380

Closed phazei closed 1 month ago

phazei commented 1 month ago

The XTTS voices list only shows files in the voices directory whose names end in .wav.

If I rename an mp3 to .wav, it works just fine.

I presume that works because ffmpeg handles the file regardless of its extension. It seems like the list should just include mp3 files as well.
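To illustrate the request, here is a minimal sketch of a directory listing that accepts both extensions. It is not AllTalk's actual code: the function name `list_voice_files` and the `VOICE_EXTENSIONS` set are hypothetical, and it assumes the engine can load any file it lists.

```python
from pathlib import Path

# Assumption: the extensions the engine can load; not taken from AllTalk's source.
VOICE_EXTENSIONS = {".wav", ".mp3"}


def list_voice_files(voices_dir):
    """Return the sorted names of files whose extension is in VOICE_EXTENSIONS."""
    return sorted(
        p.name
        for p in Path(voices_dir).iterdir()
        # Compare the lowercased suffix so "sample.MP3" is also picked up.
        if p.is_file() and p.suffix.lower() in VOICE_EXTENSIONS
    )
```

Matching on a lowercased suffix keeps the filter case-insensitive, so files named `.WAV` or `.MP3` would also appear in the list.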

Are there known steps to reproduce?

Add an mp3 to the voices dir.

Rename it to .wav without converting it, and watch it work.
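The reproduction above suggests the audio is decoded by content rather than by filename. As a rough way to check what a renamed file really contains, the sketch below inspects the first bytes of the file. The function name `sniff_audio_format` is hypothetical, and the check is only a heuristic: the MPEG frame-sync test can also match other MPEG audio streams, and nothing here reflects how Coqui or AllTalk actually loads files.

```python
def sniff_audio_format(path):
    """Guess 'wav', 'mp3', or 'unknown' from the first bytes of the file."""
    with open(path, "rb") as f:
        header = f.read(12)
    # A WAV file is a RIFF container: "RIFF", a 4-byte size, then "WAVE".
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # Many mp3 files begin with an ID3v2 tag.
    if header[:3] == b"ID3":
        return "mp3"
    # Otherwise look for an MPEG audio frame sync: 11 set bits at the start.
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"
```

Running this on an mp3 that was renamed to .wav should report `mp3`, which would confirm that only the extension changed.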

erew123 commented 1 month ago

Hi @phazei

This is due to updates in the Coqui engine. I'm not actually sure how it handles them internally (raw files or ffmpeg). It's something I've not had time to code around at the moment, though I will in future.

Thanks