matatonic / openedai-speech

An OpenAI API-compatible text-to-speech server using Coqui AI's xtts_v2 and/or piper tts as the backend.
GNU Affero General Public License v3.0
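For context on the thread below: the server described above exposes OpenAI's `/v1/audio/speech` endpoint, which any WebUI would ultimately call for inference. A minimal sketch of constructing such a request with only the standard library — the base URL, port, and default voice here are assumptions about a local deployment, not values taken from this project:

```python
import json
import urllib.request


def build_speech_request(base_url, text, voice="alloy", model="tts-1"):
    """Build (but do not send) a request to an OpenAI-compatible
    /v1/audio/speech endpoint.

    base_url is assumed to point at a locally running server,
    e.g. "http://localhost:8000".
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` would return the synthesized audio bytes, assuming the server is running.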

WebUI for new voices #63

Open Helvio88 opened 1 month ago

Helvio88 commented 1 month ago

I've done some very basic work on a voice recording/playback/inference WebUI that could be an interesting addition to openedai-speech. I would like to know whether you'd consider an interactive WebUI for this project, or whether such a tool should stay separate and simply interface with openedai-speech to keep its focus.

My goals are currently:

  1. Any audio format upload;
  2. In-browser recording;
  3. Inference for testing;
  4. Noise / Length detection and thresholds;
  5. ???
  6. Profit!

Since the in-browser recording part would require some npm libraries, a frontend section would have to be created in this project, so I completely understand if, in your opinion, this should be kept separate.

Anyway, I think we should debate and find the most suitable place for something like this!
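On goal 4 above, a server-side sanity check on uploaded clips needs little more than the standard library. A minimal sketch — the function name, the threshold values, and the 16-bit mono PCM WAV assumption are all hypothetical, not anything this project implements:

```python
import io
import math
import struct
import wave


def check_clip(wav_bytes, min_seconds=1.0, max_seconds=30.0, min_rms=500):
    """Hypothetical validator for an uploaded voice clip.

    Assumes 16-bit mono PCM WAV input; returns (ok, reason).
    Thresholds are illustrative defaults, not tuned values.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            return False, "expected 16-bit mono PCM"
        frames = w.readframes(w.getnframes())
        duration = w.getnframes() / w.getframerate()
    # Unpack little-endian signed 16-bit samples and compute RMS level.
    samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
    rms = math.sqrt(sum(s * s for s in samples) / max(len(samples), 1))
    if duration < min_seconds:
        return False, "clip too short"
    if duration > max_seconds:
        return False, "clip too long"
    if rms < min_rms:
        return False, "clip too quiet / mostly noise floor"
    return True, "ok"
```

A real implementation would likely also resample and normalize, but a duration-plus-RMS gate like this is enough to reject empty or clipped recordings before inference.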

thiswillbeyourgithub commented 1 month ago

Not the owner, but wouldn't it be way simpler to use Gradio, for example? You could get a nice UI working in minutes, and a complex one in hours.

matatonic commented 1 month ago

I'm not normally a big fan of gradio, but I agree, I think it's far better suited for a simple UI than adding npm dependencies.

thiswillbeyourgithub commented 1 month ago

Also might be relevant: the faster-whisper-server repo has a Gradio UI: https://github.com/fedirz/faster-whisper-server

Helvio88 commented 1 month ago

That changes my project's course a little bit, but I'm all in for a new challenge! I'll see what I can initially contribute with Gradio, and perhaps we can expand from there :)

Helvio88 commented 1 month ago

Referencing PR #65 on this issue.

matatonic commented 1 month ago

Have not forgotten!

matatonic commented 3 days ago

Sorry, I still haven't found time to update and release this, but I may get to it again soon.