matatonic / openedai-speech

An OpenAI API-compatible text-to-speech server using Coqui AI's xtts_v2 and/or piper tts as the backend.
GNU Affero General Public License v3.0
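For context on the thread below: the server described above exposes OpenAI's `/v1/audio/speech` endpoint, which any WebUI would ultimately call for inference. A minimal sketch of constructing such a request with only the standard library — the base URL, port, and default voice here are assumptions about a local deployment, not values taken from this project:

```python
import json
import urllib.request


def build_speech_request(base_url, text, voice="alloy", model="tts-1"):
    """Build (but do not send) a request to an OpenAI-compatible
    /v1/audio/speech endpoint.

    base_url is assumed to point at a locally running server,
    e.g. "http://localhost:8000".
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(req)` would return the synthesized audio bytes, assuming the server is running.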

WebUI for new voices #63

Open Helvio88 opened 1 month ago

Helvio88 commented 1 month ago

I've done some very basic work on a voice recording/playback/inference WebUI that could be an interesting addition to openedai-speech. I would like to know whether you'd consider an interactive WebUI for this project, or whether such a tool should stay separate and simply interface with openedai-speech to keep its focus.

My goals are currently:

  1. Any audio format upload;
  2. In-browser recording;
  3. Inference for testing;
  4. Noise / Length detection and thresholds;
  5. ???
  6. Profit!

Since the in-browser recording part would require some npm libraries, a frontend section would have to be created in this project, so I completely understand if, in your opinion, this should be kept separate.

Anyway, I think we should debate and find the most suitable place for something like this!
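On goal 4 above, a server-side sanity check on uploaded clips needs little more than the standard library. A minimal sketch — the function name, the threshold values, and the 16-bit mono PCM WAV assumption are all hypothetical, not anything this project implements:

```python
import io
import math
import struct
import wave


def check_clip(wav_bytes, min_seconds=1.0, max_seconds=30.0, min_rms=500):
    """Hypothetical validator for an uploaded voice clip.

    Assumes 16-bit mono PCM WAV input; returns (ok, reason).
    Thresholds are illustrative defaults, not tuned values.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            return False, "expected 16-bit mono PCM"
        frames = w.readframes(w.getnframes())
        duration = w.getnframes() / w.getframerate()
    # Unpack little-endian signed 16-bit samples and compute RMS level.
    samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
    rms = math.sqrt(sum(s * s for s in samples) / max(len(samples), 1))
    if duration < min_seconds:
        return False, "clip too short"
    if duration > max_seconds:
        return False, "clip too long"
    if rms < min_rms:
        return False, "clip too quiet / mostly noise floor"
    return True, "ok"
```

A real implementation would likely also resample and normalize, but a duration-plus-RMS gate like this is enough to reject empty or clipped recordings before inference.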

thiswillbeyourgithub commented 1 month ago

Not the owner, but wouldn't it be way simpler to use Gradio, for example? You could get a nice UI working in minutes, and a complex one in hours.

matatonic commented 1 month ago

I'm not normally a big fan of gradio, but I agree, I think it's far better suited for a simple UI than adding npm dependencies.

thiswillbeyourgithub commented 1 month ago

Also might be relevant: the faster-whisper-server repo has a Gradio UI: https://github.com/fedirz/faster-whisper-server

Helvio88 commented 1 month ago

That changes my project's course a little bit, but I'm all in for a new challenge! I'll see what I can initially contribute with Gradio, and perhaps we can expand from there :)

Helvio88 commented 1 month ago

Referencing PR #65 on this issue.

matatonic commented 1 month ago

Have not forgotten!

matatonic commented 3 days ago

Sorry, I still haven't found time to update and release this, but I may get to it again soon.