saharmor / whisper-playground

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/
MIT License
777 stars 140 forks

Implemented Real-time Audio Transcription with Speaker Diarization #23

Closed ethanzrd closed 1 year ago

ethanzrd commented 1 year ago

Changelog:

Client Modifications:

Server Modifications:

Future modifications:

Credits:

Color Your Captions: Streamlining Live Transcriptions With “diart” and OpenAI’s Whisper - This allowed for the well thought-out implementation of Diart and Whisper I've used here. Thanks to Juanma Coria (the creator of Diart)!

ethanzrd commented 1 year ago

Known bugs:

Edit: The latest commit, which improves responsiveness, should fix the first problem if it is indeed caused by socket timeouts. It should also pave the way for multi-client support, which will be explored on July 28th.

saharmor commented 1 year ago

It doesn't have to be through Conda. You can also install via other means (e.g. pip install, where appropriate).

It's somewhat overkill to install Conda for this purpose alone (it weighs 3GB and installs many things).


On Fri, Aug 04, 2023 at 11:12:06, Ethan Zerad < @.*** > wrote:

@.**** commented on this pull request.

In install_playground.sh (https://github.com/saharmor/whisper-playground/pull/23#discussion_r1284715363):

\ No newline at end of file +cd ../backend

That's what Diart requires:

conda install portaudio pysoundfile ffmpeg -c conda-forge
