SevaSk / ecoute

Ecoute is a live transcription tool that provides real-time transcripts for both the user's microphone input (You) and the user's speakers output (Speaker) in a textbox. It also generates a suggested response using OpenAI's GPT-3.5 for the user to say based on the live transcription of the conversation.
https://github.com/SevaSk/ecoute
MIT License
5.85k stars 817 forks

Porting to Mac and improving performance #27

Closed oldsongsz closed 1 year ago

oldsongsz commented 1 year ago

This is a really good project. I'm learning from it.

I just did some quick hard-coding and got it working on my MacBook Air (Intel model). It runs pretty slowly.

Could you share any suggestions for optimizing it? (Yes, I'm asking Google and ChatGPT at the same time.)

Thanks!

P.S. I'm trying it on an ARM Linux SBC as well; hope it works.

SevaSk commented 1 year ago

Thanks! If the transcription is slow, the most likely culprit is this logging message: print(f"[INFO] Whisper using GPU: " + str(torch.cuda.is_available()))

If it prints false, Whisper cannot transcribe on the GPU and will use the CPU instead, which is much slower.

Outside of Whisper itself, none of the code should run noticeably slowly on any OS or computer.
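To make the diagnosis above concrete, here is a minimal sketch of that GPU check and the resulting device selection. It assumes PyTorch is installed; torch.cuda.is_available() is True only on machines with an NVIDIA GPU and a CUDA-enabled PyTorch build, and the whisper.load_model line is a hypothetical usage, not Ecoute's exact code:

```python
import torch

# CUDA is available only on NVIDIA GPUs with a CUDA build of PyTorch;
# everything else (AMD, Intel, Apple) falls back to the much slower CPU path.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"[INFO] Whisper using GPU: {torch.cuda.is_available()}")

# A Whisper model could then be loaded onto that device, e.g.:
# model = whisper.load_model("tiny", device=device)  # hypothetical usage
```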

oldsongsz commented 1 year ago

Agree.

torch.cuda.is_available() is only true on systems with Nvidia CUDA; AMD, Intel, and other embedded GPUs do not support it by default.

Would you consider integrating the online Whisper API so the tool doesn't rely on local resources? Maybe consider LangChain?

SevaSk commented 1 year ago

It would be nice to have an option to use an API for Whisper transcriptions. It should be fairly straightforward since we can use this: https://platform.openai.com/docs/guides/speech-to-text/speech-to-text-beta. I will create a separate issue for this.
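A hosted-API call along those lines could look roughly like the sketch below. This is a hypothetical helper, not Ecoute's code: it assumes the openai Python package (v1+ client style) and an OPENAI_API_KEY environment variable, and the import is kept inside the function so the sketch loads even without the package installed:

```python
def transcribe_via_api(audio_path: str) -> str:
    """Send an audio file to OpenAI's hosted Whisper endpoint.

    Hypothetical helper; assumes the `openai` package (>=1.0) and an
    OPENAI_API_KEY environment variable are available.
    """
    from openai import OpenAI  # imported lazily so the sketch loads without the package

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(audio_path, "rb") as f:
        resp = client.audio.transcriptions.create(model="whisper-1", file=f)
    return resp.text
```

Offloading the model this way trades a network round trip (and an API key) for not needing any local GPU at all, which is exactly what helps on Intel Macs and ARM SBCs.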

Klaudioz commented 1 year ago

I've been using whisper with a M2 with good results using this: https://github.com/openai/whisper/pull/382
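For readers on Apple Silicon, the PR linked above concerns PyTorch's Metal (MPS) backend. A hedged sketch of probing for it, assuming a reasonably recent PyTorch build (Whisper's operator support on MPS has historically been incomplete, so treat this as an experiment, not a guarantee):

```python
import torch

# Probe for Apple's Metal (MPS) backend, which newer PyTorch builds
# expose on M1/M2 Macs; fall back to CPU when it is unavailable.
mps_ok = getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available()
device = "mps" if mps_ok else "cpu"
print(f"[INFO] selected device: {device}")
```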

SevaSk commented 1 year ago

You can check out this branch https://github.com/SevaSk/ecoute/tree/29-add-option-to-use-speech-to-text-api-rather-than-transcribing-locally

Use the command:

python main.py --api

It's honestly way faster and better than the local model.

Edit: merged that branch into main now.
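For anyone curious how a switch like --api is typically wired up, here is a minimal argparse sketch. It is an illustration, not the project's actual main.py; the flag simply toggles hosted-API vs. local transcription:

```python
import argparse

# Minimal sketch of parsing a boolean --api switch.
parser = argparse.ArgumentParser(prog="main.py")
parser.add_argument("--api", action="store_true",
                    help="use the speech-to-text API instead of a local model")

# Simulate `python main.py --api` by passing the argv explicitly.
args = parser.parse_args(["--api"])
print(args.api)  # True
```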

oldsongsz commented 1 year ago

Thanks for sharing and for the work; on my way to try it.