yehiaabdelm / transcription-app

Frontend for realtime transcription using WhisperLive

Is this still working with the latest version of WhisperLive? #1

Open Momaiz opened 2 months ago

Momaiz commented 2 months ago

Hello Yehia,

I noticed your comment on GitHub regarding the issue with the new version of WhisperLive. However, when I attempted to use the latest version, I encountered some problems. Could you please take a look at it and perhaps we can get in touch to discuss further?

Best regards,

yehiaabdelm commented 2 months ago

Hi Momaiz,

I just tried running both (my frontend and WhisperLive backend) and they work as expected. Here were my steps for reference.

git clone git@github.com:collabora/WhisperLive.git
cd WhisperLive
# In their server requirements I bumped the pin to onnxruntime==1.17.0, since the existing one was giving me issues (just change that line)
pip install -r requirements/server.txt

Make sure you have git lfs for the following step to work: https://stackoverflow.com/questions/67595500/how-to-download-a-model-from-huggingface

Then I downloaded faster-whisper tiny.en inside the WhisperLive directory. I also changed the model parameter on line 684 of whisper_live/server.py to tiny.en.

git clone https://huggingface.co/Systran/faster-whisper-tiny.en

Then I ran their backend

python3 run_server.py --port 9090 --backend faster_whisper --faster_whisper_custom_model_path ./faster-whisper-tiny.en

And for my frontend

git clone git@github.com:yehiaabdelm/transcription-app.git
cd transcription-app
npm i

Then create a .env for the frontend with

PUBLIC_WEBSOCKET_URL="ws://localhost:9090"

And run

npm run dev

And it should work as expected.

FYI, here's how the payload looks from their backend.

{
  "uid": "756c06fe-e96e-44d1-bf4a-74a1a569908a",
  "segments": [
    {
      "start": "0.000",
      "end": "3.000",
      "text": " Hello."
    },
    {
      "start": "3.000",
      "end": "6.000",
      "text": " Hello."
    },
    {
      "start": "6.000",
      "end": "8.000",
      "text": " Hello"
    }
  ]
}
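For anyone consuming this payload on the frontend, here is a minimal sketch of typing and reading it. The interface names and the `transcriptFromMessage` helper are hypothetical (they are not taken from the transcription-app code), and the field types are inferred only from the sample above, so other message shapes from the backend may need extra handling.

```typescript
// Shape of one transcript segment, inferred from the sample payload above.
// Note that start/end arrive as strings (e.g. "0.000"), not numbers.
interface Segment {
  start: string;
  end: string;
  text: string;
}

// Hypothetical name for the whole message; only uid and segments are assumed.
interface TranscriptionMessage {
  uid: string;
  segments: Segment[];
}

// Hypothetical helper: parse a raw WebSocket text frame and join the segment text.
function transcriptFromMessage(raw: string): string {
  const message = JSON.parse(raw) as Partial<TranscriptionMessage>;
  if (!Array.isArray(message.segments)) {
    return ""; // assumption: messages without segments carry no transcript
  }
  // Each text value in the sample has a leading space, so trim before joining.
  return message.segments.map((segment) => segment.text.trim()).join(" ");
}

const sample = JSON.stringify({
  uid: "756c06fe-e96e-44d1-bf4a-74a1a569908a",
  segments: [
    { start: "0.000", end: "3.000", text: " Hello." },
    { start: "3.000", end: "6.000", text: " Hello." },
    { start: "6.000", end: "8.000", text: " Hello" },
  ],
});

console.log(transcriptFromMessage(sample)); // prints "Hello. Hello. Hello"
```

In a browser this would typically be called from a WebSocket `onmessage` handler with `event.data`, assuming the backend sends JSON text frames as in the sample.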

Hope this helps!

Momaiz commented 2 months ago

Hello, Yehiaa! Thank you for your prompt response. I have followed the steps you provided on my live server at "192.248.184.175:9090." However, when I checked the "transcription-app" console, I encountered an error message stating, "Firefox can't establish a connection to the server at wss://192.248.184.175:9090/."

My server should be working; here's a screenshot from the server:
Screenshot from 2024-05-18 20-43-13

Upon reviewing your response, I decided to try installing it on my local PC. Unfortunately, I encountered a similar issue, where Firefox couldn't establish a connection to the server at wss://localhost:9070/.

Screenshot from 2024-05-18 20-44-08

yehiaabdelm commented 2 months ago

Hey, no problem! Can you try without websocket secure... so instead of wss://192.248.184.175:9090/ use ws://192.248.184.175:9090/

Momaiz commented 2 months ago

Thank you, Yehiaa, for your assistance. It appears that the connection has been established, but my server is sending a "101 Switching Protocols" response. Have you encountered this issue before?

yehiaabdelm commented 2 months ago

I think that's just the connection being upgraded from http to ws (not an issue/error). Try clicking "Start Transcribing" and start speaking into your microphone. Then go to the network tab -> WS and look for any messages between the backend and frontend.

Momaiz commented 1 month ago

Thank you for the answer, Yehia. I tried it on localhost and it worked fine, but it is still not working on the website.

I tried connecting to the IP directly, but the browser requires SSL before it will allow media access. So I tried putting nginx in front and connecting through it, but I got this message: Uncaught (in promise) DOMException: The operation is insecure. I then tried to enable wss instead of ws, but that is not working either. You can check here: IP 45.76.134.101, domain https://web.markoum.com/. Whisper runs on port 9090: http://web.markoum.com:9090/

Best regards

yehiaabdelm commented 1 month ago

Yeah, so I didn't know you intended to deploy it. I think in that case you need to configure it so that the transcription server is using websockets secure, see this since you are using nginx: https://stackoverflow.com/questions/12102110/nginx-to-reverse-proxy-websockets-and-enable-ssl-wss

I think the error is because you are on an https website and trying to talk to a server that is not secure.
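To illustrate the point about secure and insecure origins: browsers refuse to open a ws:// connection from a page loaded over https://, so the WebSocket scheme has to match the page. Here is a rough sketch of picking the scheme on the frontend. The `webSocketUrl` helper is hypothetical and not part of the transcription-app code; it also assumes something (for example an nginx reverse proxy with a certificate) is already terminating TLS in front of the transcription server when wss is used.

```typescript
// Hypothetical helper: choose ws:// or wss:// to match how the page was loaded.
// pageProtocol is expected in the form the browser reports it, e.g. "https:".
function webSocketUrl(pageProtocol: string, host: string, port?: number): string {
  // An https page may only talk to a secure WebSocket endpoint.
  const scheme = pageProtocol === "https:" ? "wss" : "ws";
  const authority = port === undefined ? host : `${host}:${port}`;
  return `${scheme}://${authority}`;
}

// Local development over plain http, talking to the backend port directly.
console.log(webSocketUrl("http:", "localhost", 9090)); // prints "ws://localhost:9090"

// Deployed over https; assumes a TLS-terminating proxy on the default port.
console.log(webSocketUrl("https:", "example.com")); // prints "wss://example.com"
```

In the browser the first two arguments would come from `window.location.protocol` and `window.location.hostname`. The backend itself can keep listening on plain ws behind the proxy.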

alessandrv commented 3 weeks ago

Do I have to use wss if I'm hosting the website and the Whisper server on the same machine and the website uses https? I can't connect with wss, plain ws won't communicate from the https page, and without https the browser won't record audio. I can't find a way to host it on the internet so that it is accessible remotely.