A streaming speech-to-text demo feature that takes input from the user's microphone, sends it to a Whisper wait-k model, and displays the predicted text in the terminal.
Related issue: #54
How to start STT streaming
1. Build and run the Docker container
First change into the directory containing the Dockerfile:
cd examples/speech_to_text
Then, build the Docker image with:
docker build -t simuleval-speech-to-text:1.0 .
Next, run the remote evaluation server using the Docker image:
docker run -p 8888:8888 simuleval-speech-to-text:1.0
This binds port 8888 of the container (server) to port 8888 on the local machine (client).
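To confirm the port mapping from the client side before continuing, a short check like the one below may help. It is only a sketch and not part of this PR: the helper name `port_is_open` is hypothetical, and a successful connection shows only that something accepts TCP connections on port 8888, not that the evaluation server has finished loading the model.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8888, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8888 is the port published by the `docker run -p 8888:8888` command above.
    if port_is_open():
        print("Port 8888 accepts connections.")
    else:
        print("Nothing is accepting connections on port 8888.")
```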
OR
1. Start a standalone Whisper agent for remote translation:
2. Enter demo mode by providing the desired segment size (usually 500 ms):
3. Speak into the microphone and watch the live transcription!
4. Press Ctrl+C (^C) in the terminal to exit the program.
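To give a feel for what the segment size in step 2 means, here is a minimal sketch of cutting an audio stream into fixed-length segments. It is an illustration, not the code in this PR: the function names are hypothetical, and the 16 kHz mono sample rate is an assumption based on the input format Whisper models expect.

```python
from typing import Iterator, List

SAMPLE_RATE_HZ = 16_000  # assumption: 16 kHz mono audio
SEGMENT_MS = 500         # the "usual" segment size mentioned in step 2


def samples_per_segment(segment_ms: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of audio samples covered by one segment of `segment_ms` milliseconds."""
    return sample_rate_hz * segment_ms // 1000


def iter_segments(
    samples: List[float],
    segment_ms: int = SEGMENT_MS,
    sample_rate_hz: int = SAMPLE_RATE_HZ,
) -> Iterator[List[float]]:
    """Yield consecutive segments; the last one may be shorter than the rest."""
    step = samples_per_segment(segment_ms, sample_rate_hz)
    if step <= 0:
        raise ValueError("segment size must cover at least one sample")
    for start in range(0, len(samples), step):
        yield samples[start:start + step]


if __name__ == "__main__":
    audio = [0.0] * 20_000  # 1.25 s of silence at 16 kHz
    print([len(segment) for segment in iter_segments(audio)])
```

Under these assumptions each 500 ms segment holds 8,000 samples. A smaller segment size lets text appear sooner but gives the model less audio per step.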
Type of change
How Has This Been Tested?
Tested locally.
Test Configuration: