shivadharmi / stt-grpc-service

MIT License

Speech-to-Text Service

Overview

Speech-to-Text service using gRPC and the Whisper model from Hugging Face's Transformers library. The service allows users to send audio data and receive transcribed text in response.
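
As a rough illustration of the transcription step, the sketch below wraps the Transformers automatic-speech-recognition pipeline. The checkpoint name `openai/whisper-tiny` and the helper names `load_whisper` and `transcribe` are assumptions made for this example; the service's actual model and code layout may differ.

```python
from typing import Any, Callable, Dict

# Type of any callable that maps audio file bytes to a result dict with a "text" key.
AsrCallable = Callable[[bytes], Dict[str, Any]]


def load_whisper(model_name: str = "openai/whisper-tiny") -> AsrCallable:
    """Build a Whisper ASR pipeline (the model name is an assumed example)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_name)


def transcribe(audio_bytes: bytes, asr: AsrCallable) -> str:
    """Run the ASR callable on the bytes of an audio file and return the text."""
    if not audio_bytes:
        raise ValueError("audio_bytes is empty")
    result = asr(audio_bytes)
    return result["text"].strip()
```

A caller would typically build the pipeline once with `load_whisper()` and reuse it for every request, since loading the model is far slower than a single transcription. When the pipeline is given raw file bytes it decodes them with ffmpeg, so ffmpeg needs to be available on the host in that case.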

Features

Requirements

Setup Instructions

  1. Clone the Repository

    git clone https://github.com/shivadharmi/stt-grpc-service.git
    cd stt-grpc-service
  2. Install Dependencies. Install the runtime and development packages:

    pip install -r requirements.txt
    pip install -r requirements-dev.txt
  3. Install Pre-commit. To install pre-commit, run:

    pip install pre-commit
  4. Environment Variables. Create a .env file in the root directory and set the following variables for a local environment:

    ENVIRONMENT=local
    PORT=50051

    For non-local environments (development or production), set the following variables:

    ENVIRONMENT=development  # or production
    SSL_PRIVATE_KEY_PATH=<path-to-private-key>
    SSL_CERTIFICATE_CHAIN_PATH=<path-to-certificate-chain>
  5. Run the gRPC Server. Start the server by running:

    python main.py
  6. Run the Client. In a separate terminal, run the client to send audio data:

    python src/client/client.py
  7. Run Pre-commit Hooks. To ensure code quality, run the pre-commit hooks against all files:

    pre-commit run --all-files
  8. Code Formatting and Linting. Use ruff for linting and black for code formatting:

    ruff check .
    black .
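
The environment variables from step 4 could be turned into a server configuration roughly as follows. This is a minimal sketch, not the repository's actual code: the `Settings` class, `load_settings`, and `bind_port` are hypothetical names, the default port of 50051 for non-local environments is an assumption, and the variables are assumed to be present in the process environment already (for example, loaded from the .env file by whatever mechanism the project uses).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    environment: str
    port: int
    ssl_private_key_path: Optional[str] = None
    ssl_certificate_chain_path: Optional[str] = None

    @property
    def use_tls(self) -> bool:
        # Only the local environment runs without TLS in this sketch.
        return self.environment != "local"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read and validate the variables described in step 4."""
    env = os.environ if env is None else env
    environment = env.get("ENVIRONMENT", "local")
    if environment not in {"local", "development", "production"}:
        raise ValueError(f"Unsupported ENVIRONMENT: {environment!r}")
    port = int(env.get("PORT", "50051"))  # assumed default when PORT is unset
    if environment == "local":
        return Settings(environment, port)
    key_path = env.get("SSL_PRIVATE_KEY_PATH")
    chain_path = env.get("SSL_CERTIFICATE_CHAIN_PATH")
    if not key_path or not chain_path:
        raise ValueError(
            "SSL_PRIVATE_KEY_PATH and SSL_CERTIFICATE_CHAIN_PATH are required "
            f"when ENVIRONMENT={environment}"
        )
    return Settings(environment, port, key_path, chain_path)


def bind_port(server, settings: Settings) -> int:
    """Attach an insecure or TLS port to a grpc.Server and return the bound port."""
    address = f"[::]:{settings.port}"
    if not settings.use_tls:
        return server.add_insecure_port(address)
    import grpc  # imported lazily; only needed for the TLS branch

    with open(settings.ssl_private_key_path, "rb") as key_file:
        private_key = key_file.read()
    with open(settings.ssl_certificate_chain_path, "rb") as chain_file:
        certificate_chain = chain_file.read()
    credentials = grpc.ssl_server_credentials([(private_key, certificate_chain)])
    return server.add_secure_port(address, credentials)
```

Validating the variables at startup means a missing certificate path fails immediately with a clear message rather than when the first client connects.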

Testing

To run the unit tests, execute the following command:

    python -m unittest discover -s src/tests
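
With its default pattern, unittest discovery collects files named `test*.py` under the start directory and runs every `unittest.TestCase` subclass in them. The sketch below shows the shape such a file might take; `normalize_transcript` is a hypothetical helper defined inline for illustration and is not part of this repository.

```python
import unittest


def normalize_transcript(text: str) -> str:
    """Hypothetical helper: collapse runs of whitespace in a transcript."""
    return " ".join(text.split())


class NormalizeTranscriptTest(unittest.TestCase):
    def test_collapses_whitespace(self):
        self.assertEqual(normalize_transcript("  hello   world \n"), "hello world")

    def test_empty_input(self):
        self.assertEqual(normalize_transcript(""), "")
```

Saved as, say, `src/tests/test_normalize.py` (an assumed file name), this would be picked up by the discovery command above without further configuration.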

Audio Data

Place your test audio files in the data directory. The client will look for test_audio.wav by default.
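
If no recording is at hand, a placeholder file can be generated with the standard library. The sketch below writes a one-second 440 Hz tone as 16-bit mono PCM at 16 kHz, the sample rate Whisper models operate on. The function name `write_test_tone` is made up for this example, and a pure tone will not produce a meaningful transcript; it is only useful for checking that the client and server can exchange audio end to end.

```python
import math
import struct
import wave
from pathlib import Path


def write_test_tone(path, seconds=1.0, sample_rate=16000, frequency=440.0):
    """Write a mono 16-bit PCM sine tone to `path` and return the path."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    n_frames = int(seconds * sample_rate)
    amplitude = 0.3 * 32767  # stay well below full scale to avoid clipping
    frames = b"".join(
        struct.pack(
            "<h",
            int(amplitude * math.sin(2 * math.pi * frequency * i / sample_rate)),
        )
        for i in range(n_frames)
    )
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return path
```

Calling `write_test_tone("data/test_audio.wav")` from the repository root would create the file under the name the client looks for by default, assuming the data directory sits at the root.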

Contributing

Feel free to submit issues or pull requests. Contributions are welcome!

License

This project is licensed under the MIT License.