speechmatics / speechmatics-python

Python library and CLI for Speechmatics
https://speechmatics.github.io/speechmatics-python/
MIT License
58 stars 14 forks source link
cli speech-recognition speech-to-text transcription

speechmatics-python   Tests License PythonSupport

Python client library and CLI for Speechmatics Realtime and Batch ASR v2 APIs.

Getting started

To install from PyPI:

pip install speechmatics-python

To install from source:

git clone https://github.com/speechmatics/speechmatics-python
cd speechmatics-python && python setup.py install

Windows users may need to run the install command with an extra flag:

python setup.py install --user

Docs

The speechmatics python SDK and CLI is documented at https://speechmatics.github.io/speechmatics-python/.

The Speechmatics API and product documentation can be found at https://docs.speechmatics.com.

Real-Time Client Usage

from speechmatics.models import *
import speechmatics

# Change to your own file
PATH_TO_FILE = "tests/data/ch.wav"
LANGUAGE = "en"

# Generate an API key at https://portal.speechmatics.com/manage-access/
API_KEY = ""

# Create a transcription client from config defaults
sm_client = speechmatics.client.WebsocketClient(API_KEY)

sm_client.add_event_handler(
    event_name=ServerMessageType.AddPartialTranscript,
    event_handler=print,
)

sm_client.add_event_handler(
    event_name=ServerMessageType.AddTranscript,
    event_handler=print,
)

conf = TranscriptionConfig(
    language=LANGUAGE, enable_partials=True, max_delay=5, enable_entities=True,
)

print("Starting transcription (type Ctrl-C to stop):")
with open(PATH_TO_FILE, "rb") as fd:
    try:
        sm_client.run_synchronously(fd, conf)
    except KeyboardInterrupt:
        print("\nTranscription stopped.")

Batch Client Usage

from speechmatics.models import ConnectionSettings, BatchTranscriptionConfig
from speechmatics.batch_client import BatchClient
from httpx import HTTPStatusError

API_KEY = "YOUR_API_KEY"
PATH_TO_FILE = "example.wav"
LANGUAGE = "en"

# Open the client using a context manager
with BatchClient(API_KEY) as client:
    try:
        job_id = client.submit_job(PATH_TO_FILE, BatchTranscriptionConfig(LANGUAGE))
        print(f'job {job_id} submitted successfully, waiting for transcript')

        # Note that in production, you should set up notifications instead of polling.
        # Notifications are described here: https://docs.speechmatics.com/features-other/notifications
        transcript = client.wait_for_completion(job_id, transcription_format='txt')
        # To see the full output, try setting transcription_format='json-v2'.
        print(transcript)
    except HTTPStatusError as e:
        if e.response.status_code == 401:
            print('Invalid API key - Check your API_KEY at the top of the code!')
        elif e.response.status_code == 400:
            print(e.response.json()['detail'])
        else:
            raise e

Example command-line usage

A complete list of commands and flags can be found in the SDK docs at https://speechmatics.github.io/speechmatics-python/.

Configuring Auth Tokens

SM Metrics

This package includes tooling for benchmarking transcription and diarization accuracy.

For more information, see the asr_metrics/README.md

Testing

To install development dependencies and run tests

pip install -r requirements-dev.txt
make test

Support

If you have any issues with this library or encounter any bugs then please get in touch with us at support@speechmatics.com.


License: MIT