awslabs / amazon-transcribe-streaming-sdk

The Amazon Transcribe Streaming SDK is an async Python SDK for converting audio into text via Amazon Transcribe.
Apache License 2.0
140 stars 38 forks

Add headers for multiple language identification #99

Open mbatchkarov opened 5 months ago

mbatchkarov commented 5 months ago

Issue #, if available: N/A

Description of changes: Hi, Transcribe SDE here. We recently launched a new feature called multiple language identification. We've been asked to contribute to this package to enable the feature so customers can use it.

Notes: language_code is currently a required positional parameter. Once language ID is added, language_code should become optional. That means we'd have to make it the third param and give it a default value, but this would break existing clients that pass their arguments positionally. Therefore I'm leaving it as a positional arg and requiring that it be set to None when language ID is enabled. I'm not too happy with this option either; happy to discuss. Maybe something like this would be more ergonomic:

        if identify_language or identify_multiple_languages:
            warnings.warn("Setting language_code to None because language ID is enabled")
            language_code = None
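
To make the idea concrete, here is a minimal standalone sketch of that normalization. The helper name `resolve_language_code` and the `identify_multiple_languages` flag are hypothetical names chosen for illustration, not the SDK's actual API, and raising `ValueError` when nothing is set is an assumption about the desired behavior.

```python
import warnings
from typing import Optional


def resolve_language_code(
    language_code: Optional[str],
    identify_language: bool = False,
    identify_multiple_languages: bool = False,  # hypothetical flag name
) -> Optional[str]:
    """Return the language code to send, or None when language ID is enabled."""
    language_id_enabled = identify_language or identify_multiple_languages
    if language_id_enabled:
        # Only warn when the caller actually passed a code that gets dropped.
        if language_code is not None:
            warnings.warn(
                "Setting language_code to None because language ID is enabled"
            )
        return None
    if language_code is None:
        # Assumption: without language ID, a fixed language code is mandatory.
        raise ValueError("language_code is required when language ID is disabled")
    return language_code
```

With this shape, existing callers that pass `language_code` positionally keep working, and callers who enable language ID get a warning instead of a hard failure if they also pass a code.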

Testing: I extended and ran the integration tests locally from my own AWS account.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

mbatchkarov commented 5 months ago

Update: just saw #89. It looks like plain language ID is also not supported, so I will update the PR to include that too.

praveenXira commented 4 months ago

When will this PR get merged? I really need the multiple language identification feature.

GameSetAndMatch commented 3 months ago

Definitely a feature I would use in the short term if it were reviewed and merged by the AWS team. Good work, @mbatchkarov!