Open ernestodossantos opened 3 years ago
@ernestodossantos, could you provide more detail about the use case and which AWS services the SDK would interact with? Thank you.
Hello @joshongithub,
We need to perform transcriptions in real time during a phone call. Our automated system processes calls, interacting with the caller through speech recognition (Amazon Transcribe) and text to speech (Amazon Polly). However, currently we can only do speech recognition by uploading the recorded speech to S3 and then running the recognition from there. Doing this in real time would be a huge improvement for us.
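For context, the upload-then-transcribe workaround described above could look roughly like the sketch below. It is written in Python with boto3 purely for illustration, not in .NET. The helper names, the `call-` job-name prefix, the `en-US` default, and taking the media format from the file extension are all assumptions, not details from this thread.

```python
import uuid


def build_transcription_request(bucket: str, key: str, language_code: str = "en-US") -> dict:
    """Build keyword arguments for a batch StartTranscriptionJob call.

    Hypothetical helper: the job-name scheme and default language are assumptions.
    """
    # Assume the object key ends in a supported extension, e.g. "calls/123.wav" -> "wav".
    media_format = key.rsplit(".", 1)[-1].lower()
    return {
        "TranscriptionJobName": f"call-{uuid.uuid4()}",
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "MediaFormat": media_format,
        "LanguageCode": language_code,
    }


def transcribe_recording(local_path: str, bucket: str, key: str) -> str:
    """Upload a finished recording to S3, then start a batch job; returns the job name."""
    import boto3  # imported lazily so the request builder works without the SDK installed

    # Step 1: the recording must be complete before it can be uploaded.
    boto3.client("s3").upload_file(local_path, bucket, key)

    # Step 2: start the asynchronous job; results only arrive after it finishes.
    request = build_transcription_request(bucket, key)
    boto3.client("transcribe").start_transcription_job(**request)
    return request["TranscriptionJobName"]
```

Both steps happen only after the caller has finished speaking, which is where the delay comes from. A streaming API would instead send audio chunks and receive partial results while the call is still in progress.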
@ernestodossantos - thank you for the detail. Do you know if any of the other AWS SDKs provide this support? I'm looking for an example of how this support has been implemented elsewhere. I see we have API support for both Amazon Polly and Amazon Transcribe, but it sounds like implementing this feature would require a new library that combines the two.
Amazon Polly is fine, it can be used separately. The problem is with speech recognition, because we need to stream the audio in real time, and get the recognition back in real time. The service we need to use is the following: https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html
But it is not implemented in all client SDKs. According to this page, this is only available for C++, Java and Ruby.
Here's a usage example for Java: https://github.com/aws-samples/aws-transcribe-streaming-example-java
Our applications are built with .NET Core (actually .NET 5.0), so we would need this in the .NET SDK to be able to use it.
Thank you for the detail, I understand the issue now and I'll forward the details of your request to the service team.
Thank you, much appreciated!
This feature would be very helpful for me as well. The feature was added to the Go SDK in 2020 (https://github.com/aws/aws-sdk-go/pull/3048).
The .NET SDK doesn't support streaming transcription. This is a very important feature for us. Is this something you're considering?