awslabs / amazon-transcribe-streaming-sdk

The Amazon Transcribe Streaming SDK is an async Python SDK for converting audio into text via Amazon Transcribe.
Apache License 2.0
140 stars, 38 forks

Can I use this SDK for multi-speaker (2) audio? #63

Open aj7tesh opened 2 years ago

aj7tesh commented 2 years ago

I have been trying to use this SDK to generate a multi-speaker transcript from a WAV file. However, the results are very poor: not even a single word is identified correctly. Am I missing something, or does this SDK not support two speakers?

aj7tesh commented 2 years ago

Is there a way to generate a transcript when there are two speakers speaking in two languages, where speaker 1 speaks in both languages, i.e. code-switching ASR?

DaaS-20xx commented 2 years ago

Same issue here. Passing only the parameter "show_speaker_label=True" to client.start_stream_transcription does not work. Output is generated: the "transcript" field of the "alternatives[0]" object is returned, as well as the "speaker" field from "alternatives[0].items[0].speaker". But, as aj7tesh said, there is NO match at all with the speech. It reads like the transcript of a completely different, random audio. Is there any other parameter configuration that needs to be passed in?
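
For reference, here is a minimal sketch of how the fields mentioned above ("alternatives[0].items[...].speaker" and each item's content) could be grouped into one string per speaker. It does not call the SDK: the `SimpleNamespace` objects are hypothetical stand-ins that only mimic the attribute names described in this thread, and `words_by_speaker` is an illustrative helper, not part of the library. In a real handler the results would presumably come from the transcript event, which is an assumption here.

```python
from collections import defaultdict
from types import SimpleNamespace


def words_by_speaker(results):
    """Join the recognized words of each speaker label, using final results only."""
    grouped = defaultdict(list)
    for result in results:
        # Partial results are revised later, so only final ones are counted.
        if result.is_partial or not result.alternatives:
            continue
        for item in result.alternatives[0].items:
            speaker = getattr(item, "speaker", None)
            if speaker is None:
                # Items without a speaker label are skipped in this sketch.
                continue
            grouped[speaker].append(item.content)
    return {speaker: " ".join(words) for speaker, words in grouped.items()}


# Hypothetical stand-in data shaped like the fields discussed in this thread.
def _item(content, speaker):
    return SimpleNamespace(content=content, speaker=speaker)


sample_results = [
    SimpleNamespace(
        is_partial=False,
        alternatives=[
            SimpleNamespace(
                transcript="hello there hi",
                items=[_item("hello", "0"), _item("there", "0"), _item("hi", "1")],
            )
        ],
    ),
    # A partial result, which the helper ignores.
    SimpleNamespace(
        is_partial=True,
        alternatives=[SimpleNamespace(transcript="how", items=[_item("how", "1")])],
    ),
]

print(words_by_speaker(sample_results))  # {'0': 'hello there', '1': 'hi'}
```

Grouping per speaker this way makes it easier to compare the labelled output against the audio by ear. It only reorganizes what the service returns, so it cannot explain or fix a transcript that does not match the speech.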