awslabs / amazon-transcribe-streaming-sdk

The Amazon Transcribe Streaming SDK is an async Python SDK for converting audio into text via Amazon Transcribe.

Apache License 2.0

142 stars, 37 forks

aws transcribe output #41

Closed haldernayan closed 3 years ago

haldernayan commented 3 years ago

AWS Transcribe is working fine; however, the output looks like this: This. This is This is a This is a te This is a test. This is a test.

I only want the output as This is a test.

I understand the repeated lines appear because streaming is very fast. How can I stream so that each word is output only once?

What changes need to be made in simple_file.py or simple_mic.py?

nateprewitt commented 3 years ago

Hi @haldernayan, the example provided in the Amazon Transcribe Streaming SDK documentation is intended to mirror the official Transcribe documentation. You'll find that each Result object we return here has an is_partial flag on it. This denotes whether the transcription result is still being revised (True) or is complete (False). If you only output results where result.is_partial is False, you should get just the complete lines you're referencing.

Manjunath-PM commented 3 years ago

Hello @nateprewitt, can you explain more clearly how this can be done? I also need only the final output, not all of the intermediate text.

nateprewitt commented 2 years ago

@Manjunath-PM, you would use the handler from the example linked above with the suggested conditional.

e.g.

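from amazon_transcribe.handlers import TranscriptResultStreamHandler
from amazon_transcribe.model import TranscriptEvent


class MyEventHandler(TranscriptResultStreamHandler):
    async def handle_transcript_event(self, transcript_event: TranscriptEvent):
        results = transcript_event.transcript.results
        for result in results:
            # Skip partial (still-changing) results; print only finalized ones.
            if result.is_partial is False:
                for alt in result.alternatives:
                    print(alt.transcript)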
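
To see the effect of that conditional without an AWS connection, here is a minimal offline sketch. FakeResult and FakeAlternative are hypothetical stand-ins that only mimic the is_partial, alternatives, and transcript attributes used in the handler above; they are not SDK classes. The helper final_transcripts is also an illustrative name, and it keeps only the first alternative of each final result, which is a simplification of the loop above.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


# Hypothetical stand-ins that only mimic the attributes used in this thread
# (is_partial, alternatives, transcript); they are not the SDK's classes.
@dataclass
class FakeAlternative:
    transcript: str


@dataclass
class FakeResult:
    is_partial: bool
    alternatives: List[FakeAlternative] = field(default_factory=list)


def final_transcripts(results: Iterable[FakeResult]) -> List[str]:
    """Return the first alternative of every result that is no longer partial."""
    finals = []
    for result in results:
        # Skip intermediate hypotheses and results with no alternatives.
        if result.is_partial is False and result.alternatives:
            finals.append(result.alternatives[0].transcript)
    return finals


# Simulated stream: four partial hypotheses followed by one final result.
simulated = [
    FakeResult(True, [FakeAlternative("This.")]),
    FakeResult(True, [FakeAlternative("This is")]),
    FakeResult(True, [FakeAlternative("This is a")]),
    FakeResult(True, [FakeAlternative("This is a te")]),
    FakeResult(False, [FakeAlternative("This is a test.")]),
]

print(" ".join(final_transcripts(simulated)))
```

With the simulated stream, only the last entry passes the filter, so the script prints the sentence once. In a real handler, the same idea would apply if you appended each final transcript to a list on the handler instead of printing it, then joined the list once the stream ends.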