deepgram / deepgram-js-sdk

Official JavaScript SDK for Deepgram's automated speech recognition APIs.
https://developers.deepgram.com
MIT License

Transcription finalized, but speech is never finalized even when explicitly requested #343

Closed K-Mistele closed 3 weeks ago

K-Mistele commented 3 weeks ago

What is the current behavior?

I'm using the Node.js ListenLiveClient for WebSocket-based streaming transcription. Sometimes a transcription is finalized (is_final: true), but I never receive a subsequent transcription update with speech_final: true.

What's happening that seems wrong?

At some point after a transcription is finalized, the speech should be finalized as well. Requesting finalization with listenLiveClient.finalize() does not result in the speech being finalized either.

Steps to reproduce

To make it faster to diagnose the root problem, tell us how we can reproduce the bug.

This is my ListenLiveClient configuration:

const connection = this.client.listen.live({
    encoding: 'mulaw',
    sample_rate: 8000,
    model: 'nova-2',
    punctuate: true,
    interim_results: true,
    endpointing: 300,
    utterance_end_ms: 1000,
    language: 'en-US',
    smart_format: true
})

I am sending audio from Twilio (G.711 μ-law, 8 kHz, base64-encoded) to the listen live client and receiving transcription events.
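A minimal sketch of that forwarding step, assuming the audio arrives as Twilio Media Streams WebSocket messages, where each `media` event carries its base64 audio in `media.payload`. The function name `twilioMediaToAudio` is hypothetical and not part of either SDK; it only decodes the payload into raw bytes.

```javascript
// Hypothetical helper: extract raw mu-law bytes from one Twilio Media Streams message.
// Returns a Buffer for "media" events, or null for anything else
// (e.g. "connected", "start", "stop", or malformed JSON).
function twilioMediaToAudio(rawMessage) {
  let message;
  try {
    message = JSON.parse(rawMessage);
  } catch (err) {
    return null; // not JSON, ignore
  }
  if (message.event !== 'media' || !message.media || typeof message.media.payload !== 'string') {
    return null;
  }
  // The payload is base64 text; the live connection expects raw bytes.
  return Buffer.from(message.media.payload, 'base64');
}

// Usage sketch (assumes `twilioSocket` is the incoming Twilio WebSocket and
// `connection` is the live client created above):
//
// twilioSocket.on('message', (data) => {
//   const audio = twilioMediaToAudio(data.toString());
//   if (audio) connection.send(audio);
// });
```

Decoding before sending matters here because the connection is configured with `encoding: 'mulaw'`, so it should receive the decoded bytes rather than the base64 string.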

Expected behavior

What would you expect to happen when following the steps above?

I expect to receive interim transcriptions, and then a transcription with is_final set to true. If speech_final is also true, I use the result. If speech_final is false, I expect to receive a later transcription event with speech_final set to true.

To work around this, I tried manually requesting finalization with connection.finalize(), but no transcription with speech_final: true is ever received.

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

Other information

Anything else we should know? (e.g. detailed explanation, stack-traces, related issues, suggestions how to fix, links for us to have context, eg. stack overflow, codepen, etc)

lukeocodes commented 3 weeks ago

This is a usage problem and not an SDK issue so I will close this ticket.

In the meantime, please check out this doc: https://developers.deepgram.com/docs/understand-endpointing-interim-results

If you'd like help using our products, you can ask us questions in our developer communities; links to those can be found at https://community.deepgram.com
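For readers who hit the same situation, here is a hedged sketch of one way to avoid waiting indefinitely on speech_final. It assumes that transcript messages expose `is_final`, `speech_final`, and `channel.alternatives[0].transcript`, and that the server emits an UtteranceEnd message when `utterance_end_ms` is set together with `interim_results`. The class `UtteranceAssembler` is hypothetical and not part of the SDK: it buffers every is_final segment and emits the joined utterance on whichever end-of-speech signal arrives first.

```javascript
// Hypothetical helper: assemble complete utterances from live transcript messages.
class UtteranceAssembler {
  constructor(onUtterance) {
    this.onUtterance = onUtterance; // called with each completed utterance string
    this.segments = [];             // finalized segments not yet emitted
  }

  // Call with every transcript message.
  handleTranscript(message) {
    const alternatives = (message.channel && message.channel.alternatives) || [];
    const transcript = alternatives.length > 0 ? alternatives[0].transcript || '' : '';
    // Interim results are superseded later, so only finalized text is kept.
    if (message.is_final && transcript.length > 0) {
      this.segments.push(transcript);
    }
    // Endpointing detected the end of speech: emit what has been collected.
    if (message.speech_final) {
      this.flush();
    }
  }

  // Call when an UtteranceEnd message arrives; acts as the fallback signal
  // when no message with speech_final: true was received.
  handleUtteranceEnd() {
    this.flush();
  }

  flush() {
    if (this.segments.length === 0) return; // nothing pending, e.g. already flushed
    const utterance = this.segments.join(' ');
    this.segments = [];
    this.onUtterance(utterance);
  }
}
```

Wiring this up would mean calling `handleTranscript` from the connection's transcript event handler and `handleUtteranceEnd` from its UtteranceEnd handler; the exact event names depend on the SDK version in use. Because `flush` is a no-op when the buffer is empty, receiving both signals for the same utterance does not emit it twice.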