daily-co / daily-python

Daily Client SDK for Python
BSD 2-Clause "Simplified" License

Feature request: add interim_results and endpoint to start_transcription #11

Closed kylemcdonald closed 7 months ago

kylemcdonald commented 8 months ago

By default, Deepgram transcription waits for 10 ms of silence before it considers a transcription "final".

In my application I'm looking for more like 500ms. If I used the Deepgram API directly I could set this with the "endpointing" parameter:

https://developers.deepgram.com/docs/endpointing
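For illustration, here is a minimal sketch of how those two options would be passed as query parameters when calling Deepgram directly. The endpoint URL and the helper name `build_listen_url` are assumptions for this example (the helper is hypothetical, not part of any SDK), and nothing here opens a connection:

```python
from urllib.parse import urlencode

# Assumed Deepgram live-transcription endpoint; check the Deepgram docs.
DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen"


def build_listen_url(endpointing_ms, interim_results):
    """Hypothetical helper: build a listen URL with endpointing options."""
    params = {
        # Milliseconds of silence before a transcript is finalized.
        "endpointing": endpointing_ms,
        # Sent as a lowercase string because it is a query parameter.
        "interim_results": "true" if interim_results else "false",
    }
    return f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}"


url = build_listen_url(500, True)
print(url)
```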

However the existing SDK does not support this parameter:

{"timestamp":"2023-11-04T13:32:34.271879Z","level":"ERROR","fields":{"message":"startTranscription (request 1) encountered an error: Transcription settings are not valid: JsonApiError { message: \"unknown field `endpointing`, expected one of `language`, `model`, `tier`, `profanity_filter`, `redact`\" }"},"target":"daily_core::native::ffi::call_client"}

I would like to request the addition of the "endpointing" parameter as well as the boolean "interim_results" parameter.

aconchillo commented 8 months ago

Thank you, @kylemcdonald! We've had similar feature requests regarding transcription, so we will start improving it and should have something fairly soon.

TimHeckel commented 8 months ago

@aconchillo - any update on this one?

aconchillo commented 8 months ago

Actually, yes! The next daily-python release will come with transcription improvements, including the ability to pass any Deepgram setting. It is expected to be released in a couple of weeks (next week is a complicated week because of 🦃). Thank you for your patience!

aconchillo commented 7 months ago

Hi! I'm happy to announce that daily-python 0.5.0 is now available. With this new release there are a couple of changes:

For example:

client.start_transcription({
  'extra': {
    'endpointing': 300
  }
})
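
As a rough sketch of the original request (500 ms endpointing plus interim results), the settings could be built like this. The helper `build_transcription_settings` is hypothetical, and placing `interim_results` under `'extra'` is an assumption based on the statement above that any Deepgram setting can be passed:

```python
def build_transcription_settings(endpointing_ms=500, interim_results=True):
    """Hypothetical helper: nest Deepgram options under the 'extra' key."""
    return {
        "extra": {
            # Milliseconds of silence before a transcript is marked final.
            "endpointing": endpointing_ms,
            # Assumption: accepted under 'extra' like other Deepgram settings.
            "interim_results": interim_results,
        }
    }


settings = build_transcription_settings()
print(settings)

# With a CallClient that has already joined a call (not created here):
# client.start_transcription(settings)
```

The call itself is left commented out because it needs a live daily-python client. Only the shape of the settings dictionary is shown.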