deepgram / deepgram-js-sdk

Official JavaScript SDK for Deepgram's automated speech recognition APIs.
https://developers.deepgram.com
MIT License
127 stars, 45 forks

Keep the Connection Alive with Prerecorded API and Readable Stream #229

Closed. 8ta4 closed this issue 5 months ago

8ta4 commented 5 months ago

Proposed changes

I want to make the connection last longer when using the prerecorded API and a readable stream.

Context

I'm building an open-source project called say. It transcribes voice 24/7.

I want to cut down the latency as much as possible, but keep the accuracy of the prerecorded API.

I tried using a readable stream to send audio data with Deepgram's Node SDK. I thought it would keep the connection open as more audio gets recorded. But it doesn't work.

I told my partner about this. I said, "It won't listen for more than a few seconds!" She said, "That's just you." I said, "Huh? Sorry, I wasn't listening."

Other information

You can check out the full discussion here.

Any suggestions are welcome.

lukeocodes commented 5 months ago

The request won't be sent until the whole file has been read into memory. The readable stream source is just a way to load the raw data from a file.
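To illustrate why a readable stream does not keep the request open: if the source is buffered in full before the request is made, nothing can be sent until the stream ends. This is a rough sketch using only Node's built-in `stream` module. `readAll` is a hypothetical helper written for this example, not the SDK's actual implementation.

```javascript
const { Readable } = require("node:stream");

// Hypothetical helper: collect every chunk of a readable stream into one Buffer.
// The returned promise resolves only after the stream has ended.
async function readAll(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Stand-in for recorded audio arriving in pieces.
const audio = Readable.from([Buffer.from([1, 2]), Buffer.from([3, 4, 5])]);

readAll(audio).then((body) => {
  // Only at this point is there a complete body that could be sent.
  console.log(`buffered ${body.length} bytes`);
});
```

With a source that never ends, such as a microphone recording around the clock, the promise would never resolve, so a request built this way would never start.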

You can either use the WebSocket API for live audio, or send us smaller files, one at a time. You can use ffmpeg to split the audio into separate files while keeping valid headers. File headers should be at the top of each file.
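One way to produce the smaller files is ffmpeg's segment muxer, which writes each piece as a standalone file with its own header. The following is a minimal sketch, not a recommended configuration: the file names and the segment length are placeholder assumptions, `segmentArgs` and `splitAudio` are hypothetical helpers, and ffmpeg is assumed to be installed and on the PATH.

```javascript
const { spawn } = require("node:child_process");

// Hypothetical helper: build ffmpeg arguments that split `input` into
// fixed-length segments, each written as a complete file with its own header.
function segmentArgs(input, seconds, outputPattern) {
  return [
    "-i", input,
    "-f", "segment",                  // use ffmpeg's segment muxer
    "-segment_time", String(seconds), // target length of each segment
    "-c", "copy",                     // copy the audio without re-encoding
    outputPattern,                    // e.g. "chunk_%03d.wav"
  ];
}

// Hypothetical helper: run ffmpeg with those arguments (assumes ffmpeg is on PATH).
function splitAudio(input, seconds, outputPattern) {
  return spawn("ffmpeg", segmentArgs(input, seconds, outputPattern), {
    stdio: "inherit",
  });
}
```

For example, `splitAudio("recording.wav", 10, "chunk_%03d.wav")` would write `chunk_000.wav`, `chunk_001.wav`, and so on, and each file could then be sent as its own prerecorded request.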

The time it takes to send a file depends on the file size, the download speed at our service, and the upload speed where your app is hosted. In testing, we have seen cases where latency was an issue because of the upload speed of the development or local environment involved.
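As a back-of-the-envelope check on the upload side, the transfer time is roughly the payload size divided by the upload bandwidth. The sketch below covers only that term. The audio format and bandwidth figures are illustrative assumptions, and `estimateUploadSeconds` is a hypothetical helper.

```javascript
// Rough lower bound on upload time: payload size divided by upload bandwidth.
// Ignores request overhead, network round trips, and transcription time.
function estimateUploadSeconds(fileBytes, uploadBitsPerSecond) {
  return (fileBytes * 8) / uploadBitsPerSecond;
}

// Illustrative numbers: 10 s of 16 kHz, 16-bit mono PCM is 320,000 bytes.
const chunkBytes = 10 * 16000 * 2;

// Assumed upload speed of 1 Mbit/s.
console.log(estimateUploadSeconds(chunkBytes, 1000000));
```

Under these assumptions, smaller chunks shorten the upload portion of the latency proportionally, which is one reason sending shorter files can help on a slow uplink.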