tempo-riz / deepgram_speech_to_text

A Deepgram client for Dart and Flutter, supporting all Speech-to-Text and Text-to-Speech features on every platform.
https://pub.dev/packages/deepgram_speech_to_text
MIT License

Question about connecting to a stream #8

Closed: scottforsyth closed this issue 4 months ago

scottforsyth commented 4 months ago

Hello Thibault,

I successfully got your Deepgram package to send and process a .wav file. It works great. It's very easy and useful.

However, I'm trying to work with a stream but haven't had any luck. I may be missing a simple concept. I'm quite new to Flutter/Dart.

I'm using version 1.0.5 due to dependency requirements on my current project, so I can't upgrade to the latest.

I tried with my own stream, into which I insert audio frames.

Then, my code is pretty simple.

At the class level:

  late Deepgram deepgram;
  late DeepgramLiveTranscriber transcriber;
  StreamController<List<int>> _deepgramAudioStreamController = StreamController<List<int>>();
  Map<String, dynamic> params = {
    'model': 'nova-2-general', // or conversationalai
    'detect_language': false,
    'language': 'en-US',
    'smart_format': true, // note: underscore, not hyphen
    'encoding': 'linear16',
    'filler_words': false,
    'punctuation': true,
    'utterance_end_ms': 1000,
  };

Within init: deepgram = Deepgram(deepgramApiKey, baseQueryParams: params);

Within a different method, I add the audio frames to _deepgramAudioStreamController. I believe that part works, since the same stream feeds a couple of other features that do work.

And, within a _startDeepgramListening() method:

  transcriber = deepgram.createLiveTranscriber(_deepgramAudioStreamController.stream);
  transcriber.start();
  transcriber.jsonStream.listen((json) {
    print(json);
  });

And in a separate method, _processRecordedAudio(), I call transcriber.close();

Should I expect a breakpoint on the print(json) line to be hit? It never triggers, and nothing prints, though no errors are thrown either. Note that sending a .wav file does work.
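As a side note for anyone debugging the same thing, the listen() plumbing itself can be checked in isolation. This is a minimal, self-contained sketch with no Deepgram involved; the JSON string is a made-up stand-in for a transcription result:

```dart
import 'dart:async';

// Stand-alone check of the listen() plumbing, with no Deepgram involved.
// The JSON string is a stand-in for a transcription result; if events
// flow here, a breakpoint in the callback will be hit.
void main() async {
  final controller = StreamController<String>();
  controller.stream.listen((json) => print(json));
  controller.add('{"transcript": "hello"}');
  await controller.close(); // completes once all events are delivered
}
```

If this prints but the Deepgram stream stays silent, the problem is on the audio side (encoding or sample rate), not the listener.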

I also tried a simple option with all the code in the _startDeepgramListening() method:

  Stream<String> jsonStream = deepgram.transcribeFromLiveAudioStream(_deepgramAudioStreamController.stream);
  jsonStream.listen((json) {
    print(json);
  });

However, nothing prints.

Am I missing something dumb? I'm not sure where in the Deepgram docs I can find more info, since all the examples seem to use their official SDKs (JavaScript, C#, Go, etc.).

I can see in the Deepgram dashboard that the request count increases with each attempt, so something is reaching their servers; I just don't know how to wire up the callback.

Thanks! Scott

tempo-riz commented 4 months ago

Hey @scottforsyth, the code samples you showed seem fine, and yes, the breakpoint should trigger.

Are you sure the data you put in your input stream is valid? Does it have the same encoding and sample rate that you specified in the params?

I made a Flutter example that you can start from: Flutter Example.

I’m also doing something similar to what you describe in the package's tests: read an audio file once, then create a stream and push that data into it. Take a look here: Deepgram Test.
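That file-replay pattern can be sketched like this. It is a self-contained illustration, not code from the package; the chunk size and the synthetic buffer are assumptions for the example:

```dart
import 'dart:async';
import 'dart:typed_data';

/// Replays a byte buffer as a stream of fixed-size chunks, mimicking
/// live audio input. In a real app the bytes would come from a file
/// (e.g. File(...).readAsBytes()) or a microphone; the 3200-byte chunk
/// size here is illustrative, not a package requirement.
Stream<List<int>> chunkedAudioStream(Uint8List bytes, {int chunkSize = 3200}) async* {
  for (var i = 0; i < bytes.length; i += chunkSize) {
    final end = (i + chunkSize < bytes.length) ? i + chunkSize : bytes.length;
    yield bytes.sublist(i, end);
    // Optionally: await Future.delayed(...) to pace chunks in real time.
  }
}
```

The resulting stream can then be handed to createLiveTranscriber in place of a StreamController's stream.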

Hope it helps! Don't hesitate to reach out if you're still stuck :)

scottforsyth commented 4 months ago

Hey Thibault,

You're absolutely right. I used your example and it worked right away.

As you suggested, I did have my custom stream wrong: the audio needed to be 16-bit PCM (linear16). I thought it was, but it wasn't. After adding the conversion below, it works beautifully with my custom stream.

  /// Converts a list of 16-bit sample values into raw little-endian
  /// PCM bytes (the linear16 encoding specified in the params).
  Uint8List _convertToPcmS16LE(List<int> frame) {
    ByteData byteData = ByteData(frame.length * 2); // 2 bytes per sample
    for (int i = 0; i < frame.length; i++) {
      byteData.setInt16(i * 2, frame[i], Endian.little);
    }
    return byteData.buffer.asUint8List();
  }
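For anyone curious, here is a quick sanity check of that byte layout: each 16-bit sample becomes two little-endian bytes. The helper is repeated locally (without the underscore) so the snippet runs on its own:

```dart
import 'dart:typed_data';

void main() {
  // Same conversion as above, repeated locally so this snippet is
  // self-contained. Each 16-bit sample becomes two little-endian bytes.
  Uint8List convertToPcmS16LE(List<int> frame) {
    final byteData = ByteData(frame.length * 2);
    for (var i = 0; i < frame.length; i++) {
      byteData.setInt16(i * 2, frame[i], Endian.little);
    }
    return byteData.buffer.asUint8List();
  }

  print(convertToPcmS16LE([0, 1, -1])); // [0, 0, 1, 0, 255, 255]
}
```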

Thanks for your quick reply and pointing me in the right direction. And thanks again for providing the Deepgram package!