deepgram / deepgram-js-sdk

Official JavaScript SDK for Deepgram's automated speech recognition APIs.
https://developers.deepgram.com
MIT License

(Error): ws does not work in the browser. Browser clients must use the native WebSocket object #38

Closed jzombie closed 2 years ago

jzombie commented 2 years ago

What is the current behavior?

I followed the README example for "Transcribe Audio in Real-Time" and received "Uncaught (in promise) Error: ws does not work in the browser. Browser clients must use the native WebSocket object".

What's happening that seems wrong?

Steps to reproduce

Run this in Chrome:

const { Deepgram } = require("@deepgram/sdk");

const deepgram = new Deepgram(DEEPGRAM_API_KEY);

navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
  const mediaRecorder = new MediaRecorder(stream, {
    mimeType: 'audio/webm',
  });
  const deepgramSocket = deepgram.transcription.live({ punctuate: true });

  deepgramSocket.addListener('open', () => {
    mediaRecorder.addEventListener('dataavailable', async (event) => {
      if (event.data.size > 0 && deepgramSocket.readyState == 1) {
        deepgramSocket.send(event.data)
      }
    })
    mediaRecorder.start(1000)
  });

  deepgramSocket.addListener("transcriptReceived", (received) => {
    const transcript = received.channel.alternatives[0].transcript;
    if (transcript && received.is_final) {
      console.log(transcript);
    }
  });
});

To make it faster to diagnose the root problem, tell us how we can reproduce the bug.

Expected behavior

It should connect to the WebSocket and start transcribing.

What would you expect to happen when following the steps above?

Please tell us about your environment

Chrome v99

We want to make sure the problem isn't specific to your operating system or programming language.

Mac Monterey / JavaScript / React / Chrome

Other information

Anything else we should know? (e.g. detailed explanation, stack-traces, related issues, suggestions how to fix, links for us to have context, eg. stack overflow, codepen, etc)

Unhandled Rejection (Error): ws does not work in the browser. Browser clients must use the native WebSocket object
new push../node_modules/ws/browser.js.module.exports
node_modules/ws/browser.js:4
new LiveTranscription
src/transcription/liveTranscription.ts:17
  14 |         d.prototype = b === null ? Object.create(b) : (__.prototype = b.prototype, new __());
  15 |     };
  16 | })();
> 17 | var __importDefault = (this && this.__importDefault) || function (mod) {
  18 |     return (mod && mod.__esModule) ? mod : { "default": mod };
  19 | };
  20 | Object.defineProperty(exports, "__esModule", { value: true });
Transcriber.live
src/transcription/index.ts:35
  32 |             }
  33 |             op = body.call(thisArg, _);
  34 |         } catch (e) { op = [6, e]; y = 0; } finally { f = t = 0; }
> 35 |         if (op[0] & 5) throw op[1]; return { value: op[0] ? op[1] : void 0, done: true };
  36 |     }
  37 | };
  38 | Object.defineProperty(exports, "__esModule", { value: true });
DeepgramSpeechRecognizer._startRecognizing
src/portals/HackathonPortal/services/speechRecognition/provider/Deepgram/DeepgramSpeechRecognizerService/DeepgramSpeechRecognizer.js:90
  87 | this.emit(EVT_CONNECTING);
  88 | 
  89 | this._deepgram = new Deepgram(this._apiKey);
> 90 | this._deepgramSocket = this._deepgram.transcription.live({
     | ^
  91 |   puncturate: true,
  92 | });
  93 | this._mediaRecorder = new MediaRecorder(this._mediaStream, {
jzombie commented 2 years ago

I assume the issue is related to the WebSocket import in this file. Perhaps it could check whether the browser's native WebSocket API is available and use that instead. I might be able to make a PR later this week; I'm using this for a hackathon project.

https://github.com/deepgram/deepgram-node-sdk/blob/main/src/transcription/liveTranscription.ts
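
The detection idea above could look roughly like the following sketch. `resolveWebSocket` and its two parameters are hypothetical names used for illustration; they are not part of the SDK. The sketch assumes only that browsers expose a global `WebSocket` constructor and that Node.js can fall back to the `ws` package.

```javascript
// Hypothetical helper (not part of @deepgram/sdk): pick a WebSocket
// constructor that works in the current environment.
function resolveWebSocket(scope, loadFallback) {
  // Browsers expose a native WebSocket constructor on the global object.
  if (scope && typeof scope.WebSocket === "function") {
    return scope.WebSocket;
  }
  // Otherwise defer to a loader, e.g. () => require("ws") in Node.js.
  return loadFallback();
}
```

It might be called as `resolveWebSocket(globalThis, () => require("ws"))`, so that the `ws` constructor is only used when no native implementation is found.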

SandraRodgers commented 2 years ago

Are you attempting to use the Deepgram Node SDK on the frontend? That's not going to work...

You can open a socket to Deepgram without the SDK and its deepgramSocket by using the browser's native WebSocket API instead. Here's an example:

https://github.com/deepgram-devs/browser-mic-streaming/blob/main/index.html#L17

That GitHub repo accompanies the tutorial at https://developers.deepgram.com/blog/2021/11/live-transcription-mic-browser/

Is that helpful to you?
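
For readers who want the shape of that approach without opening the link, here is a minimal sketch. The endpoint URL and the `["token", apiKey]` subprotocol pair are assumptions based on the linked example and should be checked against it; `openDeepgramSocket` and `extractFinalTranscript` are hypothetical helper names, and the message shape is taken from the code earlier in this thread.

```javascript
// Sketch only: the endpoint and the ["token", apiKey] subprotocol pair are
// assumptions based on the linked browser-mic-streaming example.
const DEEPGRAM_LISTEN_URL = "wss://api.deepgram.com/v1/listen";

// Hypothetical helper: open a socket with the browser's native WebSocket.
// The constructor is a parameter so it can be swapped out in tests.
function openDeepgramSocket(apiKey, WebSocketImpl = WebSocket) {
  // Browser code cannot set custom headers on a WebSocket handshake,
  // so the key travels in the subprotocol list instead.
  return new WebSocketImpl(DEEPGRAM_LISTEN_URL, ["token", apiKey]);
}

// Hypothetical helper: return the transcript of a final result,
// or null for interim results and non-transcript messages.
function extractFinalTranscript(rawMessage) {
  const received = JSON.parse(rawMessage);
  const transcript = received.channel?.alternatives?.[0]?.transcript;
  return transcript && received.is_final ? transcript : null;
}
```

In a page, the socket's `onmessage` handler could pass `event.data` to `extractFinalTranscript` and log any non-null result.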

jzombie commented 2 years ago

Yep, absolutely. Thanks for the help here.


A quick suggestion: this package's README should probably be updated to drop the DOM example that uses navigator.mediaDevices.getUserMedia if the SDK is only intended for Node.js (getUserMedia and MediaStream aren't available in Node without polyfills).

https://github.com/deepgram/deepgram-node-sdk#transcribe-audio-in-real-time

michaeljolley commented 2 years ago

Really good point @jzombie. We do have a "browserified" version on the roadmap that would be compatible, but in the meantime we should update that example.

michaeljolley commented 2 years ago

Quick update @jzombie, the latest version of the SDK should allow you to work in the browser.

After you npm install @deepgram/sdk, you should be able to access the Deepgram object via @deepgram/sdk/browser

I haven't given an exhaustive code sample below (for example, it doesn't show how to capture the microphone), but it should provide a good starting point.

import { Deepgram } from '@deepgram/sdk/browser';

// `microphone` is assumed to be a MediaRecorder created elsewhere from the
// user's audio stream; capturing it is not shown here.
const deepgram = new Deepgram('DEEPGRAM_API_KEY');
const deepgramSocket = deepgram.transcription.live({ punctuate: true });

deepgramSocket.onopen = () => {
  // Only attach the listener and start recording if not already recording.
  if (microphone.state !== 'recording') {
    microphone.addEventListener('dataavailable', (event) => {
      // readyState 1 means the socket is OPEN.
      if (event.data.size > 0 && deepgramSocket.readyState === 1) {
        deepgramSocket.send(event.data);
      }
    });

    // Emit an audio chunk every 200 ms.
    microphone.start(200);
  }
};

deepgramSocket.onmessage = (message) => {
  const received = JSON.parse(message.data);
  const transcript = received.channel.alternatives[0].transcript;
  // Log only non-empty, finalized transcripts.
  if (transcript && received.is_final) {
    console.log(transcript);
  }
};

deepgramSocket.onclose = () => {
  console.log('Connection closed.');
};
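
The sample above leaves out how `microphone` is created. A minimal sketch of that step, using the same `getUserMedia` and `MediaRecorder` calls as the original report, might look like this. `createMicrophone` is a hypothetical helper name, and the dependencies are passed in as parameters purely so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: request the microphone and wrap it in a MediaRecorder.
// In a browser you would pass navigator.mediaDevices and the global
// MediaRecorder constructor.
async function createMicrophone(mediaDevices, MediaRecorderImpl) {
  // Prompts the user for microphone permission; rejects if it is denied.
  const stream = await mediaDevices.getUserMedia({ audio: true });
  // audio/webm matches the mimeType used in the original report.
  return new MediaRecorderImpl(stream, { mimeType: "audio/webm" });
}
```

In a page this could be called as `const microphone = await createMicrophone(navigator.mediaDevices, MediaRecorder);` before the socket's `onopen` handler runs.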
jzombie commented 2 years ago

@MichaelJolley

Thanks for sending this. Have a good weekend.