deepgram / deepgram-js-sdk

Official JavaScript SDK for Deepgram's automated speech recognition APIs.
https://developers.deepgram.com
MIT License

Importing @deepgram/sdk/browser is throwing: SyntaxError: Unexpected token 'export' #119

Closed lachlansleight closed 1 year ago

lachlansleight commented 1 year ago

What is the current behavior?

This is literally a duplicate of issue #89, as I'm having the same problem. It's a shame that neither of the error reporters ever responded. I'll try to be more useful!

Adding import { Deepgram } from "@deepgram/sdk/browser"; results in the error SyntaxError: Unexpected token 'export'. This is in a Next.js project.

Steps to reproduce

I've created an absolutely bare-minimum Next.js site to demonstrate that this issue occurs in full isolation. Full recreation steps are as follows (should take no more than two minutes to set up):

Inside index.tsx, put the following (adding your API key inside the useEffect callback, of course):

import { Deepgram } from "@deepgram/sdk/browser";
import { useEffect } from "react";

const IndexPage = (): JSX.Element => {

    useEffect(() => {
        const deepgram = new Deepgram(A_VALID_DEEPGRAM_API_KEY || "");
    }, []);

    return (
        <div>
            <h1>This is an extremely minimal next.js site</h1>
        </div>
    )
}

export default IndexPage;

Expected behavior

I'd expect to see nothing happen since I haven't begun any transcriptions, but I'd certainly expect the page to not crash!

Please tell us about your environment

aakhtar76900 commented 1 year ago

Facing same issue

jeremyadamsfisher commented 1 year ago

Same issue

lukeocodes commented 1 year ago

Hey folks, we'll be deprecating the browser part of the SDK. We'll be raising a new major version without this code in the coming weeks, to prevent breaking anything that may be working elsewhere.

While our API doesn't currently support another authentication method for transcription, we feel it isn't responsible to keep suggesting users elevate an API token to the client side.

Feel free to raise a PR with a work-around in the meantime, or fork this version of the SDK and maintain the browser SDK.

We do have a plan to release a stand-alone client SDK with a future release of the API.

lachlansleight commented 1 year ago

Do you have any suggestions regarding a workaround? Or any word on when a client SDK might become available?

lukeocodes commented 1 year ago

Do you have any suggestions regarding a workaround? Or any word on when a client SDK might become available?

If you're using Next.js and can use API Routes, you could use one to proxy the Node SDK and keep your key secure. That is how I would build it. Here I communicate with the API server from React for our JS starter application.
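As a rough illustration of the API Route idea, the sketch below keeps the key on the server and has the browser call the route instead. It rests on several assumptions and is not the SDK's own approach: it calls Deepgram's REST endpoint (https://api.deepgram.com/v1/listen) with fetch rather than going through the Node SDK, it only handles pre-recorded audio referenced by URL, and createTranscribeHandler, fetchImpl, and the DEEPGRAM_API_KEY environment variable are hypothetical names chosen for this example.

```javascript
// Hypothetical factory: returns a Next.js-style API route handler (req, res).
// The API key stays on the server; the browser only ever talks to this route.
function createTranscribeHandler({ apiKey, fetchImpl = globalThis.fetch }) {
  return async function handler(req, res) {
    if (req.method !== "POST") {
      return res.status(405).json({ error: "Use POST" });
    }

    // Assumed request shape for this sketch: { "url": "<publicly reachable audio file>" }
    const audioUrl = req.body && req.body.url;
    if (typeof audioUrl !== "string" || audioUrl.length === 0) {
      return res.status(400).json({ error: "Missing 'url' in request body" });
    }

    // Forward the request to Deepgram, attaching the key server-side.
    const upstream = await fetchImpl(
      "https://api.deepgram.com/v1/listen?punctuate=true",
      {
        method: "POST",
        headers: {
          Authorization: `Token ${apiKey}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ url: audioUrl }),
      }
    );

    if (!upstream.ok) {
      return res
        .status(502)
        .json({ error: `Deepgram responded with ${upstream.status}` });
    }

    // Return only the first transcript alternative to the client.
    const data = await upstream.json();
    const transcript =
      data?.results?.channels?.[0]?.alternatives?.[0]?.transcript ?? "";
    return res.status(200).json({ transcript });
  };
}
```

In a Next.js project this could live in pages/api/transcribe.js as `export default createTranscribeHandler({ apiKey: process.env.DEEPGRAM_API_KEY });`, and the page would POST `{ url }` to /api/transcribe. The injectable fetchImpl is only there so the handler can be exercised without network access.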

If you're planning to stream audio from something like a microphone, you'd need to open a websocket to send that audio to the API Route. I believe this can be achieved with an API Route and socket.io. I don't have an example yet, though.

Let me know how you get on!

lukeocodes commented 1 year ago

@lachlansleight we also have a blog post by my old colleague Kevin specifically about making serverless transcriptions: https://blog.deepgram.com/transcription-netlify-functions/

We have another guide for Vue using serverless. We have a plan to make some Next.js guides, too.

lachlansleight commented 1 year ago

@lukeocodes Yeah I tried working around by streaming audio directly to a separate node app via WebSockets (using socket.io), but I was only getting empty transcription results back.

I think it might be a format issue but after bashing my head against it all day I ended up throwing in the towel. I was setting the encoding to opus and the sample rate to what the web frontend was outputting, but still getting empty transcription results back. 🤔

I was just sending the raw bytes that come out of the ondataavailable MediaRecorder callback - is there perhaps something I should be doing to them before I send them via socket.io do you think?

lukeocodes commented 1 year ago

I was just sending the raw bytes that come out of the ondataavailable MediaRecorder callback - is there perhaps something I should be doing to them before I send them via socket.io do you think?

I've asked a colleague how they did it with websockets. I am sure we have a working example of exactly what you're trying to do. Probably expect a response tomorrow 👍🏻

lachlansleight commented 1 year ago

That would be great - if I get it working I'll post it here so that people who come across the error can use it as a workaround too :)

lukeocodes commented 1 year ago

That would be great - if I get it working I'll post it here so that people who come across the error can use it as a workaround too :)

thanks, appreciate it!

I just found this, which uses a websocket to send video back to the Node SDK. Might help? https://github.com/deepgram-devs/video-chat

lachlansleight commented 1 year ago

Alright I got it working - the issue was resolved by a comment in that repo you linked stating that the very first packet sent from the audio recorder must be sent to deepgram for transcription to work.

So I did that, as well as removed the code that explicitly set the codec / sample rate of Deepgram, and my node server is now successfully forwarding audio to Deepgram and getting transcriptions back!

Specifically, what must be done on the server-side is something like this:

// ==== This is within the node.js app ====
// This is all happening after we've established a websocket connection to the client ("clientSocket")

// Initialize websocket connection to DeepGram
const deepgramSocket = deepgram.transcription.live({ 
    punctuate: true,
});

deepgramSocket.addListener("open", () => {
    // only once deepgram is ready to receive audio do we want the client to start recording
    clientSocket.emit("ready");

    // and we also set up our audio listener to forward data
    clientSocket.on("audio", data => {
        deepgramSocket.send(data);
    });
});

// Forward transcription results back to the front-end client app
deepgramSocket.addListener("transcriptReceived", json => {
    const transcription = JSON.parse(json);
    // emit on the same client socket we receive audio from
    clientSocket.emit("transcription", transcription);
});

// Close the connection to Deepgram when the client disconnects
clientSocket.on("disconnect", () => deepgramSocket.finish());

// ==== This is within the client app ====

// Only once the server sends that ready message do we begin recording
// This is so that the initial packets (which I suppose contain codec info and such)
// are properly sent to the server
// Note that 'recorder' is an instance of MediaRecorder
this.socket.on("ready", () => {
    console.log("Starting audio recording");
    this.recorder?.start(250);
})

// Pipe raw audio binary data to the node.js server,
// which then forwards that data to deepgram
this.recorder?.addEventListener("dataavailable", e => {
    this.socket?.emit("audio", e.data);
});

// Receive transcription results!
this.socket.on("transcription", data => console.log("Transcription Result:", data));

Will definitely be looking out for a proper client SDK (since this requires me to build, maintain and pay for a node.js app), but this is fine as a workaround for now :)
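The code above avoids losing the first packet by telling the client to start recording only after Deepgram's socket is open. A different way to get the same guarantee, sketched here as an alternative rather than as what the code above does, is to queue chunks on the server until the upstream socket opens and then flush them in arrival order. createAudioForwarder and its methods are hypothetical names for this example; deepgramSocket and clientSocket in the comments refer to the objects from the server code above.

```javascript
// Hypothetical helper: holds audio chunks until the upstream socket is open,
// then flushes them in arrival order so the first chunk from the recorder
// is never dropped.
function createAudioForwarder(send) {
  const pending = [];
  let open = false;

  return {
    // Called for every chunk received from the client.
    push(chunk) {
      if (open) {
        send(chunk);
      } else {
        pending.push(chunk);
      }
    },

    // Called once when the upstream socket reports that it is open.
    markOpen() {
      // Flush before flipping the flag so ordering is preserved.
      while (pending.length > 0) {
        send(pending.shift());
      }
      open = true;
    },

    // Number of chunks still waiting to be sent.
    get buffered() {
      return pending.length;
    },
  };
}

// Possible wiring with the sockets from the server code above:
//   const forwarder = createAudioForwarder(chunk => deepgramSocket.send(chunk));
//   clientSocket.on("audio", data => forwarder.push(data));
//   deepgramSocket.addListener("open", () => forwarder.markOpen());
```

The trade-off is memory: chunks pile up for as long as the upstream connection takes to open, so a real server would probably want a cap on the queue length.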

lukeocodes commented 1 year ago

Neat. We plan to release a client SDK, but there are no dates on that yet. I will give this a go, and possibly pop it into a guide or blog post for the site. If I do, I'll be sure to drop you some kudos. Thanks again!