deepgram / deepgram-dotnet-sdk

.NET SDK for Deepgram's automated speech recognition APIs.
https://developers.deepgram.com
MIT License

SDK Re-architecture Major Release to V4 #130

Closed jpvajda closed 10 months ago

jpvajda commented 1 year ago

We are improving our SDK architecture to make it easier to use and maintain, and to support multiple endpoints as the Deepgram product expands. Each of Deepgram's official SDKs will follow this pattern.

This issue captures the requirements so that a member of our community can assist us with this implementation.

Deepgram DX Team

@SandraRodgers @jpvajda @lukeocodes

Discord channel

Discussions about this can happen in the Community-Maintainer channel

Reference Implementation

See the approach taken in the Node SDK for guidance: https://github.com/deepgram/deepgram-node-sdk/pull/154

Specification & Approach

Instantiation

Use whichever import or namespacing is required to get a createClient function from your Deepgram SDK (yes, a function, not a class). It returns a DeepgramClient type/class.

It can be passed an api_key as a string.

Without an API key parameter, it should check the current environment for DEEPGRAM_API_KEY as an environment variable.

This is what the SDK's top-level function, which instantiates the SDK and connects to the API, might look like.

// createClient

function createClient(apiKey?: string, options: object = {}) {
  // apiKey is optional: when omitted, the client falls back to DEEPGRAM_API_KEY
  return new DeepgramClient(apiKey, options);
}

This function is used like so:

var deepgram = createClient("lo3hu34tbi34tbj34ob234t2t123rt2")
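
The key lookup described above (explicit argument first, then the DEEPGRAM_API_KEY environment variable) could be sketched roughly as follows. The resolveApiKey helper and its Env type are hypothetical names used only for illustration, and throwing when no key is found is an assumption rather than a stated requirement. The environment is passed in as a plain object so the sketch does not depend on any particular runtime.

```typescript
// Hypothetical helper: not part of any published Deepgram SDK.
type Env = Record<string, string | undefined>;

function resolveApiKey(apiKey: string | undefined, env: Env): string {
  // An explicitly passed key always wins.
  if (apiKey && apiKey.trim() !== "") {
    return apiKey;
  }
  // Otherwise fall back to the DEEPGRAM_API_KEY environment variable.
  const fromEnv = env["DEEPGRAM_API_KEY"];
  if (fromEnv && fromEnv.trim() !== "") {
    return fromEnv;
  }
  // Assumption: failing fast is preferable to sending unauthenticated requests.
  throw new Error("A Deepgram API key is required: pass one or set DEEPGRAM_API_KEY.");
}
```

In a Node environment, createClient could call something like resolveApiKey(apiKey, process.env) before constructing the DeepgramClient.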

Deepgram Client

The DeepgramClient is responsible for merging the default options with any passed-in options, and for exposing accessors that each return a client for the API we're going to make requests to.

// DeepgramClient

class DeepgramClient {
  constructor(apiKey, options) {
    // merge the passed-in options with the defaults
    // create an API URL, configure a websocket (don't connect), etc.
    // store the results (this.url, this.httpClient, this.websocketClient,
    // this.headers, this.options) for the getters below
  }

  get listen() {
    // pass into ListenClient everything required for the various listen requests,
    // such as an HTTP client or a websocket
    // also pass the options.listen object from the user's options, so they can affect this client

    return new ListenClient(this.url, this.httpClient, this.websocketClient, this.headers, this.options.listen)
  }

  // get read() {}

  // get speak() {}

  // get interact() {}

  get projects() {
    // pass into ProjectsClient everything required for the various projects requests
    // also pass the options.projects object from the user's options, so they can affect this client

    return new ProjectsClient(this.url, this.httpClient, this.headers, this.options.projects)
  }
}
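
As a rough illustration of the "defaults plus passed-in options" handling, the merge could look something like the sketch below. The option shape (global, listen, projects), the field names, and the mergeOptions helper are all hypothetical; the default base URL is likewise only an assumed value for the example.

```typescript
// Hypothetical option shapes; the field names are illustrative only.
interface ClientOptions {
  global: { url: string; headers: Record<string, string> };
  listen: Record<string, unknown>;
  projects: Record<string, unknown>;
}

interface UserOptions {
  global?: { url?: string; headers?: Record<string, string> };
  listen?: Record<string, unknown>;
  projects?: Record<string, unknown>;
}

// Assumed defaults for the sketch.
const DEFAULT_OPTIONS: ClientOptions = {
  global: { url: "https://api.deepgram.com", headers: {} },
  listen: {},
  projects: {},
};

function mergeOptions(user: UserOptions = {}): ClientOptions {
  // User-supplied values override defaults, one namespace at a time,
  // so a partial override never wipes out unrelated defaults.
  return {
    global: {
      url: user.global?.url ?? DEFAULT_OPTIONS.global.url,
      headers: { ...DEFAULT_OPTIONS.global.headers, ...user.global?.headers },
    },
    listen: { ...DEFAULT_OPTIONS.listen, ...user.listen },
    projects: { ...DEFAULT_OPTIONS.projects, ...user.projects },
  };
}
```

Keeping per-client namespaces such as listen and projects separate is what lets each getter hand only its own slice of the options to the client it creates.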

Listen Client

The ListenClient separates the PrerecordedClient and LiveClient. This is because the LiveClient is purely an EventEmitter and container for the websocket connection. The PrerecordedClient, like our other clients, will use an AbstractRestfulClient to be able to make POST/GET/PUT/DELETE requests. 

class ListenClient {
  constructor(url, httpClient, websocketClient, headers, globalOptions) {
    // store these on the object for use in calls
    this.url = url
    this.httpClient = httpClient
    this.websocketClient = websocketClient
    this.headers = headers
    // globalOptions is already the options.listen object passed in by DeepgramClient
    this.globalOptions = globalOptions
  }

  get prerecorded() {
    // pass into PrerecordedClient everything required for requests to listen,
    // such as a fetch class, plus the user's listen options

    return new PrerecordedClient(this.url, this.httpClient, this.headers, this.globalOptions)
  }

  live(transcribeOptions) {
    // pass into LiveClient everything required for a websocket connection,
    // such as a websocket, plus the user's listen options, so they can affect this client

    return new LiveClient(this.url, transcribeOptions, this.websocketClient, this.headers, this.globalOptions)
  }
}

The listen client is used like so:

var prerecordedClient = deepgram.listen.prerecorded
var liveConnection = deepgram.listen.live(options)
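
The AbstractRestfulClient mentioned above is not specified further in this issue. One possible shape, sketched under the assumption that the HTTP layer is injected as a simple function, is shown below. HttpRequest, HttpClient, the transcribeUrl method name, and the "/v1/listen" path are illustrative assumptions here, not a confirmed design.

```typescript
// Hypothetical shapes; a real SDK would likely wrap fetch or a similar HTTP library.
type HttpMethod = "GET" | "POST" | "PUT" | "DELETE";

interface HttpRequest {
  method: HttpMethod;
  url: string;
  headers: Record<string, string>;
  body: string | undefined;
}

type HttpClient = (request: HttpRequest) => Promise<unknown>;

abstract class AbstractRestfulClient {
  protected readonly baseUrl: string;
  protected readonly httpClient: HttpClient;
  protected readonly headers: Record<string, string>;

  constructor(baseUrl: string, httpClient: HttpClient, headers: Record<string, string>) {
    this.baseUrl = baseUrl;
    this.httpClient = httpClient;
    this.headers = headers;
  }

  protected get(path: string): Promise<unknown> {
    return this.request("GET", path);
  }

  protected post(path: string, body: unknown): Promise<unknown> {
    return this.request("POST", path, body);
  }

  protected put(path: string, body: unknown): Promise<unknown> {
    return this.request("PUT", path, body);
  }

  protected delete(path: string): Promise<unknown> {
    return this.request("DELETE", path);
  }

  // Every verb funnels through one place, so headers and URL building stay consistent.
  private request(method: HttpMethod, path: string, body?: unknown): Promise<unknown> {
    return this.httpClient({
      method,
      url: `${this.baseUrl}${path}`,
      headers: this.headers,
      body: body === undefined ? undefined : JSON.stringify(body),
    });
  }
}

// Hypothetical subclass; the method name and path are assumptions for the sketch.
class PrerecordedClient extends AbstractRestfulClient {
  transcribeUrl(source: { url: string }): Promise<unknown> {
    return this.post("/v1/listen", source);
  }
}
```

Because the HTTP function is injected, a test can pass a fake that records requests instead of touching the network.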

Live Client

This is a websocket client. It should be capable of emitting all our events and should retain the existing functionality. We should be able to pass in a mock websocket client, so that it can connect to a local websocket server in testing (very important).

It’s possible the current implementation of our LiveClient can be reused, but it must support testing.

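class LiveClient {
  constructor(url, transcribeOptions, websocketClient, headers, globalOptions) {
    // store these on the object; the injected websocketClient can be swapped for a mock in tests
  }
}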

Usage of the live client:

var live = deepgram.listen.live(liveTranscriptionOptions)

live.addListener("data", (transcription) => {
  // something came back from Deepgram
});

// send audio to transcribe
live.send(fileData);

// send JSON
live.send(JSON.stringify({ type: "KeepAlive" }))

// check websocket state
live.state()
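
To make the testing requirement concrete, here is a minimal sketch of a live client whose socket is injected, together with a mock socket for tests. The SocketLike interface and MockSocket class are hypothetical and only loosely modelled on the browser WebSocket API. This LiveClient is a simplified stand-in for illustration and does not reflect the existing implementation.

```typescript
// Minimal socket surface the client depends on (hypothetical).
interface SocketLike {
  readyState: number;
  send(data: string | ArrayBuffer): void;
  onmessage: ((event: { data: string }) => void) | null;
}

type Listener = (payload: unknown) => void;

class LiveClient {
  private readonly socket: SocketLike;
  private readonly listeners = new Map<string, Listener[]>();

  constructor(socket: SocketLike) {
    this.socket = socket;
    // Forward every socket message to "data" listeners as parsed JSON.
    this.socket.onmessage = (event) => this.emit("data", JSON.parse(event.data));
  }

  addListener(event: string, listener: Listener): void {
    const existing = this.listeners.get(event) ?? [];
    this.listeners.set(event, [...existing, listener]);
  }

  send(data: string | ArrayBuffer): void {
    this.socket.send(data);
  }

  state(): number {
    return this.socket.readyState;
  }

  private emit(event: string, payload: unknown): void {
    for (const listener of this.listeners.get(event) ?? []) {
      listener(payload);
    }
  }
}

// Test double: records what was sent and lets a test push messages in.
class MockSocket implements SocketLike {
  readyState = 1; // 1 corresponds to OPEN in the WebSocket API
  sent: Array<string | ArrayBuffer> = [];
  onmessage: ((event: { data: string }) => void) | null = null;

  send(data: string | ArrayBuffer): void {
    this.sent.push(data);
  }

  receive(json: unknown): void {
    if (this.onmessage) {
      this.onmessage({ data: JSON.stringify(json) });
    }
  }
}
```

A test can then construct new LiveClient(new MockSocket()), call receive on the mock, and assert on what the "data" listeners saw, without opening a real connection.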

jpvajda commented 10 months ago

See https://github.com/deepgram/deepgram-sdk-architecture/ for the design approach.