deepgram / deepgram-js-sdk

Official JavaScript SDK for Deepgram's automated speech recognition APIs.
https://developers.deepgram.com
MIT License

[Automation] AbortSignal and Timeout Property #280

Open dennisofficial opened 1 month ago

dennisofficial commented 1 month ago

Proposed changes

Provide a detailed description of the change or addition you are proposing

The methods invoked for asynchronous transcription (and similar operations) should accept a signal property.
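
For illustration, the call I have in mind would look roughly like this (the signal option is hypothetical and does not exist in the SDK today; the listen.prerecorded namespace is the v3 client):

import { createClient } from "@deepgram/sdk";
import { readFileSync } from "fs";

const deepgram = createClient(process.env.DEEPGRAM_API_KEY as string);
const controller = new AbortController();

const { result, error } = await deepgram.listen.prerecorded.transcribeFile(
  readFileSync("meeting.m4a"),
  {
    model: "nova-2",
    smart_format: true,
    // Hypothetical option requested by this issue; not in the SDK today.
    signal: controller.signal,
  }
);

// Elsewhere, when the concurrent request fails:
controller.abort();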

Context

Some models, specifically Whisper, can take a long time to execute, especially on our long files. We also have a concurrent request running at the same time; if that request fails, this transcription is no longer needed and we need a way to abort it.

Why is this change important to you? How would you use it? How can it benefit other users?

Possible Implementation

Add a signal property to the schema options and handle it in the RESTful client.

Not obligatory, but suggest an idea for implementing addition or change

Other information

Anything else we should know? (e.g. detailed explanation, related issues, links for us to have context, eg. stack overflow, codepen, etc)

lukeocodes commented 1 month ago

You can use _experimentalCustomFetch on the latest version to provide a custom fetch transport, including your abort.

We have a PR up to allow for more granular configuration of fetch.
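
For example, a rough sketch of that approach (untested; it assumes a single shared AbortController for the whole client):

import { createClient } from "@deepgram/sdk";

const controller = new AbortController();

const deepgram = createClient(process.env.DEEPGRAM_API_KEY as string, {
  // Wrap the default fetch and merge the abort signal into every request.
  _experimentalCustomFetch: (input: URL, init: RequestInit) =>
    fetch(input, { ...init, signal: controller.signal }),
});

// When the transcription is no longer needed:
// controller.abort();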

dennisofficial commented 1 month ago

> You can use _experimentalCustomFetch on the latest version to provide a custom fetch transport, including your abort.
>
> We have a PR up to allow for more granular configuration of fetch.

Yes, I tried that. However, the parameters I get are the objects used to set up the fetch request. It seems like the properties passed when invoking transcribeFile are being cleaned out, so even though I pass signal and timeout, I don't see them when they are handed to the experimental fetch.

dennisofficial commented 1 month ago

Oh no, it was my fault: I added the signal variable on the wrong line. Either way, it's difficult to use, since the parameters are serialized into URLSearchParams, which turns the signal into a string:

Here is the code:

createClient(envService.get('DEEPGRAM_API_KEY'), {
  _experimentalCustomFetch: (input: URL, init: RequestInit) => {
    console.log(input, init);
    return fetch(input, init);
  },
}),

Here are the logs:

URL {
  href: 'https://api.deepgram.com/v1/listen?smart_format=true&signal=%5Bobject+AbortSignal%5D&model=nova-2',     
  origin: 'https://api.deepgram.com',
  protocol: 'https:',
  username: '',
  password: '',
  host: 'api.deepgram.com',
  hostname: 'api.deepgram.com',
  port: '',
  pathname: '/v1/listen',
  search: '?smart_format=true&signal=%5Bobject+AbortSignal%5D&model=nova-2',
  searchParams: URLSearchParams {
    'smart_format' => 'true',
    'signal' => '[object AbortSignal]',
    'model' => 'nova-2' },
  hash: ''
},
{
  headers: HeadersList {
    cookies: null,
    [Symbol(headers map)]: Map(4) {
      'content-type' => [Object],
      'x-client-info' => [Object],
      'user-agent' => [Object],
      'authorization' => [Object]
    },
    [Symbol(headers map sorted)]: null
  },
  method: 'POST',
  body: <Buffer 00 00 00 1c 66 74 79 70 4d 34 41 20 00 00 00 00 4d 34 41 20 6d 70 34 32 69 73 6f 6d 00 00 54 a1 6d 6f 6f 76 00 00 00 6c 6d 76 68 64 00 00 00 00 e2 07 ... 1751945 more bytes>,
  duplex: 'half'
}
dennisofficial commented 1 month ago

Which makes sense if you look at the transcribeFile method:

  async transcribeFile(
    source: FileSource,
    options?: PrerecordedSchema,
    endpoint = "v1/listen"
  ): Promise<DeepgramResponse<SyncPrerecordedResponse>> {
    try {
      let body;

      if (isFileSource(source)) {
        body = source;
      } else {
        throw new DeepgramError("Unknown transcription source type");
      }

      if (options !== undefined && "callback" in options) {
        throw new DeepgramError(
          "Callback cannot be provided as an option to a synchronous transcription. Use `transcribeUrlCallback` or `transcribeFileCallback` instead."
        );
      }

      const transcriptionOptions: PrerecordedSchema = { ...{}, ...options };

      const url = new URL(endpoint, this.baseUrl);
      appendSearchParams(url.searchParams, transcriptionOptions);

      const result: SyncPrerecordedResponse = await this.post(this.fetch as Fetch, url, body, {
        "Content-Type": "deepgram/audio+video",
      });

      return { result, error: null };
    } catch (error) {
      if (isDeepgramError(error)) {
        return { result: null, error };
      }

      throw error;
    }
  }
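
For what it's worth, here is a standalone sketch of the behaviour I would expect, where the signal is pulled out of the options before they are appended to the query string and is passed to fetch's RequestInit instead (transcribeWithAbort and its option type are illustrative only, not the SDK's API):

interface TranscribeOptions {
  model?: string;
  smart_format?: boolean;
  // The property this issue is asking for.
  signal?: AbortSignal;
}

async function transcribeWithAbort(
  body: BodyInit,
  options: TranscribeOptions,
  apiKey: string,
  endpoint = "https://api.deepgram.com/v1/listen"
): Promise<Response> {
  // Keep the signal out of the query string; everything else is serialized.
  const { signal, ...queryOptions } = options;

  const url = new URL(endpoint);
  for (const [key, value] of Object.entries(queryOptions)) {
    if (value !== undefined) url.searchParams.append(key, String(value));
  }

  // The signal travels in RequestInit, where it can actually cancel the
  // in-flight request, instead of being stringified into the URL.
  return fetch(url, {
    method: "POST",
    body,
    signal,
    headers: {
      Authorization: `Token ${apiKey}`,
      "Content-Type": "deepgram/audio+video",
    },
  });
}
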
lukeocodes commented 1 month ago

Does this mean you're unblocked? I have a PR up to formalise the method of using custom fetch and websocket clients. It will be forwards compatible with _experimentalCustomFetch.

Check out the PR for lo/namespace-configs. If you have any suggestions, please add them to that PR.