DeepLcom / deepl-node

Official Node.js library for the DeepL language translation API.

DeepL Limitations - Timeout Errors during concurrent calls #33

Open razvan-zavalichi opened 1 year ago

razvan-zavalichi commented 1 year ago

Hello,

I am using the DeepL API Pro plan and have run into problems with concurrent requests through the deepl-node client.

Here's the scenario: I have a service hosted on Azure, built on Azure Functions. Within this setup, 15 concurrent functions attempt to translate a total of 72 KiB of text; each one passes an array of strings to translateText (the translateText<string[]> overload). When these 15 functions are triggered, I encounter timeout errors for all of the requests. Note that all of these functions run on the same virtual machine, which means a single DeepL client handles all 15 concurrent requests.
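To illustrate, a minimal sketch of this pattern (the handler signature and message shape are simplified, not our actual code):

import * as deepl from 'deepl-node';

// Created once at module scope, so every concurrent invocation
// in the same worker process reuses the same client.
const translator = new deepl.Translator(process.env.DEEPL_AUTH_KEY ?? '');

// Simplified queue-triggered handler.
export default async function run(context: { log: (msg: string) => void }, message: { texts: string[] }): Promise<void> {
    const results = await translator.translateText(message.texts, null, 'en-US');
    context.log(results.map((r) => r.text).join('\n'));
}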

My questions are as follows:

Your insights and guidance on optimizing these concurrent translation calls would be greatly appreciated. The service aims to translate roughly 1,000,000,000 characters.

Please be aware that the translation process works correctly when a single request is made. That single request comprised the following:

  • An array input containing 600 texts (noteworthy because the documentation indicates a limit of 50 texts per request).
  • A maximum of 72 KiB of text, equating to a request payload of approximately 76 KiB. (The documentation specifies a payload allowance of up to 128 KiB, but I have observed that a payload above 85 KiB results in a '413 - Payload Too Large' error.)
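To respect both limits at once, the input can be batched before calling translateText. A sketch of such a helper (chunkTexts is not part of the library; the 64 KiB budget is an assumption chosen to stay safely under the ceiling observed above):

function chunkTexts(texts: string[], maxTexts = 50, maxBytes = 64 * 1024): string[][] {
    const batches: string[][] = [];
    let current: string[] = [];
    let currentBytes = 0;
    for (const text of texts) {
        const size = Buffer.byteLength(text, 'utf8');
        // Close the current batch if adding this text would break either limit.
        if (current.length >= maxTexts || (current.length > 0 && currentBytes + size > maxBytes)) {
            batches.push(current);
            current = [];
            currentBytes = 0;
        }
        current.push(text);
        currentBytes += size;
    }
    if (current.length > 0) batches.push(current);
    return batches;
}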

Could you provide more information regarding the limits documented?

razvan-zavalichi commented 1 year ago

These are the timestamps when the translateText function was called: [screenshot of call timestamps attached]

JanEbbing commented 1 year ago

Hi @razvan-zavalichi , thanks for your detailed report!

For your question around how many requests you can send us: we generally advise limiting your usage to 50 requests per second; above that you should see HTTP 429 errors. The library will then use exponential backoff to retry those requests (this behaviour can be configured via the maxRetries and minTimeout values in TranslatorOptions when you construct the Translator object). One possibility I see: maybe you send too many requests, exponential backoff kicks in, and the low per-attempt timeout of the backoff (default is 10 s) causes the timeouts you see?
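A minimal sketch of such a configuration (the values are illustrative, not recommendations):

import * as deepl from 'deepl-node';

const translator = new deepl.Translator(process.env.DEEPL_AUTH_KEY ?? '', {
    maxRetries: 5,      // number of times a failed request is retried per call
    minTimeout: 30000,  // connection timeout in milliseconds for each attempt (default 10000)
});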

I tested the API last week and was able to sustain 20 QPS without seeing any errors, so to better be able to reproduce your issue I have a few questions (quoted in full in your reply below). Regarding the concurrency question: do you do something like the following?

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// translator constructed as above
const textsToTranslate: string[][] = [['First sentence.', 'Second sentence.', 'Third sentence.'], ['Erster Satz.', 'Zweiter Satz.'], ...]; // 15 entries
const startTime = Date.now();
let translations = await Promise.all(textsToTranslate.map((texts) => translator.translateText(texts, null, 'en-US')));
// Do something with translations
const timeDiff = Date.now() - startTime;
if (timeDiff < 1000) {
  await sleep(1000 - timeDiff); // limit to 15 QPS: wait out the remainder of the second
}
translations = await Promise.all(textsToTranslate.map((texts) => translator.translateText(texts, null, 'en-US'))); // probably with different texts :)

For your questions:

razvan-zavalichi commented 1 year ago

Hello @JanEbbing!

Here are my responses:

Q1: Just to confirm, the timestamps you sent are for a single translateText call, right? Not 15 concurrent ones?

Q2: How exactly do you perform the concurrency? Something like the following?

Q3: By "The current timeout configuration is set to 30000ms.", do you mean the minTimeout value in the TranslatorOptions you use to construct your Translator object?

Q4: As it may have performance implications, roughly what language pair distribution do you have in your requests? Is it all the same language pair, or automatic source language detection into the same target language (with the source language being anything provided by users)?

Q5: You mentioned putting everything into a single request solves the issue. So basically when you split this request into 15 smaller ones to adhere to the texts limit of 50, you observe the timeouts?

In my situation, I encountered two significant issues:

More context: If there are 'n' messages in the service bus queues, a maximum of 15 concurrent calls will be made to the 'translateText' function until the queue becomes empty. For instance, if you have 100 messages in the queue:

This behavior means that if the DeepL API consistently delivers translations in exactly 1 second for all 15 concurrent calls, the maximum achievable rate is 15 requests per second. However, if the API processes translations quickly, for instance completing all 15 concurrent translations within just 100 ms, the rate can reach 10 × 15 = 150 requests per second.

JanEbbing commented 1 year ago

Hi, with deepl-node version 1.10.2 and running the following code:

a) 413 Payload Too Large

import * as deepl from 'deepl-node';

const authKey = process.env['DEEPL_AUTH_KEY'];
const serverUrl = process.env['DEEPL_SERVER_URL'];
const translator = new deepl.Translator(authKey, { serverUrl: serverUrl });

const toBeTranslated = "x".repeat(131072);
console.log(Buffer.byteLength(toBeTranslated));
const results = await translator.translateText([toBeTranslated], null, 'fr');
console.log(results[0].text); // translateText returns an array of TextResults for array input

With different string lengths in the "x".repeat(n) I get the following outcomes:

So I can't reproduce that 85 KiB is too large of a request.

b) Timeouts

To simplify, my concurrency logic is a bit different (start 50 requests, wait until they are all complete, then start 50 new ones), but it should still trigger the timeouts.

import * as deepl from 'deepl-node';

const authKey = process.env['DEEPL_AUTH_KEY'];
const serverUrl = process.env['DEEPL_SERVER_URL'];
const translator = new deepl.Translator(authKey, { serverUrl: serverUrl, minTimeout: 30000 });
const germanPhrase = "Das ist mein Beispielsatz. Ich untersuche, ob ich mit der API Fehler reproduzieren kann, aber vielleicht funktioniert alles.";
const REQUESTS_PER_ITERATION = 50;
let numRequests = 0;
const results = [];
const startTime = Date.now();
while (numRequests < 1000) {
      const requestsToMake = [...Array(REQUESTS_PER_ITERATION).keys()].map(() => translator.translateText(Array(50).fill(germanPhrase), 'de', 'it'));
      results.push(await Promise.all(requestsToMake));
      numRequests += REQUESTS_PER_ITERATION;
      console.log("Finished making " + REQUESTS_PER_ITERATION + " requests");
}
console.log("Made " + numRequests + " requests in " + (Date.now() - startTime) + " ms");

For me, it finished in ~63 s, which comes out to about 15 QPS, without any errors. As you rightly indicated, your method may overload our API if a significant portion of requests finishes fast (on the order of 100-300 ms). I'd recommend integrating some rate limiting into your code to ensure you don't go above 50 QPS.
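For reference, a hand-rolled sketch of such rate limiting (this helper is not part of deepl-node; each task is a thunk such as () => translator.translateText(texts, 'de', 'it'), so nothing starts before its window):

async function rateLimited<T>(tasks: (() => Promise<T>)[], perSecond: number): Promise<T[]> {
    const results: T[] = [];
    for (let i = 0; i < tasks.length; i += perSecond) {
        const windowStart = Date.now();
        // Start at most perSecond tasks, then wait for all of them.
        const batch = tasks.slice(i, i + perSecond).map((task) => task());
        results.push(...(await Promise.all(batch)));
        // If the window finished early, wait out the rest of the second.
        const elapsed = Date.now() - windowStart;
        if (elapsed < 1000 && i + perSecond < tasks.length) {
            await new Promise((resolve) => setTimeout(resolve, 1000 - elapsed));
        }
    }
    return results;
}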

Please let me know if anything in my reproduction attempt is off.

razvan-zavalichi commented 1 year ago

Is it possible that this error is caused by the text format? This is the format of the translation text:

"<table cellspacing=0 cellpadding=0 class..."

I use "deepl-node": "^1.7.2"

JanEbbing commented 1 year ago

Thanks, that helps. For the timeouts, have you tried simply increasing the timeout limit? Even with a simple table with a few rows and columns, a large request like this takes >10 s for me, so if you have a complex table or get unlucky with API response time, you can easily exceed the 30 s window.

I can also reproduce a lower size limit with tag handling; I'm following up internally with another team.
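Since the payloads here are HTML tables, tag handling is the relevant mode; a minimal sketch of such a call (the table string is a stand-in):

import * as deepl from 'deepl-node';

const translator = new deepl.Translator(process.env.DEEPL_AUTH_KEY ?? '', { minTimeout: 30000 });
const html = '<table><tr><td>Guten Tag</td></tr></table>';
// tagHandling: 'html' tells the API to treat the input as HTML markup.
const [result] = await translator.translateText([html], null, 'fr', { tagHandling: 'html' });
console.log(result.text);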

razvan-zavalichi commented 1 year ago

Everything now works on my side when following these rules:

  • minTimeout is set to 30000 (this helps when there are 15 concurrent calls).
  • The translateText function is given at most 50 texts per request, per the documentation (even though it occasionally worked with hundreds of texts).
  • The payload is kept below 76 KiB, although it seems that texts greater than 100 KiB work on your side.

Anyway, we don't have any text larger than 70 KiB at the moment, so this is not a blocker for us. The earlier text was taken from a development environment.

razvan-zavalichi commented 1 year ago

Hello @JanEbbing, I've started translating 100k texts, sending no more than 50 texts in one request and no more than 10 requests in parallel (at most 26 QPS).

I occasionally encountered the following error:

Service unavailable, message: Temporary Error
Stack: Error: Service unavailable, message: Temporary Error
    at checkStatusCode (/home/site/wwwroot/node_modules/deepl-node/dist/index.js:261:23)
    at Translator.translateText (/home/site/wwwroot/node_modules/deepl-node/dist/index.js:356:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) 

This error caused 56 of the 2000 requests to fail. I can increase the number of retries, but if a request times out and I retry it, the DeepL API charges for these retries, making the process more expensive.
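One way to keep retries cheap is to re-send only the batches that actually failed, so successful ones are never repeated; a sketch using Promise.allSettled (translateAll and the target language are illustrative, not part of the library):

import * as deepl from 'deepl-node';

async function translateAll(translator: deepl.Translator, batches: string[][], maxPasses = 3): Promise<deepl.TextResult[][]> {
    const results: deepl.TextResult[][] = new Array(batches.length);
    let pending = batches.map((texts, index) => ({ texts, index }));
    for (let pass = 0; pass < maxPasses && pending.length > 0; pass++) {
        const settled = await Promise.allSettled(
            pending.map(({ texts }) => translator.translateText(texts, null, 'en-US')),
        );
        const stillFailing: typeof pending = [];
        settled.forEach((outcome, i) => {
            if (outcome.status === 'fulfilled') {
                results[pending[i].index] = outcome.value;
            } else {
                stillFailing.push(pending[i]); // only failed batches are re-sent
            }
        });
        pending = stillFailing;
    }
    if (pending.length > 0) {
        throw new Error(pending.length + ' batches still failing after ' + maxPasses + ' passes');
    }
    return results;
}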

Are there other constraints regarding this?