DeepLcom / deepl-node

Official Node.js library for the DeepL language translation API.

DeepL Limitations - Timeout Errors during concurrent calls #33

Open razvan-zavalichi opened 1 year ago

razvan-zavalichi commented 1 year ago

Hello,

I am using the DeepL API Pro plan and have run into problems with concurrent requests through the deepl-node client.

Here's the scenario: I have a service hosted on Azure, built on Azure Functions. Within this setup, 15 concurrent functions attempt to translate a total of 72 KiB of text; each one passes an array of strings to translateText (the translateText<string[]> overload). When these 15 functions are triggered, I encounter timeout errors for all of the requests. Note that all of these functions run on the same virtual machine, which means a single DeepL client handles all 15 concurrent requests.
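To illustrate, a minimal sketch of this pattern (the handler signature and message shape are simplified, not our actual code):

import * as deepl from 'deepl-node';

// Created once at module scope, so every concurrent invocation
// in the same worker process reuses the same client.
const translator = new deepl.Translator(process.env.DEEPL_AUTH_KEY ?? '');

// Simplified queue-triggered handler.
export default async function run(context: { log: (msg: string) => void }, message: { texts: string[] }): Promise<void> {
    const results = await translator.translateText(message.texts, null, 'en-US');
    context.log(results.map((r) => r.text).join('\n'));
}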

My questions are as follows:

Your insights and guidance on optimizing these concurrent translation calls would be greatly appreciated. The service aims to translate roughly 1,000,000,000 characters.

Please be aware that the translation process works correctly when a single request is made. That single request comprised the following:

  • An array input containing 600 texts (noteworthy because the documentation indicates a limit of 50 texts per request).
  • A maximum of 72 KiB of text, equating to a request payload of approximately 76 KiB. (The documentation specifies a payload allowance of up to 128 KiB, but I have observed that a payload above 85 KiB results in a '413 - Payload Too Large' error.)
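To respect both limits at once, the input can be batched before calling translateText. A sketch of such a helper (chunkTexts is not part of the library; the 64 KiB budget is an assumption chosen to stay safely under the ceiling observed above):

function chunkTexts(texts: string[], maxTexts = 50, maxBytes = 64 * 1024): string[][] {
    const batches: string[][] = [];
    let current: string[] = [];
    let currentBytes = 0;
    for (const text of texts) {
        const size = Buffer.byteLength(text, 'utf8');
        // Close the current batch if adding this text would break either limit.
        if (current.length >= maxTexts || (current.length > 0 && currentBytes + size > maxBytes)) {
            batches.push(current);
            current = [];
            currentBytes = 0;
        }
        current.push(text);
        currentBytes += size;
    }
    if (current.length > 0) batches.push(current);
    return batches;
}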

Could you provide more information regarding the limits documented?

razvan-zavalichi commented 1 year ago

These are the timestamps when the translateText function was called: [screenshot of call timestamps attached]

JanEbbing commented 1 year ago

Hi @razvan-zavalichi , thanks for your detailed report!

For your question around how many requests you can send us: we generally advise limiting your usage to 50 requests per second; above that you should see HTTP 429 errors. The library will then use exponential backoff to retry those requests (this behaviour can be configured via the maxRetries and minTimeout values in TranslatorOptions when you construct the Translator object). One possibility I see: maybe you send too many requests, exponential backoff kicks in, and the low per-attempt timeout of the backoff (default is 10 s) causes the timeouts you see?
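A minimal sketch of such a configuration (the values are illustrative, not recommendations):

import * as deepl from 'deepl-node';

const translator = new deepl.Translator(process.env.DEEPL_AUTH_KEY ?? '', {
    maxRetries: 5,      // number of times a failed request is retried per call
    minTimeout: 30000,  // connection timeout in milliseconds for each attempt (default 10000)
});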

I tested the API last week and was able to sustain 20 QPS without seeing any errors, so to better be able to reproduce your issue I have a few questions (quoted in full in your reply below). Regarding the concurrency question: do you do something like the following?

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// translator constructed as above
const textsToTranslate: string[][] = [['First sentence.', 'Second sentence.', 'Third sentence.'], ['Erster Satz.', 'Zweiter Satz.'], ...]; // 15 entries
const startTime = Date.now();
let translations = await Promise.all(textsToTranslate.map((texts) => translator.translateText(texts, null, 'en-US')));
// Do something with translations
const timeDiff = Date.now() - startTime;
if (timeDiff < 1000) {
  await sleep(1000 - timeDiff); // limit to 15 QPS: wait out the remainder of the second
}
translations = await Promise.all(textsToTranslate.map((texts) => translator.translateText(texts, null, 'en-US'))); // probably with different texts :)

For your questions:

razvan-zavalichi commented 1 year ago

Hello @JanEbbing!

Here are my responses:

Q1: Just to confirm, the timestamps you sent are for a single translateText call, right? Not 15 concurrent ones?

Q2: How exactly do you perform the concurrency? Something like the following?

Q3: By "The current timeout configuration is set to 30000ms.", do you mean the minTimeout value in the TranslatorOptions you use to construct your Translator object?

Q4: As it may have performance implications, roughly what language pair distribution do you have in your requests? Is it all the same language pair, or automatic source language detection into the same target language (with the source language being anything provided by users)?

Q5: You mentioned putting everything into a single request solves the issue. So basically when you split this request into 15 smaller ones to adhere to the texts limit of 50, you observe the timeouts?

In my situation, I encountered two significant issues:

More context: If there are 'n' messages in the service bus queues, a maximum of 15 concurrent calls will be made to the 'translateText' function until the queue becomes empty. For instance, if you have 100 messages in the queue:

This behavior means that if the DeepL API consistently delivers translations in exactly 1 second for all 15 concurrent calls, the maximum achievable rate is 15 requests per second. However, if the API processes translations quickly, for instance completing all 15 concurrent translations within just 100 ms, the rate can reach 10 × 15 = 150 requests per second.

JanEbbing commented 1 year ago

Hi, with deepl-node version 1.10.2 and running the following code:

a) 413 Payload Too Large

import * as deepl from 'deepl-node';

const authKey = process.env['DEEPL_AUTH_KEY'];
const serverUrl = process.env['DEEPL_SERVER_URL'];
const translator = new deepl.Translator(authKey, { serverUrl: serverUrl });

const toBeTranslated = "x".repeat(131072);
console.log(Buffer.byteLength(toBeTranslated));
const results = await translator.translateText([toBeTranslated], null, 'fr');
console.log(results[0].text); // translateText returns an array of TextResults for array input

With different string lengths in the "x".repeat(n) I get the following outcomes:

So I can't reproduce that 85 KiB is too large of a request.

b) Timeouts

To simplify, my concurrency logic is a bit different (start 50 requests, wait until they are all complete, then start 50 new ones), but it should still trigger the timeouts.

import * as deepl from 'deepl-node';

const authKey = process.env['DEEPL_AUTH_KEY'];
const serverUrl = process.env['DEEPL_SERVER_URL'];
const translator = new deepl.Translator(authKey, { serverUrl: serverUrl, minTimeout: 30000 });
const germanPhrase = "Das ist mein Beispielsatz. Ich untersuche, ob ich mit der API Fehler reproduzieren kann, aber vielleicht funktioniert alles.";
const REQUESTS_PER_ITERATION = 50;
let numRequests = 0;
const results = [];
const startTime = Date.now();
while (numRequests < 1000) {
      const requestsToMake = [...Array(REQUESTS_PER_ITERATION).keys()].map(() => translator.translateText(Array(50).fill(germanPhrase), 'de', 'it'));
      results.push(await Promise.all(requestsToMake));
      numRequests += REQUESTS_PER_ITERATION;
      console.log("Finished making " + REQUESTS_PER_ITERATION + " requests");
}
console.log("Made " + numRequests + " requests in " + (Date.now() - startTime) + " ms");

For me, it finished in ~63 s, which comes out to about 15 QPS, without any errors. As you rightly indicated, your method may overload our API if a significant portion of requests finishes fast (on the order of 100-300 ms). I'd recommend integrating some rate limiting into your code to ensure you don't go above 50 QPS.
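For reference, a hand-rolled sketch of such rate limiting (this helper is not part of deepl-node; each task is a thunk such as () => translator.translateText(texts, 'de', 'it'), so nothing starts before its window):

async function rateLimited<T>(tasks: (() => Promise<T>)[], perSecond: number): Promise<T[]> {
    const results: T[] = [];
    for (let i = 0; i < tasks.length; i += perSecond) {
        const windowStart = Date.now();
        // Start at most perSecond tasks, then wait for all of them.
        const batch = tasks.slice(i, i + perSecond).map((task) => task());
        results.push(...(await Promise.all(batch)));
        // If the window finished early, wait out the rest of the second.
        const elapsed = Date.now() - windowStart;
        if (elapsed < 1000 && i + perSecond < tasks.length) {
            await new Promise((resolve) => setTimeout(resolve, 1000 - elapsed));
        }
    }
    return results;
}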

Please let me know if anything in my reproduction attempt is off.

razvan-zavalichi commented 1 year ago

Is it possible that this error is caused by the text format? This is the format of the translation text:

"<table cellspacing=0 cellpadding=0 class..."

I use "deepl-node": "^1.7.2"

JanEbbing commented 1 year ago

Thanks, that helps. For the timeouts, have you tried simply increasing the timeout limit? Even with a simple table with a few rows and columns, a large request like this takes >10 s for me, so if you have a complex table or get unlucky with API response time, you can easily exceed the 30 s window.

I can also reproduce a lower size limit with tag handling; I'm following up internally with another team.
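Since the payloads here are HTML tables, tag handling is the relevant mode; a minimal sketch of such a call (the table string is a stand-in):

import * as deepl from 'deepl-node';

const translator = new deepl.Translator(process.env.DEEPL_AUTH_KEY ?? '', { minTimeout: 30000 });
const html = '<table><tr><td>Guten Tag</td></tr></table>';
// tagHandling: 'html' tells the API to treat the input as HTML markup.
const [result] = await translator.translateText([html], null, 'fr', { tagHandling: 'html' });
console.log(result.text);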

razvan-zavalichi commented 1 year ago

Everything now works on my side when following these rules:

  • minTimeout is set to 30000 (this helps when there are 15 concurrent calls).
  • The translateText function is given at most 50 texts per request, per the documentation (even though it occasionally worked with hundreds of texts).
  • The payload is kept below 76 KiB, although it seems that texts greater than 100 KiB work on your side.

Anyway, we don't have any text larger than 70 KiB at the moment, so this is not a blocker for us. The earlier text was taken from a development environment.

razvan-zavalichi commented 1 year ago

Hello @JanEbbing, I've started translating 100k texts, sending no more than 50 texts in one request and no more than 10 requests in parallel (at most 26 QPS).

I occasionally encountered the following error:

Service unavailable, message: Temporary Error
Stack: Error: Service unavailable, message: Temporary Error
    at checkStatusCode (/home/site/wwwroot/node_modules/deepl-node/dist/index.js:261:23)
    at Translator.translateText (/home/site/wwwroot/node_modules/deepl-node/dist/index.js:356:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) 

This error caused 56 of the 2000 requests to fail. I can increase the number of retries, but if a request times out and I retry it, the DeepL API charges for these retries, making the process more expensive.
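One way to keep retries cheap is to re-send only the batches that actually failed, so successful ones are never repeated; a sketch using Promise.allSettled (translateAll and the target language are illustrative, not part of the library):

import * as deepl from 'deepl-node';

async function translateAll(translator: deepl.Translator, batches: string[][], maxPasses = 3): Promise<deepl.TextResult[][]> {
    const results: deepl.TextResult[][] = new Array(batches.length);
    let pending = batches.map((texts, index) => ({ texts, index }));
    for (let pass = 0; pass < maxPasses && pending.length > 0; pass++) {
        const settled = await Promise.allSettled(
            pending.map(({ texts }) => translator.translateText(texts, null, 'en-US')),
        );
        const stillFailing: typeof pending = [];
        settled.forEach((outcome, i) => {
            if (outcome.status === 'fulfilled') {
                results[pending[i].index] = outcome.value;
            } else {
                stillFailing.push(pending[i]); // only failed batches are re-sent
            }
        });
        pending = stillFailing;
    }
    if (pending.length > 0) {
        throw new Error(pending.length + ' batches still failing after ' + maxPasses + ' passes');
    }
    return results;
}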

Are there other constraints regarding this?