[Feature-Request] Set a parallel translation limit for Custom Translator

saturnsky commented 3 months ago

Description

Some local translators use local computing power, which means they can't translate an unlimited amount of text simultaneously. For example:

- CPU-based local translators may experience performance issues if trying to translate beyond the CPU's thread limit.
- Translators using local LLMs often struggle to handle a large number of concurrent requests.

Considering these limitations, it's necessary to restrict the number of translation requests processed at any given moment.

Proposed Solution

Implement an option to limit concurrent translation requests. For example:

If the request limit is set to 8, the system would:
1. Initially send translation requests for 8 sentences.
2. As each translation completes, send an additional request.

This approach would maintain a constant number of active translation requests without overloading the system.
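The two steps above could be sketched as a small worker pool. This is only an illustration: `mapWithLimit` is a hypothetical helper, not an existing Linguist API, and the `task` callback stands in for whatever function actually translates one sentence.

```typescript
// Hypothetical helper, not part of Linguist: runs `task` over `items`
// with at most `limit` calls in flight at any moment.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next item that has not been started yet

  // Each worker starts the next pending item as soon as its previous one finishes,
  // so the number of active requests never exceeds the number of workers.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index], index);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as `items`
}
```

With `limit` set to 8, the first 8 sentences are sent immediately and each completed translation frees a slot for the next one. Note that in this sketch a single failed task rejects the whole call, so per-sentence error handling would have to live inside `task`.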

Benefits

This option would be beneficial for many types of Custom Translators, especially those with limited computing resources or those using local models.

vitonsky commented 3 months ago

@saturnsky what about the `getRequestsTimeout` method? It lets you control the time between requests.

Also, you can control how requests are handled in your custom module's own method implementations. Just collect the requests into a queue and process that queue as slowly as you wish.

One more note: you should implement the `translateBatch` method as efficiently as possible to avoid problems. If you just call translation for every text in the array, you may end up with 3-9k requests in the queue when you translate an average web page.
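One way to avoid a separate request per text is to group the array into a few larger chunks first. The sketch below only shows the grouping step. `chunkTexts` and its `maxLength` parameter are hypothetical names, and how each chunk is then sent to the translation backend depends on the translator.

```typescript
// Hypothetical helper: groups texts so that the combined length of each
// group stays within `maxLength` characters.
function chunkTexts(texts: readonly string[], maxLength: number): string[][] {
  const chunks: string[][] = [];
  let current: string[] = [];
  let currentLength = 0;

  for (const text of texts) {
    // Close the current chunk when adding this text would exceed the limit.
    if (current.length > 0 && currentLength + text.length > maxLength) {
      chunks.push(current);
      current = [];
      currentLength = 0;
    }
    current.push(text);
    currentLength += text.length;
  }

  if (current.length > 0) chunks.push(current);
  return chunks; // a text longer than `maxLength` ends up alone in its own chunk
}
```

Each chunk then becomes one request instead of one request per text, which keeps the queue short even for a page with thousands of text nodes.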

Have you optimized your code in these respects?

saturnsky commented 3 months ago

Thank you for your detailed response. I'd like to elaborate on my thoughts regarding the proposed feature and address the points you've raised.

Regarding `getRequestsTimeout`

While getRequestsTimeout is indeed helpful for managing rate limits with online translators, it may not be as effective for offline translators. The unpredictable nature of translation time for offline translators (e.g., 100ms for one sentence, 1 second for another) makes it challenging to set a fixed timeout. For offline translators, a more appropriate approach would be to send the next request as soon as a translation result is received, rather than managing request frequency.

On `translateBatch`

I agree that translateBatch can be beneficial for online translators, where combining multiple sentences into a single chunk for server transmission and then splitting the results is efficient. However, for many offline translators, this approach may not be ideal:

- Offline translators often have computational costs proportional to sentence length.
- translateBatch in this context might only increase latency until results appear on screen, without improving overall throughput.
- Error handling becomes more complex and costly if issues occur during the translation of batched sentences.

Given these considerations, I think the most appropriate approach is:

- Adding an option to disable translateBatch for Custom Translators.
- Implementing a parallel request limit using methods like Semaphore.

This approach would likely yield the best results for offline translators.
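For reference, a semaphore-based limit could look roughly like the sketch below. Nothing here is an existing Linguist feature: `Semaphore`, `TranslateOne`, and `makeLimitedBatch` are hypothetical names, and `translateOne` is a placeholder for the call into the local translation engine.

```typescript
// Minimal counting semaphore (hypothetical, not provided by Linguist).
class Semaphore {
  private waiters: Array<() => void> = [];
  private available: number;

  constructor(permits: number) {
    this.available = permits;
  }

  async acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return;
    }
    // No free permit: wait until release() hands one over.
    await new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  release(): void {
    const nextWaiter = this.waiters.shift();
    if (nextWaiter) {
      nextWaiter(); // pass the permit directly to the next waiter
    } else {
      this.available++;
    }
  }
}

// Placeholder signature for a single-sentence call into the local engine.
type TranslateOne = (text: string, from: string, to: string) => Promise<string>;

// Wraps `translateOne` so that at most `limit` translations run at once,
// even across several overlapping batch calls.
function makeLimitedBatch(translateOne: TranslateOne, limit: number) {
  const semaphore = new Semaphore(limit);
  return (texts: string[], from: string, to: string): Promise<string[]> =>
    Promise.all(
      texts.map(async (text) => {
        await semaphore.acquire();
        try {
          return await translateOne(text, from, to);
        } finally {
          semaphore.release(); // always free the slot, even on failure
        }
      }),
    );
}
```

Because the semaphore is created once per wrapper, the limit applies to all sentences in flight rather than to each batch separately, and the next sentence starts as soon as any running translation finishes.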

Of course, individual users can limit parallel requests directly in their Custom Translators using a semaphore or a similar mechanism, so this is a lower priority than disabling translateBatch. However, since this scenario is common with offline translators, I thought it would be more convenient to provide it as an option in Linguist. There is also already an issue about Custom Translators not using translateBatch (issue #236), which is why I only opened an issue about the parallel request limit.

translate-tools / linguist