XapaJIaMnu / translateLocally

Fast and secure translation on your local machine, powered by marian and Bergamot.
MIT License

Changing settings from NativeMessaging #126

Open Godnoken opened 1 year ago

Godnoken commented 1 year ago

Hello again,

Is it possible to change settings like 'Threads', 'Memory per thread' and 'Translation cache' through NativeMessaging, or alternatively when initiating the process?

If not, I'd like to make a feature request for it. I'm looking into it myself at the moment but I have very little proficiency in C++ so it may take a while.

jelmervdl commented 1 year ago

This is currently not implemented, but does partially exist in a branch: https://github.com/XapaJIaMnu/translateLocally/tree/nativemsg-configure-cmd. Only cache size and threads can be changed there, but adding fields for the number of cache mutexes is an option. Changing the amount of memory marian is allowed to use for its workspace could be added in a similar way as the number of threads is already implemented.

See the message format here: https://github.com/XapaJIaMnu/translateLocally/blob/cf92a04e414dde9d5ffc8e65129a7d6e1dbb04eb/src/cli/NativeMsgIface.h#L239-L264

And an example of how this is used in the web extension: https://github.com/jelmervdl/translatelocally-web-ext/blob/main/src/background/TLTranslationHelper.js#L95-L98
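For illustration, here is a sketch of how such a settings message could be framed over stdio. The 4-byte native-endian length prefix followed by UTF-8 JSON is the standard browser native-messaging wire format; the `"Configure"` command name and the option field names are hypothetical placeholders, not the actual schema from the linked branch.

```python
import json
import struct

def encode_native_message(payload: dict) -> bytes:
    """Frame a JSON payload for the native messaging protocol:
    a 32-bit native-endian length prefix followed by UTF-8 JSON."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack("=I", len(body)) + body

def decode_native_message(data: bytes) -> dict:
    """Read the length prefix and parse the JSON body that follows it."""
    (length,) = struct.unpack_from("=I", data, 0)
    return json.loads(data[4 : 4 + length].decode("utf-8"))

# Hypothetical configure message; the command and option names are
# illustrative only, not the real schema from nativemsg-configure-cmd.
msg = {
    "command": "Configure",
    "id": 1,
    "data": {"options": {"workers": 4, "cacheSize": 20000}},
}
frame = encode_native_message(msg)
assert decode_native_message(frame) == msg
```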

If I may ask, what's your use case? I've so far only used it for benchmarking. My idea has been that ideally TranslateLocally would be able to figure out the optimum values by itself.

Godnoken commented 1 year ago

> This is currently not implemented, but does partially exist in a branch: https://github.com/XapaJIaMnu/translateLocally/tree/nativemsg-configure-cmd. Only cache size and threads can be changed there, but adding fields for the number of cache mutexes is an option. Changing the amount of memory marian is allowed to use for its workspace could be added in a similar way as the number of threads is already implemented.

Brilliant, I'll give it a go. Is this something you would want to merge into main?

> If I may ask, what's your use case? I've so far only used it for benchmarking. My idea has been that ideally TranslateLocally would be able to figure out the optimum values by itself.

I'm building an application that utilizes translation, where I would like to offer more advanced options for the end user. Automatic configuration would be ideal indeed, if that's possible. Are there known optimal values for maximum translation speed that can easily be derived from CPU speed, threads, memory, etc.?

jelmervdl commented 1 year ago

> Brilliant, I'll give it a go. Is this something you would want to merge into main?

I suppose it should. I do want to make it a bit more robust before that, i.e. be okay with TranslateLocally adding more options without breaking when you don't provide these newer options. And make sure that if you change settings it handles the queue correctly and doesn't lose translation requests because of the restart. (Not sure if it does now, but that wasn't something that could happen in the use case I had for it.)
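The restart concern above can be sketched abstractly. This is not translateLocally's actual code, just an assumed shape: pending requests live in a queue that outlives the engine, so a settings change drains and re-submits them instead of dropping them.

```python
import queue

class RestartSafeWorker:
    """Sketch (assumed behaviour, not translateLocally's implementation):
    keep requests in a queue that survives an engine restart, so changing
    settings re-submits pending work instead of losing it."""

    def __init__(self):
        self.pending = queue.Queue()

    def submit(self, request):
        self.pending.put(request)

    def reconfigure(self, translate):
        # Pretend the engine was restarted with new settings, then drain
        # the queue with the new engine so no request is lost.
        results = []
        while not self.pending.empty():
            results.append(translate(self.pending.get()))
        return results

w = RestartSafeWorker()
w.submit("hello")
w.submit("world")
print(w.reconfigure(str.upper))  # → ['HELLO', 'WORLD']
```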

> Are there known optimal values for maximum translation speed that are easily figured out based on CPU speed, threads, memory etc?

I thought so, but maybe not. It depends also on the application. For example, if you have more memory you can increase the batch size, which will increase throughput. But it will also increase latency a lot. So good for batch translation, less so for interactive ones. Similarly, more threads can decrease latency again for a bit, but if you're just translating single sentences (because the rest is already in cache) like TranslateLocally it doesn't have much benefit.
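The batch-size tradeoff can be illustrated with a toy model (the numbers are invented for illustration, not marian benchmarks): with a fixed per-batch overhead, a larger batch raises throughput, but the first sentence only comes back once the whole batch finishes, so latency grows too.

```python
def batch_stats(batch_size, per_sentence_ms=10.0, per_batch_overhead_ms=50.0):
    """Toy cost model: fixed per-batch overhead plus a per-sentence cost.
    Returns (throughput in sentences/sec, latency in ms for the first result)."""
    total_ms = per_batch_overhead_ms + batch_size * per_sentence_ms
    throughput = batch_size / (total_ms / 1000.0)
    latency_ms = total_ms  # first result arrives only when the batch is done
    return throughput, latency_ms

for b in (1, 8, 64):
    tput, lat = batch_stats(b)
    print(f"batch={b:3d}  throughput={tput:6.1f} sent/s  latency={lat:6.0f} ms")
```

Under this model, batch size 64 translates several times more sentences per second than batch size 1, but the caller waits over ten times longer for the first result, which is why large batches suit offline translation and small ones suit interactive use.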

Godnoken commented 1 year ago

> I suppose it should. I do want to make it a bit more robust before that, i.e. be okay with TranslateLocally adding more options without breaking when you don't provide these newer options. And make sure that if you change settings it handles the queue correctly and doesn't lose translation requests because of the restart. (Not sure if it does now, but that wasn't something that could happen in the use case I had for it.)

I see. I haven't had the time to try the branch yet, but I am still interested in using & developing that part; however, I likely won't have time to do anything for quite some time.

Another question though. I assume most translation config params from https://github.com/browsermt/marian-dev/blob/69e27d298419a2ff0e24ea7c43cad997fa8230c0/src/common/config_parser.cpp would be implemented roughly the same way?

> I thought so, but maybe not. It depends also on the application. For example, if you have more memory you can increase the batch size, which will increase throughput. But it will also increase latency a lot. So good for batch translation, less so for interactive ones. Similarly, more threads can decrease latency again for a bit, but if you're just translating single sentences (because the rest is already in cache) like TranslateLocally it doesn't have much benefit.

Interesting. Thank you!