Godnoken opened 1 year ago
This is currently not implemented, but does partially exist in a branch: https://github.com/XapaJIaMnu/translateLocally/tree/nativemsg-configure-cmd. Only cache size and threads can be changed there, but adding fields for the number of cache mutexes is an option. Changing the amount of memory marian is allowed to use for its workspace could be added in a similar way as the number of threads is already implemented.
See the message format here: https://github.com/XapaJIaMnu/translateLocally/blob/cf92a04e414dde9d5ffc8e65129a7d6e1dbb04eb/src/cli/NativeMsgIface.h#L239-L264
And an example of how this is used in the web extension: https://github.com/jelmervdl/translatelocally-web-ext/blob/main/src/background/TLTranslationHelper.js#L95-L98
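For reference, a request over the native messaging channel is a length-prefixed JSON message (Chrome's native messaging framing: a 4-byte message length in native byte order, followed by the UTF-8 JSON payload). Below is a minimal Node.js sketch of what a configure request might look like when talking to the binary directly over stdin/stdout. The command name `Configure` and the field names `cacheSize` and `workers` are assumptions for illustration only; check `NativeMsgIface.h` in the branch for the actual names.

```javascript
// Sketch: framing a hypothetical "Configure" request using Chrome's
// native messaging wire format (4-byte length prefix + UTF-8 JSON).
// Command and field names below are ASSUMPTIONS, not the branch's real API.

function encodeNativeMessage(message) {
  const json = Buffer.from(JSON.stringify(message), "utf-8");
  const frame = Buffer.alloc(4 + json.length);
  frame.writeUInt32LE(json.length, 0); // native byte order; LE on x86/ARM
  json.copy(frame, 4);                 // payload follows the length prefix
  return frame;
}

const configure = {
  command: "Configure", // assumed command name
  id: 1,                // request id, echoed back in the response
  data: {
    cacheSize: 20000,   // assumed field: translation cache entries
    workers: 4,         // assumed field: number of worker threads
  },
};

const frame = encodeNativeMessage(configure);
```

Note that from a web extension you would not frame messages yourself; `runtime.connectNative` handles the framing and you just post plain objects, as the `TLTranslationHelper.js` example above does.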
If I may ask, what's your use case? I've so far only used it for benchmarking. My idea has been that ideally TranslateLocally would be able to figure out the optimum values by itself.
> This is currently not implemented, but does partially exist in a branch: https://github.com/XapaJIaMnu/translateLocally/tree/nativemsg-configure-cmd. Only cache size and threads can be changed there, but adding fields for the number of cache mutexes is an option. Changing the amount of memory marian is allowed to use for its workspace could be added in a similar way as the number of threads is already implemented.
Brilliant, I'll give it a go. Is this something you would want to merge into main?
> If I may ask, what's your use case? I've so far only used it for benchmarking. My idea has been that ideally TranslateLocally would be able to figure out the optimum values by itself.
I'm building an application that uses translation, and I would like to offer more advanced options to the end user. Automatic configuration would indeed be ideal, if that's possible. Are there known optimal values for maximum translation speed that are easily figured out based on CPU speed, threads, memory etc?
> Brilliant, I'll give it a go. Is this something you would want to merge into main?
I suppose it should. I do want to make it a bit more robust before that, i.e. make sure it's okay with TranslateLocally adding more options and doesn't break when you don't provide these newer options. And make sure that if you change settings it handles the queue correctly and doesn't lose translation requests because of the restart. (Not sure if it does now, but that couldn't happen in the use case I had for it.)
> Are there known optimal values for maximum translation speed that are easily figured out based on CPU speed, threads, memory etc?
I thought so, but maybe not. It also depends on the application. For example, if you have more memory you can increase the batch size, which will increase throughput. But it will also increase latency a lot. So that's good for batch translation, less so for interactive use. Similarly, more threads can decrease latency a bit, but if, like TranslateLocally, you're mostly translating single sentences (because the rest is already in cache), they don't have much benefit.
> I suppose it should. I do want to make it a bit more robust before that, i.e. be okay with TranslateLocally adding more options without breaking when you don't provide these newer options. And make sure that if you change settings it handles the queue correctly and doesn't lose translation requests because of the restart.
I see. I haven't had time to try the branch yet, but I'm still interested in using and developing that part. However, I likely won't have time to do anything for quite a while.
Another question though. I assume most translation config params from https://github.com/browsermt/marian-dev/blob/69e27d298419a2ff0e24ea7c43cad997fa8230c0/src/common/config_parser.cpp would be implemented roughly the same way?
> I thought so, but maybe not. It depends also on the application. For example, if you have more memory you can increase the batch size, which will increase throughput. But it will also increase latency a lot. So good for batch translation, less so for interactive ones. Similarly, more threads can decrease latency again for a bit, but if you're just translating single sentences (because the rest is already in cache) like TranslateLocally it doesn't have much benefit.
Interesting. Thank you!
Hello again,
Is it possible to change settings like 'Threads', 'Memory per thread' and 'Translation cache' through NativeMessaging, or alternatively when starting the process?
If not, I'd like to make a feature request for it. I'm looking into it myself at the moment, but I have very little proficiency in C++, so it may take a while.