browsermt / bergamot-translator

Cross platform C++ library focusing on optimized machine translation on the consumer-grade device.
http://browser.mt
Mozilla Public License 2.0
341 stars 38 forks source link

Simplify cache config and bind for use in JS #359

Closed jerinphilip closed 2 years ago

jerinphilip commented 2 years ago

Deprecates cacheEnabled parameter to be replaced with cacheSize=0. Python bindings, Documentation in comments and tests updated to reflect this change.

Exposes the fields corresponding to cache via embind as a value object. The equivalent object-based syntax in worker.js allows propagation from JS.

Fixes: #351 See also: mozilla/firefox-translations#96


The below screenshots are to demonstrate that cache is active and provide speedups in warm (exaggerated, but sufficient to demonstrate these changes work).

Cold 117 WPS

image

Warm 5284 WPS

image
abhi-agg commented 2 years ago

@jerinphilip I assume cacheEnabled should be set to false if one doesn't want to enable cache. However, what should be provided for cacheSize in this case (because now the API expects both these fields to be populated)?

We can do everything with just cacheSize (and eliminate cacheEnabled) where a 0 value will mean no cache and any +ve value enables cache with given size.

jerinphilip commented 2 years ago

However, what should be provided for cacheSize in this case (because now the API expects both these fields to be populated)?

It doesn't matter whatever's provided, the cache storage is not constructed. The notion is to skip caching code altogether. The semantics map to an std::optional<Cache>, with the bool flag deciding the optional and size determining the size of the container.

Edit: Ignore the accidental button click close.

jerinphilip commented 2 years ago

We can do everything with just cacheSize (and eliminate cacheEnabled) where a 0 value will mean no cache

I'm on the fence regarding this suggestion, this appears tempting to do from an API perspective. But there are also aspects that make this less than ideal. Wondering how natural it is for someone to know size=0 implies no-cache - otherwise, we'll need to rely on documenting this behaviour? If it's not obvious - I'd like to go with the additional bool parameter. Are there other reference implementations on the web which do this? L4 appears to have an assert > 0 somewhere.

kpu commented 2 years ago

Seems natural that cache size 0 means no cache. Really all you are doing is optimizing that special case.