Closed graemenail closed 1 year ago
emscripten was failing because I hadn't updated the bindings for WASM. I've fixed that, but only to restore support for single models. I can look at supporting ensembles here too if needed.
The bindings only take in a single alignedmemory, so currently it just forwards that.
Once I'm passing multiple models via the bindings, it's basically enabled for WASM too. I'll have a look.
I've checked the code is working after the latest changes. I used the regression test apps to check functionality:
blocking \
--log-level trace \
--bergamot-mode test-forward-backward \
--model-config-paths \
ensemble_enes.yml \
ensemble_esen.yml
Examples of the new logging messages introduced here:
When loading any npz file:
[2023-07-30 16:58:12] Encountered an npz file model.esen.npz; will use file loading for 4 models
When loading from memory:
[2023-07-30 16:58:12] Loaded model 1 of 2 from memory
[2023-07-30 16:58:12] Loaded model 2 of 2 from memory
When loading from file:
[2023-07-30 16:58:13] Loaded 4 model(s) from file
Just to confirm, you see the increased runtime and actual different translations/scores?
Yes, there are different outputs and durations.
Model-Single: If you use multiple models, it should be obvious that the decoding time is increasing.
real 0m4.500s
user 0m2.982s
sys 0m1.366s
Model-Ensemble: If you use several models, it should be obvious that the decoding time increases.
real 0m15.184s
user 0m9.733s
sys 0m5.246s
(output is the resulting translation to English from a single sentence provided in the source language)
Model-Single is one of 4 teachers from the Model-Ensemble. Timing of the ensemble is roughly 3-4x the single (about 3.4x real, 3.8x sys). Enabling logging shows that the models are loaded as scorers.
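For reference, the slowdown can be checked directly from the two `time` outputs above. This is only a small arithmetic sketch; the dictionary names are illustrative and the numbers are copied from the timings in this thread.

```python
# Timings (seconds) copied from the `time` output above.
single = {"real": 4.500, "user": 2.982, "sys": 1.366}
ensemble = {"real": 15.184, "user": 9.733, "sys": 5.246}

# Ratio of ensemble time to single-model time for each measurement.
ratios = {key: ensemble[key] / single[key] for key in single}

for key, ratio in ratios.items():
    print(f"{key}: {ratio:.2f}x")
# real: 3.37x
# user: 3.26x
# sys: 3.84x
```

Note that these timings cover the whole run, including model loading, so they are not a pure measure of decoding cost.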
Adds the ability to use ensembles of models. This supports ensembles of binary- or npz-format models, as well as mixtures of both.
When all models in the ensemble are in binary format, the load-from-memory path is used. Otherwise, they are loaded via the file system. Enable log-level debug for output related to this.
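The selection rule described above can be sketched roughly as follows. This is an illustration in Python, not the PR's actual C++ code: the function name `choose_loading_path` is hypothetical, and it assumes the model format can be inferred from the file extension (`.bin` for binary, `.npz` for npz), which may not match how the real implementation detects it.

```python
from pathlib import Path
from typing import List


def choose_loading_path(model_paths: List[str]) -> str:
    """Return "memory" if every model is binary, otherwise "file".

    Hypothetical helper: assumes the format is given by the extension.
    """
    if not model_paths:
        raise ValueError("at least one model path is required")

    # A single non-binary model (e.g. npz) forces file loading for all models.
    all_binary = all(Path(p).suffix == ".bin" for p in model_paths)
    return "memory" if all_binary else "file"


# An all-binary ensemble takes the load-from-memory path.
print(choose_loading_path(["model1.bin", "model2.bin"]))
# A mixed ensemble falls back to the file system for every model.
print(choose_loading_path(["model1.bin", "model.esen.npz"]))
```

The all-or-nothing choice mirrors the log message shown earlier, where encountering one npz file switches file loading on for all models in the ensemble.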