browsermt / marian-dev

Fast Neural Machine Translation in C++ - development repository
https://marian-nmt.github.io

compile to wasm #5

Open kpu opened 4 years ago

kpu commented 4 years ago

@lhk writes in https://github.com/marian-nmt/marian/issues/343 and I've transferred it here.

Feature description

I would like to embed machine translation in my webapp. There is TensorFlow.js, but so far I've been unable to find suitable pretrained translation models for it.

Opus-MT hosts a large repository of pretrained models for many language pairs. It uses marian for the neural machine translation.

The pre- and postprocessing are cheap; I would be able to host the tokenizer on a server. But marian-decoder is too costly to host myself. It would be great if it were possible to compile the code to WebAssembly and run it client-side.

I have written small projects in C/C++ and in principle would be happy to dig deeper. But guidance from someone with more experience would be really helpful.

Is this feasible at all?

kpu commented 4 years ago

Marian is fast enough to run natively on consumer desktops.

WebAssembly has several limitations that make it slower than native: https://docs.google.com/document/d/1pTl4clEaMHj5n4P0oc5zBckHsK1et4zSARVBPNsQ__M/edit

  1. No 8-bit dot product instruction https://github.com/WebAssembly/simd/issues/328
  2. Only 128-bit SIMD, not 256 or 512-bit yet
  3. WebAssembly doesn't expose how many registers are available, which makes matrix multiply slow. Matrix multiply is constrained by memory bandwidth, so routines use as many registers as possible to make tiles as large as possible.
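To illustrate point 3, here is a minimal sketch of a register-tiled inner kernel (the function name and the 4x4 tile size are illustrative, not Marian's actual kernels). Keeping a 4x4 accumulator tile in registers lets each loaded value of A and B be reused four times before touching memory again; with fewer usable registers the tile must shrink and memory traffic grows:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C = A * B for square n x n row-major matrices, with a 4x4 output tile.
// A native kernel would keep `acc` entirely in vector registers.
void matmul_tiled_4x4(const std::vector<float>& A, const std::vector<float>& B,
                      std::vector<float>& C, std::size_t n) {
  assert(n % 4 == 0);  // keep the sketch simple: n divisible by the tile size
  for (std::size_t i = 0; i < n; i += 4) {
    for (std::size_t j = 0; j < n; j += 4) {
      float acc[4][4] = {};  // the accumulator tile
      for (std::size_t k = 0; k < n; ++k) {
        float a[4], b[4];
        for (int t = 0; t < 4; ++t) a[t] = A[(i + t) * n + k];
        for (int t = 0; t < 4; ++t) b[t] = B[k * n + (j + t)];
        for (int r = 0; r < 4; ++r)
          for (int c = 0; c < 4; ++c)
            acc[r][c] += a[r] * b[c];  // 16 multiply-adds per 8 loads
      }
      for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
          C[(i + r) * n + (j + c)] = acc[r][c];
    }
  }
}
```

A larger tile (say 8x8 with wide SIMD registers) raises the ratio of arithmetic to loads further, which is exactly what an engine that hides the register file makes hard to do reliably.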

We don't know the exact performance impact on Marian yet, but it is Mozilla's current job in the project to compile Marian to WebAssembly and find out; @mlopatka and @abhi-agg are the Mozilla people currently working towards this.

There's also a proposal to support machine learning better in the browser https://webmachinelearning.github.io/webnn/#api-neuralnetworkcontext-gemm though it's nascent.
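For reference, the gemm operation that proposal describes is the familiar BLAS-style contract, roughly out = alpha * A * B + beta * C. A plain C++ rendering of that contract (a simplified sketch: transpose options and broadcasting omitted, square shapes assumed) looks like this; the point of the proposal is that the browser, not WASM code, would run the optimized version:

```cpp
#include <cstddef>
#include <vector>

// Sketch of gemm semantics: out = alpha * A * B + beta * C,
// for square n x n row-major matrices.
std::vector<float> gemm(const std::vector<float>& A, const std::vector<float>& B,
                        const std::vector<float>& C, std::size_t n,
                        float alpha = 1.0f, float beta = 0.0f) {
  std::vector<float> out(n * n, 0.0f);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j) {
      float sum = 0.0f;
      for (std::size_t k = 0; k < n; ++k)
        sum += A[i * n + k] * B[k * n + j];
      out[i * n + j] = alpha * sum + beta * C[i * n + j];
    }
  return out;
}
```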

lhk commented 4 years ago

@kpu that sounds awesome! Thanks for the quick response :)

I've taken a look at the browser.mt website and found this blogpost: https://browser.mt/blog/w3c-presentation-post "Client side Firefox translation demo"

That sounds as if there is already a working configuration. Is it possible to play around with that? I couldn't find the actual demo, only the slideset.

kpu commented 4 years ago

The demo is based on a local Marian server running in the background (from this repo) and a Firefox fork, https://github.com/browsermt/firefox, which communicate over a REST API on TCP. The fork is outdated and probably a security risk at this point. This is all meant to be replaced by either native messaging inside an extension or a pure web extension running on WebAssembly / WebGL / WebNN, possibly with rapid implementation of proposed standards to make performance tolerable. It is Mozilla's job to explore this space.

So the full answer to your question is: you are welcome to play around with the code, but don't expect much support or documentation at this time. You can also get a fast model from http://statmt.org/bergamot/models/ .

The current project exposes translation to the user and has a native API (think Marian server plus quality estimation and word alignment). We hadn't even thought of the use case of exposing a JavaScript API for web pages to access translation themselves until you stopped by.

mlopatka commented 4 years ago

As Kenneth mentions, our current goals include exploring the performance characteristics of alternatives to the client-server architecture described in @kpu's comment. At the current time, we are resourcing the development efforts to implement and assess both a WASM module and a native messaging solution, and @abhi-agg will be working with a (tbh) developer on those tasks. This thread is the right place to track progress.