vitonsky opened 3 years ago
New candidate https://github.com/browsermt/bergamot-translator
Research https://github.com/OpenNMT/CTranslate2/issues/528
Hi, thank you for your project!
I think your expectations regarding the size and speed of translation engines might be a bit unrealistic. I use m2m_100 to translate subtitles with EasyNMT on a fast Ryzen box; it takes up to 24 GB of RAM and only outputs about a sentence per second. The speed you'd need to make this plugin usable would probably require GPU acceleration, and I don't think you can download 5 GB models in a browser plugin and use 20+ GB of RAM.
There are alternatives that use less RAM and CPU, but at the expense of accuracy and language support. There's a good table under the heading "Available Models" here: https://github.com/UKPLab/EasyNMT
However, EasyNMT does provide a REST API that you can host on a fast machine or in the cloud with Docker; it's described under the heading "Docker & REST-API" on that EasyNMT page. It would be awesome if you could support that!
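For reference, here is a rough sketch of what calling a self-hosted EasyNMT endpoint could look like from JavaScript. The port, path, and response field are assumptions based on the EasyNMT README, so verify them against your deployed version:

```js
// Minimal sketch of calling a self-hosted EasyNMT REST API.
// Assumes the Docker image is running locally, e.g.:
//   docker run -p 24080:80 easynmt/api:2.0-cpu
// Endpoint and response shape are taken from the EasyNMT README;
// check them against the version you actually deploy.
async function translateViaEasyNMT(text, targetLang) {
  const url = new URL('http://localhost:24080/translate');
  url.searchParams.set('target_lang', targetLang);
  url.searchParams.set('text', text);
  const response = await fetch(url);
  const data = await response.json();
  return data.translated; // array of translated strings, per the README
}

translateViaEasyNMT('Hello world', 'de').then(console.log);
```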
Anyway, take my recommendations with a grain of salt; I'm just an amateur in the world of ML trying to roll my own alternatives to Google :)
Thanks for your suggestion @noaho, I will research it.
I'm not sure about model sizes. For example, I've heard that https://github.com/browsermt/bergamot-translator uses models that are only 20-80 MB each.
At this time I don't have the machine learning knowledge to explore the code, and in the near term I have no time to learn it, but when I do have time, I will work on this issue. I think it is possible to find a translator module that is powerful enough, with resource requirements low enough, to translate pages offline.
To avoid transferring your personal data, in the near future you could run your own translation server and write your own translator module that uses its local API. But an autonomous translator module must also work for an ordinary user who has no internet connection and still needs to translate a page offline (of course, you can do that right now if the page was translated previously and is still in the cache).
Please, if you have contacts who understand this area, reach out to them. Invite them to look at this repository and this issue; maybe they will be interested and can help me find the best candidate, or even want to help maintain this project and build a WASM translator module. I really think this translation plugin is the best one; I built it because all the others weren't good enough. My goal is to make translating pages and any other text in the browser easy, free, and convenient, and maybe in the future bring it to the desktop. Maybe someone is interested in helping maintain this project. Right now I need help with NMT.
Hey, I noticed your tweets to the bergamot project!
If you want to try to implement a bergamot backend for Linguist, I'd suggest taking a look at the node.js test script. Unfortunately we don't have good documentation about the API yet. But this is the most concise example of how to use the wasm binary: https://github.com/browsermt/bergamot-translator/blob/main/wasm/node-test.js
You'll need the compiled WASM binary and the JavaScript helper functions: bergamot-translator-worker.wasm & bergamot-translator-worker.js. I'd suggest just grabbing them from the latest release for now.
If you do want to build it from source, I recommend using Docker, running something like:

```sh
docker run --rm -v $(pwd):/src -w /src emscripten/emsdk:3.1.8 bash build-wasm.sh
```

But that doesn't give you anything that's not already in the release.
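To give a feel for the shape of the API, here is a condensed sketch of how the test script drives the module. The class names below appear in node-test.js, but treat every signature as an assumption and check the script itself before relying on it:

```js
// Condensed sketch of driving bergamot-translator's WASM build from Node.
// Every signature here is an assumption; see wasm/node-test.js for the
// real, current API.

// Hypothetical helper: node-test.js builds the TranslationModel from a
// marian YAML config string plus AlignedMemory buffers holding the model,
// shortlist and vocabulary files downloaded beforehand.
function buildTranslationModel(Module) {
  throw new Error('See wasm/node-test.js for the real model-loading code');
}

global.Module = {
  onRuntimeInitialized() {
    // A blocking (synchronous) translation service.
    const service = new Module.BlockingService({ cacheSize: 0 });
    const model = buildTranslationModel(Module);

    const input = new Module.VectorString();
    input.push_back('Hallo Welt!');
    const options = new Module.VectorResponseOptions();
    options.push_back({ qualityScores: false, alignment: false, html: false });

    const responses = service.translate(model, input, options);
    console.log(responses.get(0).getTranslatedText());
  },
};

// The helper script populates the global Module object and invokes
// onRuntimeInitialized once bergamot-translator-worker.wasm has loaded.
require('./bergamot-translator-worker.js');
```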
Mozilla maintains a list of models in their extension: modelRegistry.js. They also have code for downloading & loading the models in translationWorker.js but it's not as concise.
I'm maintaining an experimental version of the Firefox addon that's based on it. The meaty bits are WASMTranslationWorker.js and WASMTranslationHelper.js, which you can also use for inspiration on how to weave it all together. Do note that it uses a different model registry, namely that of translateLocally.
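For illustration, fetching that registry could look roughly like this. Both the URL and the response field names are assumptions based on my reading of TLTranslationHelper.js, so inspect the real JSON before depending on it:

```js
// Rough sketch of fetching the translateLocally model registry.
// URL and field names are assumptions; verify against TLTranslationHelper.js.
async function listModels() {
  const res = await fetch('https://translatelocally.com/models.json');
  const registry = await res.json();
  for (const model of registry.models ?? []) {
    // Each entry describes one downloadable language-pair model archive.
    console.log(model.src, '->', model.trg, model.url);
  }
}

listModels();
```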
Speaking of translateLocally, that is another way to implement bergamot-translator! If you have translateLocally installed, you can communicate with it through the native messaging API. Generally, it's much (much!) faster that way, and you don't have to worry about caching the models or anything. Downside: you need translateLocally on the machine, and you can't ship it as part of the extension.
You can use TLTranslationHelper.js or native_client.py as examples of how to use it. Note that translateLocally needs to know about your extension; otherwise Chrome/Firefox won't allow you to communicate with it. There is a bit of info about it in the README.md, but once this pull request is merged it should be a lot easier.
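A sketch of what that communication could look like from a WebExtension. The application name and message shape follow what I remember of TLTranslationHelper.js and native_client.py; treat both as assumptions and verify against those files:

```js
// Talking to translateLocally over native messaging.
// browser.runtime.connectNative is the standard WebExtension API;
// the app name and message format below are assumptions.
const port = browser.runtime.connectNative('translatelocally');

port.onMessage.addListener((message) => {
  // Responses carry the request id, so concurrent requests can be matched.
  console.log('translated:', message);
});

port.postMessage({
  id: 1,
  command: 'Translate',
  data: { src: 'de', trg: 'en', text: 'Hallo Welt!', html: false },
});
```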
Edit: One major difference I noticed between how Mozilla's translation addon is implemented and Linguist is that Linguist only translates text nodes, while Mozilla passes in chunks of HTML. This allows the translator to move the HTML around to make sure the markup follows any word reordering.
For websites with little inline HTML, your implementation is probably sufficient. But it's something to keep in mind. Translating and then merging back the translated HTML is a bit of a challenge, and I don't know how many of your translation backends will support it. Google does through their `format` key, but I don't know about the others.
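With the Cloud Translation v2 REST API, for instance, that could look roughly like this (YOUR_API_KEY is a placeholder):

```js
// The Cloud Translation v2 REST API accepts format: 'html' to preserve
// markup across the translation.
async function translateHtml(html, target) {
  const res = await fetch(
    'https://translation.googleapis.com/language/translate/v2?key=YOUR_API_KEY',
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ q: html, target, format: 'html' }),
    }
  );
  const data = await res.json();
  return data.data.translations[0].translatedText;
}
```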
@jelmervdl thank you for your responsiveness. I would be happy to integrate the Bergamot translator into Linguist! I have a few questions to get started.
As developers, we have to rely on docs to interact between code modules. This is especially important for distributed teams. Documentation is our contract: it ensures that we use a module as intended, and when the behavior in the documentation and in real life don't match, we can file a bug report to fix the difference.
What are the Bergamot project's plans for documentation? How soon will developers be ready to maintain docs? Do such plans exist?
These topics interest me:
Where can I find the models and information about them (features, supported languages, performance, notes)?
Do we have examples of how it works in the browser? Which browser features does Bergamot require to work?
I've heard that the Bergamot project has speech processing features such as "text to speech" and "speech to text". Is that true? If yes, what is the status of these features? Where can I find information and documentation about them?
I would like to make Linguist a full-featured autonomous translator (as an option, and if it works well, as the default option), so these features would be useful too.
I want to make Linguist fast, so your opinion is important. You said that the current Linguist approach may be suboptimal for large web pages. Could you explain why, and what are your suggestions to fix it?
Right now Linguist just takes the text of DOM nodes, translates it, and replaces the values. This approach interacts with the page as little as possible and keeps references to DOM nodes, which frameworks may use to memoize rendering (when a DOM node is replaced, the framework re-renders the whole content of that node).
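Roughly, a minimal sketch of that approach (`translate` here is a hypothetical stand-in for whichever backend does the actual translation):

```js
// Walk only the text nodes, translate each, and write the result back in
// place so element references stay stable.
async function translateTextNodes(root, translate) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  const nodes = [];
  for (let node = walker.nextNode(); node; node = walker.nextNode()) {
    if (node.nodeValue.trim()) nodes.push(node);
  }
  // Only nodeValue changes; the surrounding elements are untouched, so
  // framework-rendered DOM keeps its identity.
  await Promise.all(
    nodes.map(async (node) => {
      node.nodeValue = await translate(node.nodeValue);
    })
  );
}
```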
Feel free to share your vision of the pros and cons.
I will explore your links. Sorry if some questions have been answered already; let's just collect and organize all the information we have about this, to simplify navigation for me and other developers in the future.
Re documentation:
For bergamot-translator there is no up-to-date documentation covering the full scope of C++, Python or Node API. And I don't expect that to change in the short term.
bergamot-translator is glue code that allowed us to use the decoding part (translation using a given translation model) in the Firefox extension: it is a wrapper around marian-nmt, exposing a much simpler API. It is still a pretty low level API though. I've tried to document it as practically as possible in the test script.
Re models:
Right now there are two parties I know of that are training efficient models: Mozilla (the models listed in its modelRegistry.js) and the Bergamot project itself (the models in translateLocally's registry). These models are optimised for the quality vs (size + speed) trade-off. In theory any Marian model is compatible; the University of Helsinki has a massive number of them.
Training your own models is an option: https://github.com/mozilla/firefox-translations-training. It is a bit daunting though, and often requires specific changes per language and dataset. And some trial and error.
Re browser:
See my point about documentation, but also my earlier comment. There are two working extensions that use the WASM API, there is the WASM demo code in bergamot-translator, and the annotated test script I just linked. It should be sufficient to piece together how to use it.
Re speech processing:
This is not a feature of this project as far as I'm aware. I think your best bet for that right now is the Web Speech API. That's not guaranteed to be offline and private, unfortunately.
Re performance:
Sorry for the misunderstanding. It was not about performance, but quality. I was trying to explain that bergamot-translator can re-order words. For example, "Jeden Tag esse ich Pizza" translates to "Every day I eat pizza", and in "esse ich" and "I eat" the word order is reversed.
Now imagine that there is HTML wrapping that bit: `<em>Jeden Tag esse</em> ich Pizza`. Linguist translates that in two steps: `Jeden Tag esse` and `ich Pizza`, and you will likely get something like "Every day eats" and "I pizza". Still understandable, but it could be better in terms of quality.
bergamot-translator accepts the whole sentence (or multiple, it doesn't care) with markup and tries to maintain that markup as words are moved around:
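For instance, `<em>Jeden Tag esse</em> ich Pizza` could come back as something like `<em>Every day</em> I <em>eat</em> pizza`; where exactly the `<em>` tags land depends on the word alignments the model produces.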
But this has exactly the difficulty you mention: it now becomes tricky to match the translated HTML/text with the text nodes that are already in the DOM tree. The Firefox extension tries to do this, but that code is complicated and probably a lot slower than just translating text nodes.
Ignore the bit about translateLocally and native messaging for now. If translation speed is really important, that avenue is great. But even with the WASM module you can translate the visible bit of a page in seconds, and it doesn't require your users to download and install extra software.
Facebook/Meta model: https://github.com/facebookresearch/fairseq/tree/nllb
200 languages, ~300 GB, can't be used in a browser; research only.
https://github.com/ggerganov/whisper.cpp
We can achieve this goal if we can port open-source translation models to this library.
@BrightXiaoHan it looks interesting, thanks for the link, but it seems this project is about speech recognition, while in this issue we are trying to find projects for offline text translation. That's fine, your link is useful anyway, because in the future we plan to implement speech-to-text input; but if you have links relevant to this topic, please share them.
I will edit the issue to improve the description of the problem we are trying to solve.
@vitonsky What I mean is that this library implements the Transformer model and its decoding process in a single C file, which can be easily adapted to WASM. It should be easily adaptable to machine translation models. There is also an example implementation of the GPT-2 model in their project.
@BrightXiaoHan oh, that is interesting then. I will research it.
Now Linguist uses third-party services to translate texts. This is not private, because users send their texts to servers like Google, Yandex, Bing, etc.
We need to allow users to use offline translation that will not send their texts to the internet and will process them locally on the user's machine. It is technically possible right now with custom translators (we have a guide about it), but it is not an embedded solution; it is "for geeks only", so most users will not use it.
Thus, we have to implement an embedded module to translate text offline.
Requirements for a candidate