mozilla / translations

The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0
154 stars 33 forks source link

Fork bergamot-translator into this repostiory as inference #867

Closed nordzilla closed 1 month ago

nordzilla commented 1 month ago

This PR forks the https://github.com/browsermt/bergamot-translator repository into this repository, self-contained within the inference-engine directory.

It also modifies it in the following ways:

gregtatum commented 1 month ago

Locally the commits look like this:

c62bea0 2024-09-25 Erik Nordin (HEAD -> fork-bergamot, origin/fork-bergamot) Add review groups to CODEOWNERS
07e3216 2024-09-26 Erik Nordin Move build-wasm script to inference-engine/scripts directory
cf23bf7 2024-09-26 Erik Nordin Add clean script to inference-engine
bb47eed 2024-09-26 Erik Nordin Add unit-tests script to inference-engine
00f4a30 2024-09-26 Erik Nordin Add build-local script to inference-engine
b6906d9 2024-09-26 Erik Nordin Remove unneded doc code
1019fdd 2024-09-26 Erik Nordin Remove unneeded CLI code
27e85d2 2024-09-20 Erik Nordin Remove unneeded Python code
37d0113 2024-09-20 Erik Nordin Remove .circleci and .github files
3da08c9 2024-09-26 Erik Nordin Remove bergamot-translator-tests dependency
cad3963 2024-09-19 Erik Nordin Rename inference-engine/3rd_party/marian-nmt
bbb8442 2024-09-19 Erik Nordin Move inference-engine git submodules to the repository root
285cf6b 2024-09-26 Erik Nordin Add 'inference-engine/' from commit '9271618ebbdc5d21ac4dc4df9e72beb7ce644774'
77479b3 2024-09-24 Greg Tatum (origin/main, origin/HEAD, main) Update the statistics class to use data attributes and use Statistics in merge-mono.py (#853)

Then the merge commit looks like:

➤ g show 285cf6b
commit 285cf6ba4d1ff96e018944bf5abf348a3945a873
Merge: 77479b3 9271618
Author: Erik Nordin <enordin@mozilla.com>
Date:   Thu Sep 26 12:54:32 2024 -0500

    Add 'inference-engine/' from commit '9271618ebbdc5d21ac4dc4df9e72beb7ce644774'

    git-subtree-dir: inference-engine
    git-subtree-mainline: 77479b3a77703a281bdf8313e168ecf697a376e5
    git-subtree-split: 9271618ebbdc5d21ac4dc4df9e72beb7ce644774
gregtatum commented 1 month ago

Let's make sure to get @eu9ene's approval on this before landing.

nordzilla commented 1 month ago

Thanks for all the feedback.

I'll get this PR cleaned up further today, so that we can get this merged.

There's quite a lot of code. I didn't look at it much but I hope we'll be able to delete everything except the wasm bindings and related classes.

This is the ultimate goal.

But I need it all there to start, so that I can build up a good testing strategy, and ensure that regressions do not occur along the way.

There are quite a lot of small structural suggestions that should be addressed before we can merge to main. Since we decided to share documentation and lining infrastructure we should unify all this right away as it was one of the reasons why we're moving it to this repo.

I appreciate the feedback!

This is precisely why I would like to merge this into main sooner, rather than later, to ensure that we are all aligned on this kind of thing before continuing to make divergent changes on an independent branch.

nordzilla commented 1 month ago

I made a few of the changes, resolving the relevant comments for them.

I would prefer to follow up with the rest after this content is merged.

nordzilla commented 1 month ago

Made a few more of the requested changes.

I'd like to get this merged today and continue with more follow-ups/other work as PRs to the main branch.