-
This issue raises a problem with using the so-called "bootstrap resampling test" to evaluate the "statistical significance" of machine translation (especially neural MT) methods, and similar generation…
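For readers unfamiliar with the test named here, the following is a minimal sketch of paired bootstrap resampling. It assumes per-sentence scores for two systems on the same test set and compares their means. This is a simplification: MT evaluations usually recompute a corpus-level metric such as BLEU on each resample. The function name `paired_bootstrap` and its parameters are hypothetical and do not come from the issue.

```python
import random


def paired_bootstrap(scores_a, scores_b, n_resamples=1000, seed=0):
    """Return the fraction of resamples in which system A's mean score
    is strictly higher than system B's.

    Assumption: scores_a[i] and scores_b[i] are per-sentence scores for
    the same test sentence i (a simplification of corpus-level metrics).
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(scores_a)
    wins = 0
    for _ in range(n_resamples):
        # Draw n sentence indices with replacement; the same indices are
        # used for both systems, which is what makes the test "paired".
        idx = [rng.randrange(n) for _ in range(n)]
        mean_a = sum(scores_a[i] for i in idx) / n
        mean_b = sum(scores_b[i] for i in idx) / n
        if mean_a > mean_b:
            wins += 1
    return wins / n_resamples


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    system_a = [0.31, 0.42, 0.28, 0.55, 0.47]
    system_b = [0.30, 0.40, 0.33, 0.50, 0.45]
    print(paired_bootstrap(system_a, system_b))
```

A win rate close to 1.0 is commonly reported as A being "significantly better" than B. Whether that reading is justified is the kind of question the issue is concerned with.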
-
The problem with `ListLink` (as well as `SetLink`) is that it allows constructing atoms with an arbitrarily large outgoing set. There are 2 problems with that:
1. Dealing with arbitrarily large outgoing set…
-
I want to train the NLLB model. As instructed by the data [ReadMe](https://github.com/facebookresearch/fairseq/tree/nllb/examples/nllb/data) documentation, I have tried the filtering pipeline and got …
-
Thanks for all of these models! Sometimes they work comparably to Google Translate!
I noticed that you improved a model for French and several other languages. Do you have plans to do the same for …
-
There are a few improvements that could be made to the transformer implementation:
1. We have not confirmed that it gets competitive performance with the original implementation, nor created a reci…
-
There are several fringe cases that still bug the MT evaluation metrics in `nltk.translate`.
The BLEU-related issues are mostly resolved in #1330, but similar issues happen in RIBES and CHRF too:
…
-
Hello.
So, I want to run the NLLB-200 (3.3B) model on a server with 4x 3090 GPUs and, say, a 16-core AMD Epyc CPU.
I wrapped CTranslate2 in FastAPI, running with uvicorn, inside a docker container with GPU …
-
In [this branch](https://github.com/rsennrich/nematus/tree/floatX), I removed all hardcoded references to float32 and tried to train with float16, but it does not work:
Using cuDNN version 5105 o…
-
Among open issues, we have (not an exhaustive list):
- #135 complains about the sentence tokenizer
- #1210, #948 complain about word tokenizer behavior
- #78 asks for the tokenizer to provide offsets …
-
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing
Taku Kudo, John Richardson
Accepted as a demo paper at EMNLP2018
https://arxiv.org/…