-
Could you please tell me which features were used to train the averaged perceptron model that parses OSM addresses?
-
Hi, I have been using `apply_bpe` since October 2016. I tested a recent copy of `apply_bpe` and the number of segments is significantly lower than before. I am using exactly the same settings and code…
-
As of now, it is possible to specify user-defined symbols to bypass some sequences.
It would be great if we could pass a "pattern" for these symbols, especially when we have plenty of placeholders.
…
-
Hi, I want to know why you use byte pair encoding to further sub-tokenise the input for the code generation task. Can't we proceed directly to Nematus without byte pair encoding?
-
First of all, thank you very much for releasing such good tools for Seq2Seq problems.
Recently, I used tensor2tensor to translate Chinese to English, and I ran into two problems. I hope you can give some su…
-
Hi, I'm using t2t to build my own Chinese-to-English translation system with my own corpus.
Are there any suggestions for this work?
For example, what corpus size is needed to train a practical system?
siz…
yuimo updated
6 years ago
-
I recently found [this](https://research.googleblog.com/2017/05/the-machine-intelligence-behind-gboard.html) post by Google about the transliteration models implemented with `FST` encoder/dec…
-
Would it be possible to perform R2L rescoring with amunmt? How could I integrate amunmt into Rico's r2l translation script?
```
#!/bin/bash
# this sample script translates a test set, including…
-
Is it possible to translate a file consisting of one sentence per line?
I tried
```
./amun -c config.ens.yml -i source.txt > target.txt
```
but ended up with no translation. target.txt was…
-
./learn_bpe.py -s {num_operations} < {train_file} > {codes_file}
What is num_operations and how should I set it?
I think it may be the number of tokens in the final vocabulary, so I executed the above comm…
amirj updated
8 years ago
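
On the `num_operations` question above: as far as the `learn_bpe.py` help text goes, `-s` sets how many merge operations (new symbols) to learn, not the final vocabulary size directly. The toy learner below is only a sketch of that idea, not the script's actual implementation; the function name `learn_bpe`, the `</w>` end-of-word marker, and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter


def learn_bpe(lines, num_operations):
    """Toy BPE learner (illustrative only): returns up to num_operations merges."""
    # Represent each word as a tuple of symbols plus an end-of-word marker.
    vocab = Counter()
    for line in lines:
        for word in line.split():
            vocab[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_operations):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties go to the pair seen first (an assumption).
        best = max(pairs, key=pairs.get)
        merges.append(best)

        # Rewrite every word with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


if __name__ == "__main__":
    for left, right in learn_bpe(["low lower lowest low"], 3):
        print(left, right)
```

Each iteration adds exactly one new symbol, so under this sketch the resulting subword vocabulary is roughly the set of initial characters plus at most `num_operations` merged symbols. That is why the value is usually chosen with a target vocabulary size in mind rather than being equal to it.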