Closed (thomasZen closed this issue 2 years ago)
This command seems to do the trick, at least for the example I cared about:
echo "d'action" | ~/GitRepos/mosesdecoder/scripts/tokenizer/tokenizer.perl -l fr -no-escape
In the comment above I used a different character (’ instead of '), which confused me initially.
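In case someone else runs into the same mix-up, here is a minimal Python sketch for spotting and normalizing the two apostrophe characters before passing text to the tokenizer. The helper names (`normalize_apostrophes`, `describe_punctuation`) are my own and hypothetical; they are not part of this repository or of Moses, and I am only assuming that mapping the typographic quote to the ASCII apostrophe is what you want for your data.

```python
import unicodedata

# Map typographic single quotes to the ASCII apostrophe (U+0027).
# Assumption: the reference data uses the ASCII apostrophe.
APOSTROPHE_MAP = str.maketrans({
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
})


def normalize_apostrophes(text: str) -> str:
    """Return text with typographic single quotes replaced by U+0027."""
    return text.translate(APOSTROPHE_MAP)


def describe_punctuation(text: str) -> list[str]:
    """List code point and Unicode name of every non-alphanumeric, non-space character."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"
        for ch in text
        if not ch.isalnum() and not ch.isspace()
    ]


if __name__ == "__main__":
    for sample in ("d\u2019action", "d'action"):
        print(sample, describe_punctuation(sample), normalize_apostrophes(sample))
```

Running the normalized text through the tokenizer command above should then give the same result for both spellings, though I have only checked that on the example I cared about.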
Hi,
I tried to use this repository to score my own hypothesis, but couldn't reproduce the tokenization you used. Specifically, the util_preprocessor.py called here is not present in this repository (I think). When I use Moses for tokenization, I get a different tokenization from the reference in the test data. Here's an example:
Can you share the code you used for tokenization or which toolkit you used?
Thanks for this repository, Thomas