kevinduh / sockeye-recipes

Training scripts and recipes for Sockeye Neural Machine Translation toolkit
What is the right way to use translate.sh? #29

Open Crista23 opened 4 years ago

Crista23 commented 4 years ago

I am currently calling translate.sh with the following arguments:

bash ~/sockeye-recipes/scripts/translate.sh -p anon_glove_zhanglapata.hpm -i data/newsela_Zhang_Lapata_splits/V0V4_V1V4_V2V4_V3V4_V0V3_V0V2_V1V3.aner.ori.test.src -o output/bpe.1best_greedy -e sockeye_gpu -b 1

and I get the message below:

translate.py: error: argument --output-type: expected one argument
Traceback (most recent call last):
  File "/srv/disk01/ggarbace/TSGen/sockeye-recipes/tools/subword-nmt//apply_bpe.py", line 308, in
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <_io.TextIOWrapper name='' encoding='utf-8'>
BrokenPipeError: [Errno 32] Broken pipe
End translating: 2020-03-06 15:10:32 on zen

After adding --output-type as an argument, I get:

Usage: translate.sh -p hyperparams.txt -i input -o output -e ENV_NAME [-d DEVICE] [-s]
Input is a source text file to be translated
Output is filename for target translations
ENV_NAME is the sockeye conda environment name
Device is optional and inferred from ENV
-s is optional and skips BPE processing on input source

I don't think I am missing any of the mandatory parameters. What is going wrong here? Thanks!

kevinduh commented 4 years ago

There is no --output-type flag in ~/sockeye-recipes/scripts/translate.sh. Your original call to the script is correct.

From the log, it looks like subword-nmt//apply_bpe.py failing with a broken pipe is the culprit. As a quick sanity check, see if you can translate one sentence from the training set. My guess is that some line (or character) in your input file V0V4_V1V4_V2V4_V3V4_V0V3_V0V2_V1V3.aner.ori.test.src is causing subword-nmt//apply_bpe.py to fail. To confirm that, you can run subword-nmt//apply_bpe.py on your file separately and check whether the number of output lines matches the number of input lines.
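
The line-count check could be sketched roughly as below. This is only an illustration, not part of sockeye-recipes: it assumes the BPE output has already been saved to a separate file, and the function names and file names are hypothetical. Besides comparing line counts, it flags lines that are empty, contain control characters, or contain bytes that did not decode as UTF-8, since these are common candidates for the kind of problematic line mentioned above.

```python
import unicodedata


def count_lines(path):
    """Count newline-terminated lines, splitting on '\n' only."""
    with open(path, encoding="utf-8", errors="replace", newline="\n") as f:
        return sum(1 for _ in f)


def suspicious_lines(path):
    """Yield (line_number, reason) for lines that might upset a text pipeline."""
    with open(path, encoding="utf-8", errors="replace", newline="\n") as f:
        for number, line in enumerate(f, start=1):
            text = line.rstrip("\n")
            if not text.strip():
                yield number, "empty or whitespace-only"
            elif "\ufffd" in text:
                # errors="replace" turns undecodable bytes into U+FFFD
                yield number, "invalid UTF-8 or replacement character"
            elif any(unicodedata.category(c) == "Cc" and c != "\t" for c in text):
                # includes a stray '\r' left over from Windows line endings
                yield number, "control character"


# Hypothetical usage, with placeholder file names:
#   n_in = count_lines("test.src")
#   n_out = count_lines("test.src.bpe")
#   print(n_in, n_out, "match" if n_in == n_out else "MISMATCH")
#   for number, reason in suspicious_lines("test.src"):
#       print(number, reason)
```

If the counts differ, or if any lines are flagged, translating just those lines on their own may help narrow down the failure.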