Is it possible to output top-k predictions?

JasonYCHuang commented 6 years ago

If I provide one sentence, is it possible to infer the top-k predictions, not only the best prediction?

It looks like this project provides that feature, but I couldn't get it to work and always get only the best prediction.

I put a single sentence in /tmp/inference/one.rct and set --num_translations_per_input=5 in inference mode, but I get only one prediction instead of 5 candidates.

I use tensorflow-nightly on the master branch, commit 365e73.

Could you give me some directions?

training script

python -m nmt.nmt \
    --attention=scaled_luong \
    --src=vi --tgt=en \
    --vocab_prefix=/tmp/data/token  \
    --train_prefix=/tmp/data/train \
    --dev_prefix=/tmp/data/valid  \
    --test_prefix=/tmp/data/test \
    --out_dir=/tmp/model-top-k \
    --num_train_steps=10000 \
    --steps_per_stats=100 \
    --num_layers=2 \
    --num_units=1024 \
    --dropout=0.2 \
    --metrics=bleu \
    --optimizer=sgd \
    --learning_rate=1.0 \
    --start_decay_step=5000 \
    --decay_steps=10 \
    --encoder_type=bi \
    --beam_width=10

inference script

python -m nmt.nmt \
    --out_dir=/tmp/model-top-k \
    --inference_input_file=/tmp/inference/one.rct \
    --inference_output_file=/tmp/inference/one_pred.prd \
    --num_translations_per_input=5 \
    --beam_width=10

ttrouill commented 6 years ago

Same issue here.

ttrouill commented 6 years ago

I did a dirty fix by adding:

hparams.num_translations_per_input = flags.num_translations_per_input
hparams.beam_width = flags.beam_width

in nmt.py, in run_main(.), just after hparams is loaded. With that change it works just fine.

JasonYCHuang commented 6 years ago

Thanks @ttrouill. It also works on my side.

It looks like inference reuses the hparams saved during training, and they are not updated with the parameters passed at inference time. https://github.com/tensorflow/nmt/blob/master/nmt/nmt.py#L577

Are you going to create a PR? I can create one if you are not available. Let me know your preference.

ttrouill commented 6 years ago

Go ahead :) Though it should probably be something cleaner and more general (to pass other unpassed parameters too) than what I proposed.
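
A more general version could look roughly like the sketch below. This is not code from the nmt repository: the helper name `override_inference_hparams`, the `INFERENCE_KEYS` tuple, and the use of `argparse.Namespace` as a stand-in for both the loaded hparams and the parsed flags are all assumptions for illustration. It also assumes a flag left unset is `None`, which may not match how the real flags are defaulted.

```python
import argparse

# Hypothetical list of flags that should be taken from the inference
# command line rather than from the hparams saved during training.
INFERENCE_KEYS = ("num_translations_per_input", "beam_width")


def override_inference_hparams(hparams, flags, keys=INFERENCE_KEYS):
    """Copy inference-time flag values onto hparams loaded from training.

    Only keys that are present on `flags` and not None are copied, so
    values saved at training time are kept when a flag is not given.
    """
    for key in keys:
        value = getattr(flags, key, None)
        if value is not None:
            setattr(hparams, key, value)
    return hparams


# Stand-ins for the real objects (assumption: plain attribute access works).
loaded_hparams = argparse.Namespace(
    beam_width=10, num_translations_per_input=1, num_units=1024)
inference_flags = argparse.Namespace(
    beam_width=None, num_translations_per_input=5)

updated = override_inference_hparams(loaded_hparams, inference_flags)
print(updated.num_translations_per_input, updated.beam_width)
```

Adding another inference-only parameter would then only mean extending the key list, instead of writing one assignment per flag.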

nashid commented 4 years ago

@ttrouill @JasonYCHuang are you still planning to make the PR?

tensorflow / nmt

Is it possible to output top-k predictions? #237