Question about how to use top-p sampling

When running the gigaword task, I get the error below:

UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  unfin_idx = bbsz_idx // beam_size
../aten/src/ATen/native/cuda/MultinomialKernel.cu:214: sampleMultinomialOnce: block: [4,0,0], thread: [0,0,0] Assertion sum > accZero failed.
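For reference, the two rounding modes named in the warning differ only for negative operands. The sketch below uses plain Python arithmetic rather than torch, and the helper names `floor_div` and `trunc_div` are made up for illustration. Assuming `bbsz_idx` and `beam_size` are never negative, the warning is probably harmless here and the failed assertion is the real problem.

```python
import math

def floor_div(a, b):
    # Python's // rounds toward negative infinity ("floor").
    return a // b

def trunc_div(a, b):
    # Rounds toward zero ("trunc"), the behavior the warning describes.
    return math.trunc(a / b)

beam_size = 6  # same value as --beam in the command; indices are illustrative
for bbsz_idx in (0, 5, 6, 13):
    # For non-negative indices both modes give the same result.
    assert floor_div(bbsz_idx, beam_size) == trunc_div(bbsz_idx, beam_size)

# The two modes disagree only when the quotient is negative.
print(floor_div(-7, 2), trunc_div(-7, 2))
```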

My command:

python3 -m torch.distributed.launch --nproc_per_node=${GPUS_PER_NODE} --master_port=${MASTER_PORT} ../../evaluate.py \
    ${data} \
    --path=${path} \
    --user-dir=${user_dir} \
    --bpe=bert \
    --task=gigaword \
    --batch-size=16 \
    --log-format=simple --log-interval=10 \
    --seed=7 \
    --gen-subset=${split} \
    --results-path=${result_path} \
    --sampling \
    --sampling-topk 10 \
    --sampling-topp 0.7 \
    --beam=6 \
    --lenpen=0.7 \
    --max-len-b=32 \
    --no-repeat-ngram-size=3 \
    --fp16 \
    --num-workers=0 \
    --model-overrides="{\"data\":\"${data}\",\"bpe_dir\":\"${bpe_dir}\",\"selected_cols\":\"${selected_cols}\"}"
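To make clear what I expect the two sampling flags to do, here is a minimal pure-Python sketch of top-k followed by top-p (nucleus) filtering. It is not the generator's actual implementation: the function name `top_k_top_p_filter` and the order in which the two filters are applied are assumptions for illustration, and it does not establish how `--sampling-topk` and `--sampling-topp` interact in the real code. It also shows the kind of input that presumably trips the `sum > accZero` assertion: a distribution whose remaining mass is zero or NaN (which may be more likely under `--fp16`).

```python
def top_k_top_p_filter(probs, top_k, top_p):
    """Keep the top_k most likely tokens, then the smallest prefix of them
    whose cumulative probability reaches top_p, and renormalize.

    Hypothetical helper for illustration only.
    """
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, mass = [], 0.0
    for idx, p in ranked:
        kept.append((idx, p))
        mass += p
        if mass >= top_p:
            break
    # A multinomial draw needs strictly positive total mass.
    # `not mass > 0.0` is also true when mass is NaN.
    if not mass > 0.0:
        raise ValueError("no probability mass left to sample from")
    return {idx: p / mass for idx, p in kept}

# Values matching the command: top-k 10, top-p 0.7.
probs = [0.5, 0.3, 0.1, 0.05, 0.05]
filtered = top_k_top_p_filter(probs, top_k=10, top_p=0.7)
print(filtered)  # only tokens 0 and 1 survive, renormalized to sum to 1

# A degenerate distribution is rejected instead of being sampled from.
try:
    top_k_top_p_filter([0.0, 0.0, 0.0], top_k=10, top_p=0.7)
except ValueError as err:
    print("rejected:", err)
```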

OFA-Sys / OFA, issue #358