About argmax decoding

HKUNLP / reparam-discrete-diffusion

Reparameterized Discrete Diffusion Models for Text Generation

Apache License 2.0


Hi, thanks for being interested in the work!

We provide scripts to reproduce the RDM experiment results in fairseq/experiments: machine translation uses argmax-decoding = True (here), while question generation and paraphrasing use temperature = 0.3 (here). We also found that a low temperature such as 0.1 or 0.2 gives results similar to argmax decoding on translation tasks, although with some fluctuation.

We adopt the sampling formulation in the pseudo-code because it covers the argmax case as a limit: as the temperature approaches 0, the distribution becomes a point mass on the token with the highest probability, so sampling is equivalent to taking the argmax.
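To illustrate the limit described above, here is a minimal standalone sketch in plain Python. It is not the code used in this repository: the function names `temperature_softmax` and `decode_token` and the example logits are hypothetical, and it assumes the temperature is applied by dividing the logits before the softmax.

```python
import math
import random

def temperature_softmax(logits, temperature):
    """Softmax of logits / temperature (assumed form of temperature scaling)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def decode_token(logits, temperature=0.0, rng=None):
    """Return a token index: argmax if temperature == 0, otherwise a sample."""
    if temperature == 0.0:
        # Limiting case: all probability mass sits on the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    if rng is None:
        rng = random.Random()
    probs = temperature_softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical scores for a 3-token vocabulary.
logits = [2.0, 1.0, 0.5]
for t in (1.0, 0.3, 0.1, 0.01):
    probs = temperature_softmax(logits, t)
    print(t, [round(p, 4) for p in probs])
```

As the temperature shrinks, the printed distribution concentrates on index 0, the highest-scoring token. At a low but non-zero temperature the sampler can still occasionally pick another token, which is one plausible source of the small fluctuations mentioned above.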


About argmax decoding #2