HKUNLP / reparam-discrete-diffusion

Reparameterized Discrete Diffusion Models for Text Generation
Apache License 2.0
90 stars · 3 forks

about decoding topk_masking #3

Open violet-sto opened 4 months ago

violet-sto commented 4 months ago

Hi

Thanks for your excellent work. I have a question about the rate schedule for topk_masking.

As described in the appendix, "To ensure that the degree of noise decreases as the generation process proceeds, we schedule k to increase from 1 to N monotonically as the diffusion step t goes from T to 1." However, in the code (https://github.com/HKUNLP/reparam-discrete-diffusion/blob/26ee286b281edc6284d74f809465b3e6d42507a6/discrete_diffusion/discrete_diffusions/discrete_diffusion_base.py#L177), the k tokens with the lowest confidence are masked instead of the highest. Is there an inconsistency here?

Best regards

LZhengisme commented 4 months ago

Hi there,

Thanks for reaching out! In https://github.com/HKUNLP/reparam-discrete-diffusion/blob/26ee286b281edc6284d74f809465b3e6d42507a6/discrete_diffusion/discrete_diffusions/utils.py#L20-L24, the topk_masking function actually returns a mask indicating the *unselected* elements. This is effectively the inverse of selecting the highest-scoring elements; we implement it this way to simplify the subsequent calculations for the denoising tokens.
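To illustrate the point, here is a minimal NumPy sketch of that logic (the repository implements it in PyTorch; function and variable names here are illustrative, not the exact code): selecting the top-(N − k) highest-confidence positions and marking the remaining k lowest-confidence positions with `True` are the same operation, viewed from opposite sides.

```python
import numpy as np

def topk_masking(scores, cutoff_len):
    """Return a boolean mask that is True at the `cutoff_len` LOWEST-scoring
    positions in each row, i.e., the complement of keeping the highest scores.

    scores:     (batch, seq_len) confidence scores
    cutoff_len: (batch, 1) number of positions to mask per row
    """
    # Sort each row ascending so the cutoff_len-th entry is the threshold
    sorted_scores = np.sort(scores, axis=-1)
    cutoff = np.take_along_axis(sorted_scores, cutoff_len, axis=-1)
    # True = low-confidence position to be (re-)masked; False = kept token
    return scores < cutoff

scores = np.array([[0.9, 0.1, 0.5, 0.3]])
mask = topk_masking(scores, np.array([[2]]))
# The two lowest-confidence positions (0.1 and 0.3) are marked True
print(mask)  # [[False  True False  True]]
```

So although the mask picks out the lowest-confidence tokens, it encodes exactly the "keep the top-k most confident predictions" schedule described in the appendix.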

Hope this clears things up!