XiangLi1999 / Diffusion-LM

top_p parameter and scaling of timesteps #38

Open rabeeh-karimi opened 2 years ago

rabeeh-karimi commented 2 years ago

Hi, thanks for sharing the code. I wonder why there is a top_p parameter in the code, in the part where you adjust the noise, and also why the timesteps are scaled here:

    def _scale_timesteps(self, t):
        # Map t into [0, 1000) so the model's timestep range stays
        # fixed regardless of num_timesteps.
        if self.rescale_timesteps:
            return t.float() * (1000.0 / self.num_timesteps)
        return t

Are these necessary? Thanks!

XiangLi1999 commented 2 years ago

Hi,

Thanks for the questions. re 1: there is top_p in the code because I was curious about truncated sampling, similar to what people do in the BigGAN paper. It helps sample quality a bit and could be an interesting sampling approach, but this experiment is not included in any results in the paper.
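
(For context, a minimal sketch of this kind of truncated noise sampling, assuming top_p acts as a magnitude cutoff on the Gaussian noise; the helper name truncated_noise_like is illustrative, not the repo's exact API:)

    import torch

    def truncated_noise_like(x, top_p):
        # Draw Gaussian noise, then redraw any entries whose magnitude
        # exceeds top_p, in the spirit of the BigGAN truncation trick.
        # Assumes top_p > 0; otherwise the loop would never terminate.
        noise = torch.randn_like(x)
        mask = noise.abs() > top_p
        while mask.any():
            noise[mask] = torch.randn_like(noise[mask])
            mask = noise.abs() > top_p
        return noise

    noise = truncated_noise_like(torch.zeros(4, 16), top_p=1.5)
    assert noise.abs().max() <= 1.5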

re 2: technically this is not necessary, but it makes the implementation easier; it was adopted from the OpenAI codebase. You can turn it on or off, as long as you are consistent between training and inference.
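
(As a quick illustration of that consistency point, using a hypothetical model trained with 4000 diffusion steps:)

    import torch

    num_timesteps = 4000  # hypothetical training configuration
    t = torch.tensor([0, 1000, 3999])

    # With rescale_timesteps=True the model always sees values in [0, 1000),
    # regardless of how many steps the diffusion process actually uses.
    print(t.float() * (1000.0 / num_timesteps))
    # tensor([  0.0000, 250.0000, 999.7500])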

Hope this helps!