XiangLi1999 / Diffusion-LM


How to deal with sequences with different lengths? #41

sunhaozhepy opened this issue 1 year ago (status: Open)

sunhaozhepy commented 1 year ago

Thank you for your great work! I've read your paper, but I'm having trouble understanding how sequences of different lengths are generated. It seems to me that since you fix n=64 in the experiments, it can't be changed afterwards, because the hidden size d' = n*d in the Transformer is fixed. As a result, shouldn't it be impossible to generate sequences of any length other than 64 at inference time?

XiangLi1999 commented 1 year ago

Hi,

Thanks for the question!

We can generate sentences shorter than length 64 via padding. If the training script sets --padding_mode pad, then the format will be [BOS][SENTENCE][EOS][PAD][PAD][PAD]...
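For concreteness, here is a minimal Python sketch of that padding layout (the helper and token strings below are illustrative, not the repo's actual preprocessing code):

```python
# Minimal sketch (not the repo's actual code): pad a tokenized sentence
# to a fixed length of 64, mirroring the [BOS][SENTENCE][EOS][PAD]... layout.

BOS, EOS, PAD = "[BOS]", "[EOS]", "[PAD]"
SEQ_LEN = 64  # fixed n used at training time

def pad_to_fixed_length(tokens, seq_len=SEQ_LEN):
    """Wrap a sentence with BOS/EOS and pad it out to seq_len tokens."""
    seq = [BOS] + list(tokens) + [EOS]
    if len(seq) > seq_len:
        # Truncate long sentences so the wrapped sequence still fits.
        seq = seq[: seq_len - 1] + [EOS]
    return seq + [PAD] * (seq_len - len(seq))

print(pad_to_fixed_length(["the", "cat", "sat"])[:8])
# ['[BOS]', 'the', 'cat', 'sat', '[EOS]', '[PAD]', '[PAD]', '[PAD]']
```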

You could try decoding from this model by running batch_decode.py on https://drive.google.com/drive/folders/110CA22rwu_3EcllPYGhql0TnYeOBY77d, and you will observe the padding pattern.
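If you want the variable-length sentence back from the decoded output, a simple post-processing sketch would look like this (the helper below is hypothetical and not part of batch_decode.py):

```python
# Hypothetical post-processing: recover the variable-length sentence from a
# fixed-length decoded sequence by cutting at the first [EOS] and dropping
# BOS/PAD tokens.

def strip_special_tokens(decoded_tokens, bos="[BOS]", eos="[EOS]", pad="[PAD]"):
    """Return only the sentence tokens between BOS and the first EOS."""
    if eos in decoded_tokens:
        decoded_tokens = decoded_tokens[: decoded_tokens.index(eos)]
    return [t for t in decoded_tokens if t not in (bos, pad)]

decoded = ["[BOS]", "the", "cat", "sat", "[EOS]", "[PAD]", "[PAD]"]
print(strip_special_tokens(decoded))  # ['the', 'cat', 'sat']
```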