XiangLi1999 / Diffusion-LM

Diffusion-LM
Apache License 2.0

train the diffusion model for sentence infilling #26

Closed lwmlyy closed 1 year ago

lwmlyy commented 1 year ago

Hi Lisa,

Thanks for releasing the code for your interesting research. I have tried to train a diffusion model with the following command:

python scripts/run_train.py --diff_steps 2000 --model_arch transformer --lr 0.0001 --lr_anneal_steps 400000 --seed 101 --noise_schedule sqrt --in_channel 128 --modality roc --submit no --padding_mode pad --app "--predict_xstart True --training_mode e2e --vocab_size 11043 --roc_train ../datasets/ROCstory " --notes xstart_e2e --bsz 64

I then used the following command to generate text, but the model seems to output whatever I give it as input. Is there something wrong with my procedure?

python scripts/infill.py --model_path diffusion_models/diff_roc_pad_rand128_transformer_lr0.0001_0.0_2000_sqrt_Lsimple_h128_s2_d0.1_sd101_xstart_e2e/  --eval_task_ 'infill' --use_ddim True --notes "tree_adagrad" --eta 1. --verbose yes --partial_seq "My dog loved tennis balls."

Also, I am curious about what each of the parameters listed above does. Could you please add some explanation?

XiangLi1999 commented 1 year ago

I think you need to insert PAD tokens in your partial_seq. The code looks for these PAD tokens and runs infilling only at those positions.

For example: --partial_seq "My PAD loved tennis balls."
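
To illustrate why a sentence without PAD comes back unchanged, here is a minimal sketch of the idea. It is not the code in scripts/infill.py: the helper name `infill_mask` and the whitespace tokenization are assumptions made for illustration only. It marks which positions would be regenerated, on the assumption that only tokens equal to the literal string PAD are treated as blanks.

```python
from typing import List, Tuple

# Placeholder string taken from the example above; how the repository
# maps it to a token id is not shown here.
PAD = "PAD"


def infill_mask(partial_seq: str, pad_token: str = PAD) -> Tuple[List[str], List[bool]]:
    """Hypothetical helper: split on whitespace and flag PAD positions.

    True means "this position would be infilled"; False means
    "this token is kept as given".
    """
    tokens = partial_seq.split()
    mask = [tok == pad_token for tok in tokens]
    return tokens, mask


# With a PAD placeholder, exactly one position is open for infilling.
tokens, mask = infill_mask("My PAD loved tennis balls.")
print(list(zip(tokens, mask)))

# Without any PAD, no position is open, so the input would be returned as-is.
_, no_blank_mask = infill_mask("My dog loved tennis balls.")
print(any(no_blank_mask))  # False
```

Under these assumptions, the original input "My dog loved tennis balls." has nothing marked for infilling, which matches the behavior reported above.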