XiangLi1999 / Diffusion-LM

Diffusion-LM
Apache License 2.0

Where is the mbr.py file? #43

Closed by smiles724 1 year ago

smiles724 commented 1 year ago

Hi, I noticed that when modality == 'e2e', batch_decode.py calls the diffusion_lm/e2e_data/mbr.py file. However, I could not find this script. Is this an error or a typo?

    elif modality == 'e2e':
        COMMAND1 = f"python diffusion_lm/e2e_data/mbr.py {out_path2}"

        os.system(COMMAND1)

XiangLi1999 commented 1 year ago

Hi,

Thanks for the question!

I think modality='e2e' is used for the conditional table-to-text generation task. These are additional experiments that we included in the repo but not in the Diffusion-LM paper. So if you are trying to replicate the experiments in the paper, you probably don't need to run this line. Try modality='e2e-tgt' to replicate the language-model training of Diffusion-LM on the E2E dataset.

To answer your question: this script performs MBR decoding for conditional table-to-text generation. For code with a very similar purpose, please refer to https://github.com/XiangLi1999/Diffusion-LM/blob/main/improved-diffusion/anlg_infill/mbr_eval.py and https://github.com/XiangLi1999/Diffusion-LM/blob/759889d58ef38e2eed41a8c34db8032e072826f4/improved-diffusion/anlg_infill/mbr_eval.py#L189.
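
For readers unfamiliar with MBR (minimum Bayes risk) decoding, here is a minimal sketch of the general idea: among several sampled outputs, keep the one that agrees most with the others on average. This is not the missing mbr.py and is not taken from mbr_eval.py. The names `token_f1` and `mbr_select` are hypothetical, and unigram F1 is only a stand-in for whatever similarity metric the repo actually uses.

```python
from collections import Counter
from typing import Callable, List


def token_f1(hyp: str, ref: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized strings.

    Assumption: a simple stand-in utility, not the metric used in the repo.
    """
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: List[str],
    utility: Callable[[str, str], float] = token_f1,
) -> str:
    """Return the candidate with the highest average utility against the others."""
    if not candidates:
        raise ValueError("candidates must be non-empty")
    if len(candidates) == 1:
        return candidates[0]
    best, best_score = candidates[0], float("-inf")
    for i, hyp in enumerate(candidates):
        # Treat every other sample as a pseudo-reference for this hypothesis.
        others = [c for j, c in enumerate(candidates) if j != i]
        score = sum(utility(hyp, other) for other in others) / len(others)
        if score > best_score:  # ties keep the earliest candidate
            best, best_score = hyp, score
    return best


if __name__ == "__main__":
    # Hypothetical samples for one table-to-text input.
    samples = [
        "The Mill is a cheap coffee shop near the river .",
        "The Mill is a cheap coffee shop by the river .",
        "Blue Spice serves Italian food .",
    ]
    print(mbr_select(samples))
```

With these samples, the third sentence shares almost no tokens with the other two, so it scores lowest and is not selected. Swapping in a different metric only requires passing another `utility` function.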