XiangLi1999 / Diffusion-LM

Apache License 2.0

Is it possible to pre-train a diffusion language model on Wikipedia texts? #30

Open RefluxNing opened 1 year ago

RefluxNing commented 1 year ago

I see some "simple_wiki" references in the code. Did you intend to train a diffusion LM on wiki? Is it too difficult (requiring more computing resources, for example) to pre-train such a model?

XiangLi1999 commented 1 year ago

I think you can pre-train such a model. I tried this in my preliminary experiments. You will get reasonable outputs, but it does take quite a long time to train until convergence.

RefluxNing commented 1 year ago

Actually, I tried, but I got a "Cannot allocate memory" error during preprocessing when using your code. Could you please share your model checkpoints pre-trained on wiki?

XiangLi1999 commented 1 year ago

Sorry, I don't have a good checkpoint. This was a preliminary experiment that I did in the early stage of the project, and the model architecture was also a bit different. But I will look into your problem when I get a chance. Could you provide the detailed error message?

RefluxNing commented 1 year ago

[image]

I saw this error when I tried to train a model on my own text file (100 MB).