louaaron / Score-Entropy-Discrete-Diffusion

[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
https://aaronlou.com/blog/2024/discrete-diffusion/
MIT License
352 stars, 33 forks

Unstable training #8

Open · Vishal-S-P opened 3 months ago

Vishal-S-P commented 3 months ago

Hi,

I am training the SEDD-small model on the OpenWebText dataset. After a few iterations, both the training and evaluation losses explode and training becomes unstable. Has anyone encountered this issue before?

[image]