During the run of the UniTS_pretrain_x128.sh script the loss value encountered nan.

mims-harvard / UniTS

A unified multi-task time series model.

https://zitniklab.hms.harvard.edu/projects/UniTS/

MIT License


During the run of the UniTS_pretrain_x128.sh script the loss value encountered nan. #17

Open linxi20 opened 2 months ago

linxi20 commented 2 months ago

Hello, thank you for your contributions. I tried to run the UniTS_pretrain_x128.sh script, but after a while the outputs became NaN, and the corresponding loss value also became NaN. When I reduce d_model to 64, the problem does not occur. What is the reason for this?

gasvn commented 2 months ago

That happens sometimes because co-training on cross-domain datasets is not always stable. This is not specific to time series; it also happens with foundation models in other fields. We rerun the experiments when we encounter NaN. You can also adjust the learning rate and use a smaller grad-clip value.
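For readers who want to try the grad-clip suggestion, here is a minimal PyTorch sketch of a training step that clips the gradient norm and skips the update when the loss or gradients are non-finite. It is not taken from the UniTS code: `safe_step`, the toy `nn.Linear` model, and the `lr` and `max_grad_norm` values are hypothetical placeholders, so adapt them to the actual training loop and script arguments.

```python
import torch
from torch import nn


def safe_step(model, optimizer, loss, max_grad_norm=1.0):
    """Apply one optimizer step; skip it if the loss or gradients are non-finite.

    Hypothetical helper for illustration. Returns True if the step was applied.
    """
    optimizer.zero_grad()
    if not torch.isfinite(loss):
        return False  # NaN/inf loss: do not backpropagate or update
    loss.backward()
    # clip_grad_norm_ rescales gradients in place and returns the pre-clip total norm
    total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    if not torch.isfinite(total_norm):
        optimizer.zero_grad()  # discard the bad gradients
        return False
    optimizer.step()
    return True


# Toy stand-ins for the real model and batch (assumptions, not UniTS code).
torch.manual_seed(0)
model = nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # placeholder learning rate
x, y = torch.randn(8, 4), torch.randn(8, 1)

# A normal batch: the step is applied with gradients clipped to norm 0.5.
applied = safe_step(model, optimizer, nn.functional.mse_loss(model(x), y), max_grad_norm=0.5)

# A batch that produces a NaN loss: the update is skipped, weights stay finite.
bad_loss = model(x).sum() * float("nan")
skipped_result = safe_step(model, optimizer, bad_loss)
```

Skipping a bad batch only keeps the weights from being overwritten with NaN. If the loss diverges repeatedly, a smaller learning rate or clip value, or a rerun as described above, is still needed.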