patrick-kidger / NeuralCDE

Code for "Neural Controlled Differential Equations for Irregular Time Series" (NeurIPS 2020 Spotlight)
Apache License 2.0

underflow error #2

Closed: rubick1896 closed this issue 4 years ago

rubick1896 commented 4 years ago

I have been getting "underflow in dt 0.0" after a few epochs of training. I am using Adam and a learning rate of 1e-5 (decreased from 1e-3; still not working). Any idea why this is happening? Do you have any suggestions for avoiding this type of error?

cam1681 commented 4 years ago

> I have been getting "underflow in dt 0.0" after a few epochs of training. I am using Adam and a learning rate of 1e-5 (decreased from 1e-3; still not working). Any idea why this is happening? Do you have any suggestions for avoiding this type of error?

This type of error is caused by "blow up" of the ordinary differential equations. If you use Adam-type methods, the step size will become too small to continue, and the code will return the "underflow" error.

rubick1896 commented 4 years ago

> …the step size will become too small to continue…

What exactly does "blow up" mean? What can I do to prevent this error from happening?

cam1681 commented 4 years ago

> …the step size will become too small to continue…
>
> What exactly does "blow up" mean? What can I do to prevent this error from happening?

It is a problem with the stability of the ordinary differential equations (I will call them ODEs for simplicity). If you want to prevent it, there are two common ways: (1) Make the ODEs more stable. Sorry, I am busy right now, but if you use "blow up ODEs" as keywords in Google, the results on the first page should give you a rough idea. (2) Use a fixed-step numerical method (RK4 etc.); then even if the numbers become very large, "dt" will not become too small to continue. The side effect is that it will return NaN if you have a large integration interval, so you may want to avoid some points or keep your integration interval small. We met the same problem in our work (https://arxiv.org/pdf/2005.04849.pdf), where we use a small interval and RK4 to avoid both the blow up and "dt" becoming too small. (We don't discuss stability much in the article, so this is just an ad. :)
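To illustrate point (2) above, here is a toy sketch in plain Python (not code from this repository): a fixed-step RK4 integration of the blow-up ODE y' = y² never shrinks dt, so instead of a dt-underflow error the solution simply overflows to inf:

```python
import math

def rk4_fixed(f, y0, t0, t1, dt):
    """Classic fixed-step RK4.  dt never changes, so a blow-up in the
    solution shows up as inf/NaN in y rather than a dt underflow."""
    t, y = t0, y0
    while t < t1 - 1e-12:
        k1 = f(y)
        k2 = f(y + 0.5 * dt * k1)
        k3 = f(y + 0.5 * dt * k2)
        k4 = f(y + dt * k3)
        y = y + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return y

# y' = y**2 with y(0) = 1 blows up at t = 1 (exact solution 1/(1 - t)),
# so integrating past t = 1 with a fixed step overflows:
y_end = rk4_fixed(lambda y: y * y, 1.0, 0.0, 2.0, 0.1)
print(math.isfinite(y_end))  # False: the solution blew up
```

The solver still "succeeds" in the sense that it runs to the end of the interval, which is why this failure mode surfaces as NaN/inf in the loss rather than as an error message.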

patrick-kidger commented 4 years ago

First of all, I just want to make clear that this isn't an issue with the optimiser or the learning rate, despite what the previous commenter says. I'm not sure I believe their comments about the numerical methods either, but that is at least related to how to fix this.


In terms of what's going wrong: you're solving the CDE with an adaptive solver that accepts very little error. Whatever CDE you have defined is too hard to solve whilst making only that small an error, and this underflow appears as a result.
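To see concretely how an adaptive solver ends up underflowing, here is a deliberately simplified toy (a step-doubling adaptive Euler method, not torchdiffeq's actual step-size controller) applied to the blow-up ODE y' = y²: the error estimate near the blow-up forces dt to shrink without bound, which is exactly the "underflow in dt" failure:

```python
def adaptive_euler(f, y0, t0, t1, tol=1e-4, dt0=0.1, min_dt=1e-8):
    """Toy adaptive solver with step-doubling error control.  If the
    estimated local error exceeds tol, the step is rejected and dt is
    halved; on a blowing-up ODE, dt shrinks until it crosses min_dt."""
    t, y, dt = t0, y0, dt0
    while t < t1:
        full = y + dt * f(y)                  # one Euler step of size dt
        half = y + 0.5 * dt * f(y)
        two_half = half + 0.5 * dt * f(half)  # two half-steps
        if abs(full - two_half) > tol:        # local error estimate
            dt *= 0.5                         # reject: shrink the step
            if dt < min_dt:
                raise ArithmeticError(f"underflow in dt {dt:g}")
            continue
        t, y = t + dt, two_half               # accept the step
        dt = min(1.5 * dt, t1 - t) if t < t1 else dt
    return y

# y' = y**2 blows up at t = 1, so solving over [0, 2] underflows dt:
try:
    adaptive_euler(lambda y: y * y, 1.0, 0.0, 2.0)
except ArithmeticError as e:
    print(e)  # underflow in dt ...
```

On a well-behaved ODE the same solver finishes without issue; it is only the combination of a hard-to-solve vector field and a tight error tolerance that triggers the underflow.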

So, to fix this, you've got a few options:

1. Make the CDE easier to solve, for example by changing the model so that the learnt vector field is better behaved.
2. Loosen the tolerances of the adaptive solver (its atol and rtol arguments), so that it is allowed to make a larger error on each step.
3. Switch to a fixed-step solver such as rk4, with an explicitly chosen step size.

Option 3 is actually what we use in the paper, where we take the step size to equal the smallest gap between observations.
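The "smallest gap between observations" step-size rule is easy to compute from the observation times. This is a small illustrative sketch (the helper name and data are made up, not taken from this repository):

```python
def smallest_gap(times):
    """Smallest spacing between consecutive observation times.  Using
    this as the fixed solver step size guarantees at least one solver
    step between any two observations."""
    assert len(times) >= 2 and sorted(times) == list(times)
    return min(b - a for a, b in zip(times, times[1:]))

# Irregularly sampled observation times (made-up example):
obs_times = [0.0, 0.3, 0.35, 1.0, 1.2]
step_size = smallest_gap(obs_times)
print(step_size)  # ~0.05
```

With torchdiffeq, a fixed-step solve would then look something like `odeint(func, y0, t, method='rk4', options=dict(step_size=step_size))`; check the torchdiffeq documentation for the exact argument names.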


It's worth noting that neither the problem nor the solutions are specific to Neural CDEs. They all apply to the original Neural ODEs as well! If you've used the torchdiffeq library before then the arguments used in options 2 and 3 should seem familiar, as that's what you would do in that library as well.