rtqichen / torchdiffeq

Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.
MIT License

Google Colab session crashes with GPU but not CPU #167

Closed MaricelaM closed 3 years ago

MaricelaM commented 3 years ago

Hello there,

I am using torchdiffeq on Google Colab with the default settings for odeint/odeint_adjoint; I've tried both odeint_adjoint and plain odeint. When I run my code on a GPU on Google Colab, the Colab session crashes on the very first call to odeint/odeint_adjoint, but weirdly this doesn't happen when running identical code on a CPU. On the CPU it runs with no problem for all the epochs.

I switched to the implicit_adams method in odeint_adjoint and now the code runs on the GPU. I am wondering if anybody has some insight into what might be happening.
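For context, here is a minimal sketch of the kind of call I'm making (the ODE function, sizes, and time grid below are placeholders, not my actual model):

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint

class ODEFunc(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(2, 2)

    def forward(self, t, y):
        return self.net(y)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
func = ODEFunc().to(device)
y0 = torch.randn(16, 2, device=device)         # placeholder initial state
t = torch.linspace(0., 1., 10, device=device)  # placeholder time grid

# With the default adaptive solver, this first call is where the GPU session crashes:
# y = odeint_adjoint(func, y0, t)

# Switching the method to implicit_adams runs fine on the GPU:
y = odeint_adjoint(func, y0, t, method="implicit_adams")
print(y.shape)  # (10, 16, 2)
```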

Thank you for reading!

MaricelaM commented 3 years ago

This seems to be caused by two things:

  1. Passing a tuple of tensors into odeint (the tuple is a Python-native object and isn't on the GPU); see the sketch at the end of this comment.
  2. In adaptive-step solvers, the Perturb handling inside _runge_kutta_step causes the issue.

It would be cool to be able to pass in a tuple of tensors and run on the GPU, which makes 1 a feature request.

For 2, this seems like a bug.
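To illustrate 1, here is a minimal sketch of the tuple-of-tensors call I mean (the dynamics here are placeholders, not my real code):

```python
import torch
from torchdiffeq import odeint

# The state is a tuple of two tensors; the function returns a matching tuple of derivatives.
def coupled_func(t, state):
    x, v = state
    return v, -x

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x0 = torch.randn(8, device=device)
v0 = torch.zeros(8, device=device)
t = torch.linspace(0., 1., 5, device=device)

# The tuple itself is a plain Python object; only its elements live on the GPU.
xs, vs = odeint(coupled_func, (x0, v0), t)
print(xs.shape, vs.shape)  # each is (5, 8)
```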

rtqichen commented 3 years ago

Thanks for the bug report. Tuple of tensors should definitely be supported, so I'll look into fixing this asap.

Do you have a minimal reproducible example (runnable on colab) that I can take a look at? Also let me know the PyTorch and torchdiffeq versions.