rentruewang / koila

Prevent PyTorch's `CUDA error: out of memory` in just 1 line of code.
https://koila.rentruewang.com
MIT License

Compatibility with PyTorch hooks. #15

Closed by feifeibear 2 years ago

feifeibear commented 2 years ago

Hello, I find this project interesting. However, the lazy tensor mechanism does not seem able to work with PyTorch backward hooks, which makes it difficult to combine with PyTorch checkpointing (https://pytorch.org/docs/stable/checkpoint.html). Checkpointing is a common way to avoid OOM during training.
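For reference, here is a minimal sketch of the combination in question, written against plain PyTorch tensors only (koila is not involved). It registers a backward hook on an input and runs a small block under `torch.utils.checkpoint.checkpoint`. The layer sizes and the names `block` and `seen_grad_shapes` are made up for illustration, and `use_reentrant=False` assumes a reasonably recent PyTorch release.

```python
import torch
from torch.utils.checkpoint import checkpoint

# Small block whose intermediate activations are recomputed during
# backward instead of being kept in memory (illustrative sizes).
block = torch.nn.Sequential(
    torch.nn.Linear(8, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 4),
)

x = torch.randn(2, 8, requires_grad=True)

# Backward hook on a tensor: called with the gradient once it is computed.
# list.append returns None, so the gradient is left unchanged.
seen_grad_shapes = []
x.register_hook(lambda grad: seen_grad_shapes.append(tuple(grad.shape)))

# Run the block under checkpointing, then backpropagate.
out = checkpoint(block, x, use_reentrant=False)
out.sum().backward()
```

With ordinary tensors the hook fires during `backward()` and `x.grad` is populated. The difficulty described above is getting the same behavior when the inputs are lazy tensors.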

rentruewang commented 2 years ago

Hi, sorry for the late reply!

It does seem that there's no way as of now to make this compatible with checkpointing. Perhaps in the future I'll figure out a way to do this. I'll leave this issue open in case something comes up.

feifeibear commented 2 years ago

Thanks! Good job anyway. I've closed the issue.