rentruewang / koila

Prevent PyTorch's `CUDA error: out of memory` in just 1 line of code.
https://koila.rentruewang.com
MIT License

RecursionError: maximum recursion depth exceeded while calling a Python object #3

Closed diggerdu closed 2 years ago

diggerdu commented 2 years ago
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 572, in lazy_forward
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 420, in __torch_function__
    return lazy_forward(func, shape_impl, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 572, in lazy_forward
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 420, in __torch_function__
    return lazy_forward(func, shape_impl, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 572, in lazy_forward
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 420, in __torch_function__
    return lazy_forward(func, shape_impl, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 572, in lazy_forward
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 420, in __torch_function__
    return lazy_forward(func, shape_impl, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 572, in lazy_forward
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 420, in __torch_function__
    return lazy_forward(func, shape_impl, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 572, in lazy_forward
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 420, in __torch_function__
    return lazy_forward(func, shape_impl, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 572, in lazy_forward
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 420, in __torch_function__
    return lazy_forward(func, shape_impl, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 572, in lazy_forward
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 408, in __torch_function__
    if not builtins.all(
  File "/opt/conda/lib/python3.8/site-packages/koila/tensors.py", line 409, in <genexpr>
    issubclass(typ, (LazyTensor, Tensor, int, float, bool)) for typ in types
  File "/opt/conda/lib/python3.8/abc.py", line 102, in __subclasscheck__
    return _abc_subclasscheck(cls, subclass)
RecursionError: maximum recursion depth exceeded while calling a Python object
rentruewang commented 2 years ago

Hi, thanks for this great bug report. I've not tested this code in evaluation mode.

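This bug happens because torch.* functions call __torch_function__ whenever one of their arguments is a LazyTensor. LazyTensor is designed to evaluate eagerly in evaluation mode, so lazy_forward calls the torch.* function right away, but with the LazyTensor arguments still in place. That call dispatches straight back to __torch_function__, and the loop repeats until Python raises a recursion error.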

I don't think the fix in #4 is ideal, however, because the code should evaluate eagerly in evaluation mode (where OOM almost never happens because the graph does not need to be saved). So the 'correct' fix would be to run every arg and kwarg, so that each LazyTensor is replaced by its computed value, before passing them to torch.* functions.
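
To make the loop in the traceback and the proposed fix concrete, here is a minimal torch-free sketch. It is a toy model, not koila's code: `Lazy`, `run`, `make_op`, `buggy_handler` and `fixed_handler` are hypothetical names, and `make_op` only imitates the way torch.* functions hand a call over to __torch_function__ when they see a lazy argument.

```python
class Lazy:
    """Hypothetical stand-in for LazyTensor: holds a value until run() is called."""

    def __init__(self, value):
        self.value = value

    def run(self):
        return self.value


def run(x):
    """Evaluate x if it is Lazy; return anything else unchanged."""
    return x.run() if isinstance(x, Lazy) else x


def make_op(func, handler):
    """Imitate torch.* dispatch: any Lazy argument routes the call to `handler`."""

    def op(*args, **kwargs):
        if any(isinstance(a, Lazy) for a in (*args, *kwargs.values())):
            return handler(op, args, kwargs)
        return func(*args, **kwargs)

    return op


def buggy_handler(op, args, kwargs):
    # Eager path as in the traceback: the op is called again with the
    # Lazy arguments untouched, so dispatch re-enters this handler forever.
    return op(*args, **kwargs)


def fixed_handler(op, args, kwargs):
    # Proposed fix: run every arg and kwarg first, then call the op
    # on plain values, which no longer triggers dispatch.
    args = tuple(run(a) for a in args)
    kwargs = {key: run(value) for key, value in kwargs.items()}
    return op(*args, **kwargs)


def plain_add(a, b):
    return a + b


buggy_add = make_op(plain_add, buggy_handler)
fixed_add = make_op(plain_add, fixed_handler)

try:
    buggy_add(Lazy(1), Lazy(2))
    buggy_recursed = False
except RecursionError:
    buggy_recursed = True

fixed_result = fixed_add(Lazy(1), b=Lazy(2))
print(buggy_recursed, fixed_result)
```

The only difference between the two handlers is whether the arguments are evaluated before the second call, which is the change described above.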

I'll fix it when I can, but for now I'm going to close #4.