Open vodnikss opened 5 months ago
Hello! I've encountered the same issue. It's possible that the error is in the ctypes part of the code, where the object's reference count is supposed to be decremented.
I tracked down the "immortal" objects using tracemalloc. Running similar code in a loop of 100,000 iterations produces the following top-10 list:
[ Top 10 ]
/miniforge3/envs/env/lib/python3.11/ctypes/__init__.py:512: size=30.5 MiB, count=200001, average=160 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_dlpack.py:141: size=19.1 MiB, count=200014, average=100 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_dlpack.py:135: size=13.0 MiB, count=100019, average=136 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_dlpack.py:140: size=13.0 MiB, count=100000, average=136 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_dlpack.py:137: size=13.0 MiB, count=100000, average=136 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_shared_memory_tensor.py:65: size=8597 KiB, count=200000, average=44 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_shared_memory_tensor.py:75: size=6250 KiB, count=100000, average=64 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_shared_memory_tensor.py:74: size=6250 KiB, count=99999, average=64 B
/miniforge3/envs/env/lib/python3.11/site-packages/torch/utils/dlpack.py:121: size=25.0 KiB, count=483, average=53 B
/miniforge3/envs/env/lib/python3.11/tracemalloc.py:505: size=1400 B, count=25, average=56 B
Several lines hold a number of live objects equal to the iteration count (100,000) or about twice that (200,000), so something is being retained on every iteration.
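For anyone who wants to take the same kind of measurement, here is a minimal sketch of the tracemalloc approach. The names top_allocations and leaky_workload are hypothetical and only stand in for the real loop around torch.from_dlpack; the stand-in workload leaks on purpose so that a per-iteration count shows up.

```python
import tracemalloc

# Hypothetical stand-in for the real workload: it keeps one buffer alive
# per call, so the allocation count grows with the number of iterations.
_retained = []


def leaky_workload():
    _retained.append(bytearray(256))


def top_allocations(workload, iterations=1000, limit=10):
    """Run workload repeatedly and return the largest live allocations by line."""
    tracemalloc.start()
    try:
        for _ in range(iterations):
            workload()
        # The snapshot only contains blocks that are still allocated.
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    return snapshot.statistics("lineno")[:limit]


print("[ Top 10 ]")
for stat in top_allocations(leaky_workload):
    print(stat)
```

A line whose count tracks the iteration count, as in the list above, is the one to look at first.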
The issue was resolved by removing (here, commenting out) the line marked "problem line" in _dlpack.py:
# Use as managed context in DLPack that doesn't hold ownership of the
# data content.
class DataViewContext:
    def __init__(self, shape) -> None:
        # Convert the Python object to ctypes objects expected by
        # DLPack
        self._shape = (ctypes.c_int64 * len(shape))(*shape)
        # No strides: compact and row-major
        self._strides = ctypes.POINTER(ctypes.c_int64)()

    def as_manager_ctx(self) -> ctypes.c_void_p:
        py_obj = ctypes.py_object(self)
        py_obj_ptr = ctypes.pointer(py_obj)
        ctypes.pythonapi.Py_IncRef(py_obj)
        # ctypes.pythonapi.Py_IncRef(ctypes.py_object(py_obj_ptr))  # problem line
        return ctypes.cast(py_obj_ptr, ctypes.c_void_p)
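To illustrate what the extra call does, here is a small standalone sketch using only ctypes and sys. Payload and refcount_of_ctx are hypothetical names, not part of tritonclient, and the sketch does not model the DLPack deleter, so it only shows that the second Py_IncRef leaves the pointer object with one more reference than it would otherwise have. Whether the upstream deleter was meant to balance that reference is an assumption I have not verified.

```python
import ctypes
import sys


class Payload:
    """Hypothetical stand-in for DataViewContext."""


def refcount_of_ctx(extra_incref: bool) -> int:
    """Build the context pointer and report the pointer object's refcount."""
    obj = Payload()
    py_obj = ctypes.py_object(obj)
    py_obj_ptr = ctypes.pointer(py_obj)
    # Keep the payload alive for the consumer (same call as in _dlpack.py).
    # This reference is deliberately not released in this sketch.
    ctypes.pythonapi.Py_IncRef(py_obj)
    if extra_incref:
        # The "problem line": adds a reference to the ctypes pointer object
        # itself. If nothing calls Py_DecRef on it later, it is never freed.
        ctypes.pythonapi.Py_IncRef(ctypes.py_object(py_obj_ptr))
    return sys.getrefcount(py_obj_ptr)


without_extra = refcount_of_ctx(extra_incref=False)
with_extra = refcount_of_ctx(extra_incref=True)
print("extra references held:", with_extra - without_extra)
```

In this sketch the pointer object created with the extra call can never be collected, because nothing releases that reference. That would be consistent with the per-iteration counts reported at ctypes/__init__.py:512 in the list above.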
After that change, the same run gives the following top-10 list, where the counts no longer scale with the number of iterations:
[ Top 10 ]
/miniforge3/envs/env/lib/python3.11/site-packages/torch/utils/dlpack.py:121: size=25.2 KiB, count=486, average=53 B
/miniforge3/envs/env/lib/python3.11/ctypes/__init__.py:512: size=19.4 KiB, count=125, average=159 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_dlpack.py:141: size=14.7 KiB, count=138, average=109 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_dlpack.py:135: size=11.1 KiB, count=81, average=141 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_dlpack.py:140: size=8432 B, count=62, average=136 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_dlpack.py:137: size=8432 B, count=62, average=136 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_shared_memory_tensor.py:65: size=5456 B, count=124, average=44 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_shared_memory_tensor.py:75: size=3968 B, count=62, average=64 B
/miniforge3/envs/env/lib/python3.11/site-packages/tritonclient/utils/_shared_memory_tensor.py:74: size=3968 B, count=62, average=64 B
/miniforge3/envs/env/lib/python3.11/tracemalloc.py:505: size=1400 B, count=25, average=56 B
Hello, a memory leak was detected when executing this code. The code was run on Python 3.10, triton-client 2.41.1, torch 2.1.2.
The leak occurs when the dlpack function is called in torch.from_dlpack(smt).