Open: rmira-sony opened this issue 1 year ago
Cool, thanks for reporting. Does the same thing happen if you use `.VideoReader` and `.AudioReader` separately?
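Something along these lines should isolate the two readers; the file path, indices, and iteration counts below are just placeholders:

```python
from decord import VideoReader, AudioReader, cpu

path = "sample.mp4"  # placeholder clip; use whatever reproduces the leak

# Video-only path: repeatedly decode batches and watch memory usage.
vr = VideoReader(path, ctx=cpu(0))
for _ in range(200):
    frames = vr.get_batch(list(range(32))).asnumpy()

# Audio-only path: same idea with AudioReader.
ar = AudioReader(path, ctx=cpu(0), sample_rate=16000, mono=True)
for _ in range(200):
    samples = ar[0:16000].asnumpy()
```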
Hi, I am running into memory leaks using `VideoReader`, specifically with `get_batch()`. I am trying to multithread, but the same thing happens single-threaded. `tracemalloc` is pointing to `decord/_ffi/ndarray.py`, specifically `asnumpy()`:
```python
np_arr = np.empty(shape, dtype=dtype)
assert np_arr.flags['C_CONTIGUOUS']
data = np_arr.ctypes.data_as(ctypes.c_void_p)
```
The amount of memory used increments by the size of `np_arr` on every call, so it seems as though the garbage collector is never releasing these buffers? I'm not sure. I'd be happy to help but got stuck on this.
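For reference, a stripped-down loop like this (the path, indices, and iteration count are illustrative, not my real pipeline) is enough to see the growth under tracemalloc:

```python
import tracemalloc

import numpy as np
from decord import VideoReader, cpu

tracemalloc.start()
vr = VideoReader("sample.mp4", ctx=cpu(0))  # placeholder clip
indices = np.arange(64).tolist()

for i in range(500):
    frames = vr.get_batch(indices).asnumpy()  # (N, H, W, 3) uint8 frames
    del frames  # dropping the reference does not give the memory back
    if i % 100 == 0:
        current, peak = tracemalloc.get_traced_memory()
        print(f"iter {i}: current {current / 1e6:.1f} MB, peak {peak / 1e6:.1f} MB")
```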
I'm hitting the same issue. #323
Hi,
Long-time user here, love the package, kudos to the contributors :)
I was trying out AVReader since I work with audiovisual data loaders. However, I've found that it leaks memory: using it rather than VideoReader results in climbing system memory usage until the program eventually crashes. To be clear, this does not happen with VideoReader using the same code on the same system.
Here's the wandb plot of system memory usage (pink is AVReader, orange is VideoReader):
To be more precise, the only difference between the two runs is whether the dataset's `__getitem__` decodes with AVReader (the pink line) or with VideoReader (the orange line); a rough sketch of the two variants follows.
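A minimal sketch of the two dataset variants (the class name, paths, sample rate, and clip length here are illustrative; this is not the exact code behind the plots):

```python
from torch.utils.data import Dataset
from decord import AVReader, VideoReader, cpu

class ClipDataset(Dataset):
    def __init__(self, paths, use_av=True):
        self.paths = paths
        self.use_av = use_av  # True -> pink line (AVReader), False -> orange line (VideoReader)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        path = self.paths[idx]
        indices = list(range(16))  # illustrative clip length
        if self.use_av:
            # Pink line: AVReader yields audio samples aligned with the requested frames.
            av = AVReader(path, ctx=cpu(0), sample_rate=16000)
            audio, video = av.get_batch(indices)
            return audio, video.asnumpy()
        # Orange line: VideoReader only, same indices.
        vr = VideoReader(path, ctx=cpu(0))
        return vr.get_batch(indices).asnumpy()
```

Everything around it (the standard DataLoader, 8 workers, batch size) is identical between the two runs; only the reader changes.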
It took me a while to diagnose this, so I'm hoping it can help solve this issue. Unfortunately, I'm not really familiar enough with the code to suggest a solution via pull request, so for now I'll stick to VideoReader. Thanks for reading!
PS: This happens with workers>0 and also with workers=0, so the root cause is probably not the usual multiprocessing conflicts in pytorch dataloaders.
PPS: I'm using the latest versions of decord, torch and torchaudio, with a standard torch dataset class and dataloader with 8 workers.