Hi,
Long-time user here, love the package, kudos to the contributors :)
I was trying out AVReader since I work with audiovisual data loaders. However, I've found that it leaks memory: using it instead of VideoReader causes system memory usage to climb until the program eventually crashes. To be clear, this does not happen with VideoReader running the same code on the same system.
Here's the wandb plot of system memory usage (pink is AVReader, orange is VideoReader):

![image](https://github.com/dmlc/decord/assets/146049730/ea345ef2-1bd8-4f3c-b18e-6c543483a57b)
To be more precise, the code for the pink line is:
and the code for the orange line is:
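(The original snippets were not included in this extract. As a rough sketch of the usage pattern being compared, with file paths, frame indexing, and the `Dataset` structure being my assumptions rather than the original code:)

```python
# Hypothetical sketch of the two dataloader setups being compared;
# paths, frame sampling, and Dataset layout are assumptions, not the
# original code from this report.
from torch.utils.data import Dataset, DataLoader
from decord import AVReader, VideoReader, cpu

class AVClipDataset(Dataset):
    """Pink line: decodes audio + video with AVReader in __getitem__."""
    def __init__(self, paths):
        self.paths = paths

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        av = AVReader(self.paths[idx], ctx=cpu(0))
        # AVReader indexing yields an (audio_samples, video_frame) pair
        audio, frame = av[0]
        return audio.asnumpy(), frame.asnumpy()

class VideoClipDataset(Dataset):
    """Orange line: same structure, but video-only via VideoReader."""
    def __init__(self, paths):
        self.paths = paths

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        vr = VideoReader(self.paths[idx], ctx=cpu(0))
        return vr[0].asnumpy()

# Swapping AVClipDataset for VideoClipDataset is the only change
# between the two runs:
# loader = DataLoader(AVClipDataset(video_paths), batch_size=4, num_workers=8)
```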
It took me a while to diagnose this, so I hope these details help in solving the issue. Unfortunately, I'm not familiar enough with the codebase to propose a fix via pull request, so for now I'll stick with VideoReader. Thanks for reading!
PS: This happens both with `num_workers > 0` and with `num_workers = 0`, so the root cause is probably not the usual multiprocessing conflicts in PyTorch dataloaders.

PPS: I'm using the latest versions of decord, torch, and torchaudio, with a standard torch `Dataset` class and a `DataLoader` with 8 workers.