Closed sanjay-nit closed 1 month ago
After running it on a Lambda Labs instance, look at the memory usage.
Can anybody help me with this?
@sanjay-nit solved ? :)
Hi @felixdittrich92, after a whole day of debugging I finally found that the issue was with the garbage collector. 😃 Sorry I tagged you so many times.
Basically, I was using TemporaryDirectory() and, just for testing, I loaded 400 PIL images into a list. Somehow, after leaving the context manager, the gc was not able to free the memory. I still don't know why: even though I manually deleted all the variables and the list, the memory was not released.
In the end I stopped storing all the images in a list and ran OCR on them one by one instead, and then it worked.
Thanks a lot @felixdittrich92 and sorry :)
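For anyone landing here with the same symptom, the one-by-one approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the original report: `ocr_one_by_one`, `load_image`, `run_ocr` and `collect_every` are hypothetical names, and the loader and OCR call are passed in as plain callables so that nothing is assumed about the docTR or PIL APIs.

```python
import gc


def ocr_one_by_one(paths, load_image, run_ocr, collect_every=0):
    """Yield (path, result) pairs while keeping at most one image alive.

    load_image and run_ocr are caller-supplied callables (hypothetical
    parameters for this sketch); nothing here is specific to docTR.
    """
    for index, path in enumerate(paths, start=1):
        image = load_image(path)
        try:
            result = run_ocr(image)
        finally:
            # Release the image right away instead of collecting it in a list.
            close = getattr(image, "close", None)
            if callable(close):
                close()
            del image
        yield path, result
        # Optionally force a collection every `collect_every` images.
        if collect_every and index % collect_every == 0:
            gc.collect()
```

In practice `load_image` might open a file with PIL and `run_ocr` might wrap the predictor call. The point of the design is only that each image is dropped before the next one is loaded, so peak memory should stay close to the size of a single image plus its result.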
Good to see that you were able to solve it in the end 👍 :)
Bug description
I'm running docTR on a Google Colab GPU (T4). GPU memory usage stays constant, but CPU memory usage keeps growing when I run it in a loop over a list of images.
FYI: I've already tried the fix from https://github.com/mindee/doctr/discussions/1422, but it didn't help.
I'm using docTR with the PyTorch backend:
!pip install "python-doctr[torch]"
I'm using the environment listed below.
I've also tried https://github.com/felixdittrich92/OnnxTR?tab=readme-ov-file, but I ran into issues because it does not use the GPU.
Code snippet to reproduce the bug
Error traceback
There is no traceback: CPU memory usage keeps increasing until the process crashes.
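One way to put numbers on this kind of growth is to sample the process's peak resident memory while the loop runs. The sketch below uses only the standard library and is not part of the original report; `peak_rss_mib`, `log_memory_growth`, `process` and `every` are hypothetical names chosen for illustration, and the `resource` module is only available on Unix-like systems.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory_growth(items, process, every=50):
    """Call process(item) for each item; sample peak RSS every `every` items."""
    samples = []
    for index, item in enumerate(items, start=1):
        process(item)
        if index % every == 0:
            samples.append((index, peak_rss_mib()))
    return samples
```

If the sampled values keep climbing roughly in proportion to the number of processed images, something is probably holding on to per-image data. Note that this reports the peak, so it can show growth but never a decrease.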
Environment
DocTR version: v0.8.1
TensorFlow version: 2.15.0
PyTorch version: 2.2.1+cu121 (torchvision 0.17.1+cu121)
OpenCV version: 4.8.0
OS: Ubuntu 22.04.3 LTS
Python version: 3.10.12
Is CUDA available (TensorFlow): Yes
Is CUDA available (PyTorch): Yes
CUDA runtime version: 12.2.140
GPU models and configuration: GPU 0: Tesla T4
Nvidia driver version: 535.104.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.6
Additional information
torch 2.2.1+cu121
torchaudio 2.2.1+cu121
torchdata 0.7.1
torchsummary 1.5.1
torchtext 0.17.1
torchvision 0.17.1+cu121
Deep Learning backend
is_tf_available: False
is_torch_available: True