nikos-livathinos opened this issue 2 years ago
This problem also exists when using GPU mode.
@nikos-livathinos Did you manage to find any workaround for this? We're encountering the same issue as you.
We're also using paddleocr==2.6.0.1 on a CPU; wondering if upgrading would help fix this.
For debugging, we tried an experiment in Python where we loop over a set of images and, each time, create a new `PaddleOCR` object, then `del` it immediately after. The code is something like this:
```python
from paddleocr import PaddleOCR

lang = 'en'
for image_path in image_paths:
    ocr = PaddleOCR(lang=lang, show_log=False, use_angle_cls=False)
    _ = ocr.ocr(image_path, cls=False)
    del ocr
```
We were hoping that by deleting the `PaddleOCR` object each time, we could work around the memory leak by letting the garbage collector clear out any old memory usage after each call. However, regardless of the language (`en` or any other), we get a memory usage chart that looks something like this:
(Plotted using the `mprof` tool from https://github.com/pythonprofilers/memory_profiler, with the `--include_children` flag set.)
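If it is useful for reproduction, the same per-call measurement can also be taken from inside Python with memory_profiler's `memory_usage()` API instead of the mprof CLI. A minimal sketch, assuming `image_paths` is defined as in the snippets above:

```python
from memory_profiler import memory_usage
from paddleocr import PaddleOCR

def run_once(image_path, lang='en'):
    # Create, use, and delete a fresh PaddleOCR instance, as in the loop above.
    ocr = PaddleOCR(lang=lang, show_log=False, use_angle_cls=False)
    _ = ocr.ocr(image_path, cls=False)
    del ocr

for image_path in image_paths:
    # memory_usage returns a list of sampled memory values (in MiB) for the call,
    # including child processes.
    samples = memory_usage((run_once, (image_path,), {}),
                           include_children=True, interval=0.1)
    print(f"{image_path}: peak memory ~ {max(samples):.1f} MiB")
```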
As you can see, the leaked memory seems to increase linearly with each new `PaddleOCR` object that's used. The used memory never gets cleaned up, even though the old `PaddleOCR` objects have been `del`eted in Python.
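One extra diagnostic that might help narrow this down (a sketch we put together, assuming Linux with glibc and that `image_paths` is defined as above): force a Python garbage collection and ask the allocator to return freed pages after every iteration. If the resident set size still grows linearly, the growth is almost certainly in native (Paddle/C++) allocations rather than in Python objects that `del` failed to release.

```python
import ctypes
import gc

import psutil
from paddleocr import PaddleOCR

process = psutil.Process()
libc = ctypes.CDLL("libc.so.6")  # assumption: Linux with glibc

for image_path in image_paths:
    ocr = PaddleOCR(lang='en', show_log=False, use_angle_cls=False)
    _ = ocr.ocr(image_path, cls=False)
    del ocr

    gc.collect()         # reclaim any Python-level reference cycles
    libc.malloc_trim(0)  # ask glibc to hand freed heap pages back to the OS

    rss_mib = process.memory_info().rss / (1024 * 1024)
    print(f"RSS after {image_path}: {rss_mib:.1f} MiB")
```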
If we use Python to call the `paddleocr` CLI instead, we don't get a memory leak, since the process is spawned and killed immediately after the OCR call.
```python
import subprocess

lang = 'en'
for image_path in image_paths:
    subprocess.check_output(
        ["paddleocr", "--image_dir", image_path, "--lang", lang]
    )
```
Sample results:
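Separately, a middle ground we have been considering (sketched here, not tested at scale) is to keep the Python API but run each call in a short-lived worker process via the standard-library multiprocessing module, so any native memory held by Paddle is released when the worker exits. This assumes `image_paths` is defined as above and that the OCR results are picklable:

```python
import multiprocessing as mp

def ocr_worker(image_path, queue):
    # Import inside the worker so Paddle is loaded (and torn down)
    # entirely within the short-lived process.
    from paddleocr import PaddleOCR
    ocr = PaddleOCR(lang='en', show_log=False, use_angle_cls=False)
    queue.put(ocr.ocr(image_path, cls=False))

def ocr_in_subprocess(image_path):
    ctx = mp.get_context("spawn")  # don't fork a parent that may already hold Paddle state
    queue = ctx.Queue()
    proc = ctx.Process(target=ocr_worker, args=(image_path, queue))
    proc.start()
    result = queue.get()  # read before join() so a large result can't block the child
    proc.join()
    return result

if __name__ == "__main__":
    results = [ocr_in_subprocess(p) for p in image_paths]
```

The per-image process startup and model load add noticeable overhead, so this only pays off when bounded memory matters more than latency.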
If we don't `del` the `PaddleOCR` objects and simply use a single `PaddleOCR` object to iterate over the images, we get results similar to those shown by @nikos-livathinos.
```python
from paddleocr import PaddleOCR

lang = 'en'
ocr = PaddleOCR(lang=lang, show_log=False, use_angle_cls=False)
for image_path in image_paths:
    _ = ocr.ocr(image_path, cls=False)
```
Sample results:
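Separately, one more diagnostic sketch (ours, not part of the original report) that can help distinguish Python-heap growth from native growth in this single-object setup: track Python allocations with the standard-library tracemalloc alongside the process RSS from psutil. If the tracemalloc figure stays roughly flat while RSS keeps climbing, the leak most likely lives in the native Paddle/oneDNN code rather than in Python objects:

```python
import tracemalloc

import psutil
from paddleocr import PaddleOCR

tracemalloc.start()
process = psutil.Process()
ocr = PaddleOCR(lang='en', show_log=False, use_angle_cls=False)

for i, image_path in enumerate(image_paths):
    _ = ocr.ocr(image_path, cls=False)
    if i % 10 == 0:
        current, _peak = tracemalloc.get_traced_memory()  # Python-heap bytes only
        rss = process.memory_info().rss
        print(f"image {i}: python_heap={current / 2**20:.1f} MiB, rss={rss / 2**20:.1f} MiB")
```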
System info:
OS: CentOS Linux 7 (Core)
python: 3.9.0
-----
pip install info:
paddleocr: 2.6.0.1 (same issue occurs with 2.6.1.2)
paddlepaddle: 2.4.1
-----
lscpu output (no GPUs):
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Stepping: 1
CPU MHz: 2934.899
CPU max MHz: 3100.0000
CPU min MHz: 1200.0000
BogoMIPS: 4400.03
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-9,20-29
NUMA node1 CPU(s): 10-19,30-39
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_ppin intel_pt ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts spec_ctrl intel_stibp flush_l1d
@littletomatodonkey @LDOUBLEV Let me know if there's any information I can provide to help the owners/maintainers of this project fix this memory leak. We're trying to deploy PaddleOCR as a service, and this memory leak is really hindering our ability to do so.
Thanks!
Any news on this?
Any news on this? I also have the same problem running OCR on CPU/GPU in a similar environment. I ran it on AWS EC2 instances (t3.medium, g4dn.xlarge) and on my local machine (CPU), and in every case memory increases without bound.
In https://github.com/PaddlePaddle/PaddleOCR/issues/11639 , it is suggested that the memory leak problem will be resolved in oneDNN v3.4. Although https://github.com/PaddlePaddle/Paddle/pull/64661 has been applied in paddlepaddle v3.0.0, the leak issue persists even when using v3.0.0. If you have any insights regarding this, I would appreciate it if you could share them with me.
Same question: is this resolved in the latest version of PaddleOCR (PaddlePaddle 3.0.0b1 / PaddleOCR 2.8.1)?
No, it hasn't.
Tried the latest versions of both paddleocr and paddlepaddle (up to 3.0.0b1), but this issue persists.
+
Even if you instantiate the PaddleOCR class only once, you will still see memory increasing indefinitely.
Try paddlepaddle 2.5.2; that works.
But the latest version of paddlepaddle brought a significant improvement in prediction speed; because of the memory issue it is difficult to use it.
I am experiencing the same issue in an Azure environment; the only workaround is to restart the nodes (not even just the processes) every so often. Does anyone have other short- or long-term solutions?
Hi @pioardi, would you like to test the Paddle 3.0 beta to see if this issue is resolved? https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html
For future readers facing the same error: limiting to one process per node worked in my case. It is not very efficient, but at least it is working (for now).
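A related, hedged suggestion for anyone running this as a service: if restarting nodes is the only thing that works, recycling worker processes after a fixed number of tasks bounds the leak automatically. A minimal sketch using the standard-library multiprocessing.Pool with maxtasksperchild (the `ocr_image` helper and the placeholder `image_paths` below are hypothetical; adapt them to however you wrap PaddleOCR):

```python
import multiprocessing as mp

_ocr = None  # one PaddleOCR instance per worker process

def ocr_image(image_path):
    # Hypothetical per-worker helper: build the model lazily, once per worker,
    # then reuse it until the pool retires that worker.
    global _ocr
    if _ocr is None:
        from paddleocr import PaddleOCR
        _ocr = PaddleOCR(lang='en', show_log=False, use_angle_cls=False)
    return _ocr.ocr(image_path, cls=False)

if __name__ == "__main__":
    image_paths = ["img_001.png", "img_002.png"]  # placeholder paths
    # maxtasksperchild=50 replaces each worker after 50 images, so any leaked
    # native memory is returned to the OS at a bounded interval.
    with mp.Pool(processes=2, maxtasksperchild=50) as pool:
        results = pool.map(ocr_image, image_paths)
```

Most WSGI servers expose the same idea at the request level (for example gunicorn's --max-requests), which is essentially an automated version of the node restarts described above.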
I have noticed some weird memory usage when evaluating the performance of PaddleOCR:
Eventually it is impossible to keep running a PaddleOCR process as a service because the system runs out of memory and the process is killed.
Can you provide insights on this memory usage pattern? Do you have any remedy?
The following sections describe the tests in detail.
C++ Tests
Setup for the C++ tests
Test hardware:
C++ compiler: gcc v9.4.0
PaddleOCR: source code from v2.6.0
Paddle library:
Model files:
en_PP-OCRv3_det_infer.tar
ch_ppocr_mobile_v2.0_cls_infer.tar
en_PP-OCRv3_rec_infer.tar
OpenCV: compiled from the source code tagged at v4.6.0, with the parameters:

Methodology for the C++ tests
PaddleOCR v2.6.0 (https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.6/deploy/cpp_infer)
cmd arguments:

Results of the C++ tests
Test 1: Base
Test 2: Long run
Test 3: No MKL
Test 4: Det only
Test 5: Det + Rec
Test 6: Det + Cls
Test 7: Loop same image
Python tests
Setup for the Python tests
Test hardware:
paddlepaddle: v2.3.2
paddleocr: v2.6.0.1
Results of the Python tests