PaddlePaddle / PaddleOCR


Potential memory leak in PaddleOCR? #7823

Open · nikos-livathinos opened this issue 1 year ago

nikos-livathinos commented 1 year ago

I have noticed some unusual memory usage while evaluating the performance of PaddleOCR: resident memory keeps growing across successive OCR calls and is never released.

Eventually it is impossible to keep running a PaddleOCR process as a service because the system runs out of memory and the process is killed.

Can you provide insight into this memory usage pattern? Is there any remedy?

The following sections describe the tests in detail.

C++ Tests

Setup for the C++ tests

Methodology for the C++ tests

Results of the C++ tests

Test 1: Base - (memory usage chart: ppocr_supplier_bboxes_dt)
Test 2: Long run - (memory usage chart: ppocr_supplier_x8_bboxes_dt)
Test 3: No MKL - (memory usage chart: ppocr_supplier_bboxes_dt_nomkl)
Test 4: Det only - (memory usage chart: ppocr_supplier_det_bboxes_dt)
Test 5: Det + Rec - (memory usage chart: ppocr_supplier_det_rec_bboxes_dt)
Test 6: Det + Cls - (memory usage chart: ppocr_supplier_det_cls_bboxes_dt)
Test 7: Loop same image - (memory usage chart: ppocr_000AVY01_1320_bboxes_dt)

Python tests

Setup for the Python tests

Results of the Python tests - (memory usage chart: performance_supplier)

lucky2046 commented 1 year ago

This problem also exists when using GPU mode.

lucashu1 commented 1 year ago

@nikos-livathinos Did you manage to find any workaround for this? We're encountering the same issue as you.

We're also using paddleocr==2.6.0.1 on CPU; we're wondering whether upgrading would help fix this.

lucashu1 commented 1 year ago

Experiment 1

For debugging, we tried an experiment in Python where we loop over a set of images, and each time, create a new PaddleOCR object and then del it immediately after.

The code is something like this:

from paddleocr import PaddleOCR

lang = 'en'
for image_path in image_paths:
    # Create a fresh engine for each image, run OCR once, then drop it.
    ocr = PaddleOCR(lang=lang, show_log=False, use_angle_cls=False)
    _ = ocr.ocr(image_path, cls=False)
    del ocr

We were hoping that by deleting the PaddleOCR object each time, we could work around the memory leak issue by letting the garbage collector clear out any old memory usage after each call.

However, regardless of the language (en or another language), we get a memory usage chart that looks something like this:

(memory usage chart: memory grows roughly linearly with each iteration)

(Plotted using the mprof tool from https://github.com/pythonprofilers/memory_profiler, with the --include-children flag set.)

As you can see, the leaked memory seems to increase linearly with each new PaddleOCR object that's used. The used memory never gets cleaned up, even though the old PaddleOCR objects have been "deleted" in Python.
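One follow-up check (a minimal sketch, not one of the experiments above) would be to force a full garbage collection after each del, so that any growth that remains can be attributed to native allocations rather than to uncollected Python objects:

import gc

from paddleocr import PaddleOCR

lang = 'en'
for image_path in image_paths:
    ocr = PaddleOCR(lang=lang, show_log=False, use_angle_cls=False)
    _ = ocr.ocr(image_path, cls=False)
    # Drop the engine and force a full collection; if memory still grows
    # after this, the leak is in native (C++) allocations that the Python
    # garbage collector cannot reclaim.
    del ocr
    gc.collect()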

Experiment 2

If we use Python to call the CLI instead, we don't get a memory leak, since a fresh process is spawned for each call and exits immediately after the OCR is done.

import subprocess

lang = 'en'
for image_path in image_paths:
    # Each call runs the paddleocr CLI in a separate process, so any
    # memory it allocates is released when that process exits.
    subprocess.check_output(
        ["paddleocr", "--image_dir", image_path, "--lang", lang]
    )

Sample results:

(memory usage chart: no cumulative growth across iterations)
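Building on that observation, a possible workaround (just a sketch, not something we have productionized) is to keep the Python API but run each call in a short-lived child process, so the leaked memory is returned to the OS when the child exits:

from multiprocessing import get_context

def _ocr_worker(image_path, lang='en'):
    # Import and construct the engine inside the child process so that
    # whatever it leaks lives only as long as the child.
    from paddleocr import PaddleOCR
    ocr = PaddleOCR(lang=lang, show_log=False, use_angle_cls=False)
    return ocr.ocr(image_path, cls=False)

def ocr_isolated(image_path, lang='en'):
    # A 'spawn' context gives a clean interpreter per call; all memory is
    # reclaimed by the OS when the pool worker exits.
    with get_context('spawn').Pool(processes=1) as pool:
        return pool.apply(_ocr_worker, (image_path, lang))

The per-call process startup and model load add latency, so this trades throughput for bounded memory.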

Experiment 3

If we don't del the PaddleOCR objects and simply use a single PaddleOCR object to iterate over the images, we get results similar to those shown by @nikos-livathinos.

from paddleocr import PaddleOCR

lang = 'en'
# Reuse one engine for every image, as a long-running service would.
ocr = PaddleOCR(lang=lang, show_log=False, use_angle_cls=False)
for image_path in image_paths:
    _ = ocr.ocr(image_path, cls=False)

Sample results: (memory usage chart showing the same steady growth)
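For reference, the per-call growth can also be measured from inside Python rather than with mprof; a rough sketch using memory_profiler's Python API (image_paths is the same list of image file paths as above, and the exact return type of max_usage has varied between memory_profiler versions):

from memory_profiler import memory_usage
from paddleocr import PaddleOCR

ocr = PaddleOCR(lang='en', show_log=False, use_angle_cls=False)
for image_path in image_paths:
    # Peak RSS (in MiB) observed while this single call runs;
    # include_children mirrors the flag used for the mprof charts above.
    peak = memory_usage((ocr.ocr, (image_path,), {'cls': False}),
                        max_usage=True, include_children=True)
    print(image_path, 'peak RSS (MiB):', peak)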

Other info

System info:

OS: CentOS Linux 7 (Core)
python: 3.9.0

-----

pip install info:

paddleocr: 2.6.0.1 (same issue occurs with 2.6.1.2)
paddlepaddle: 2.4.1

-----

lscpu output (no GPUs):

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                40
On-line CPU(s) list:   0-39
Thread(s) per core:    2
Core(s) per socket:    10
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Stepping:              1
CPU MHz:               2934.899
CPU max MHz:           3100.0000
CPU min MHz:           1200.0000
BogoMIPS:              4400.03
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              25600K
NUMA node0 CPU(s):     0-9,20-29
NUMA node1 CPU(s):     10-19,30-39
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_ppin intel_pt ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts spec_ctrl intel_stibp flush_l1d

@littletomatodonkey @LDOUBLEV Let me know if there's any information I can provide to the owners/maintainers of this project to help fix this memory leak. We're trying to deploy PaddleOCR as a service, and the leak is really hindering our ability to do so.

Thanks!
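One stopgap for service deployments, in the same spirit as the CLI experiment above (process recycling rather than an actual fix), is to bound how long any worker process lives. A sketch of a gunicorn config, with purely illustrative values:

# gunicorn.conf.py (illustrative values only)
workers = 2
max_requests = 100        # recycle a worker after it has served 100 requests
max_requests_jitter = 20  # stagger recycling so workers don't restart at once
timeout = 120             # allow slow OCR requests to finish before timing out

The leaked memory is then reclaimed each time a worker is recycled, at the cost of reloading the models.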

Siegi96 commented 1 year ago

Any news on this?

Mangoboo commented 1 year ago

Any news on this? I have the same problem running OCR on CPU/GPU in a similar environment. I've run it on AWS EC2 instances (t3.medium, g4dn.xlarge) and on my local machine (CPU), and memory grows without bound.