PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
Apache License 2.0
38.99k stars 7.32k forks

Batch processing using paddleocr #12012

Closed (saanvib13 closed 2 weeks ago)

saanvib13 commented 2 weeks ago

I am using the PaddleOCR library to detect and extract tables from images. First, I run layout detection on an image; if the layout detector finds a table, I send the image to PaddleOCR. However, PaddleOCR does not natively support batch processing: it processes one image at a time, so execution time scales linearly with the number of images. Is there any way to process multiple images simultaneously on the GPU with PaddleOCR so that execution time does not scale linearly with the number of images? Or are there any open-source alternatives to PaddleOCR for this task?

I am trying to detect tables in images and extract them to a CSV file, and I want to implement batch processing so that GPU utilization is efficient and execution time is minimized.

changdazhou commented 2 weeks ago

We do not support it. We will evaluate your needs and consider whether to add this feature.
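
For readers with the same pipeline, here is a minimal application-level sketch of the flow described in the question: filter images with a layout detector, then hand the remaining images to a recognizer in fixed-size groups. It does not use any PaddleOCR API. `has_table` and `recognize_batch` are hypothetical callables that you would supply yourself, and `extract_tables` and `chunked` are illustrative names. Grouping alone does not remove the linear scaling; it only helps if the recognizer you plug in can actually process a whole group in one GPU call.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def extract_tables(
    images: Sequence[T],
    has_table: Callable[[T], bool],               # hypothetical layout-detection check
    recognize_batch: Callable[[List[T]], List[R]],  # hypothetical batch-capable recognizer
    batch_size: int = 8,
) -> List[R]:
    """Run the recognizer only on images that contain a table, one group at a time."""
    # Step 1: keep only the images in which the layout detector finds a table.
    candidates = [image for image in images if has_table(image)]

    # Step 2: pass the remaining images to the recognizer in groups.
    results: List[R] = []
    for batch in chunked(candidates, batch_size):
        results.extend(recognize_batch(batch))
    return results
```

If the recognizer only accepts one image per call, `recognize_batch` can simply loop over the group; the code still works, but the run time will scale with the number of images as before.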