Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
Apache License 2.0
PPStructure: unable to recognize a fairly easy structure #12036
I am trying to parse this PDF using PaddleOCR 2.7.3.
I converted the pages to images and then ran PPStructure on them, using the following options:
but the results on the second page of the document are not satisfactory:
I also tried the ppyolov2_r50vd_dcn_365e_publaynet model, but the program stops with an error:
InvalidArgumentError: The size of Op(Conv) inputs should not be 0.
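For reference, the workflow described above (render each PDF page to an image, then run PPStructure on it) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code used in this issue: the helper names (`render_pdf_pages`, `is_usable_image`, `analyze_pages`, `run`), the use of PyMuPDF for rendering, the zoom factor, and the `document.pdf` path are all illustrative. The empty-image guard only rules out one conceivable cause of a zero-sized input reaching the model; the issue does not establish what actually triggered the error.

```python
from typing import Callable, Iterable, List, Tuple


def render_pdf_pages(pdf_path: str, zoom: float = 2.0) -> list:
    """Render each PDF page to a BGR uint8 array (assumes PyMuPDF and NumPy)."""
    import fitz  # PyMuPDF, imported lazily so the helpers below work without it
    import numpy as np

    pages = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # zoom > 1 raises the render resolution; 2.0 is an arbitrary choice
            pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom), alpha=False)
            rgb = np.frombuffer(pix.samples, dtype=np.uint8)
            rgb = rgb.reshape(pix.height, pix.width, pix.n)
            # reverse the channel axis to get the BGR order that cv2.imread yields
            pages.append(np.ascontiguousarray(rgb[:, :, ::-1]))
    return pages


def is_usable_image(img) -> bool:
    """True for a non-empty H x W x C array-like (duck-typed on ndim and size)."""
    return getattr(img, "ndim", 0) == 3 and getattr(img, "size", 0) > 0


def analyze_pages(
    pages: Iterable, engine: Callable
) -> List[Tuple[int, List[str]]]:
    """Run the engine on each usable page and collect the region types found."""
    results = []
    for index, img in enumerate(pages):
        if not is_usable_image(img):
            # skip empty or malformed inputs instead of passing them to the model
            results.append((index, []))
            continue
        regions = engine(img)  # a list of dicts, one per detected region
        results.append((index, [r.get("type", "unknown") for r in regions]))
    return results


def run(pdf_path: str = "document.pdf") -> None:
    """Hypothetical entry point; the path is a placeholder."""
    from paddleocr import PPStructure

    engine = PPStructure(show_log=False)
    for index, types in analyze_pages(render_pdf_pages(pdf_path), engine):
        print(f"page {index}: {types}")
```

Printing the region types per page makes it easier to see whether the second page fails at layout detection (regions missing or mislabeled) or later, during recognition of the content inside each region.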
Any suggestions on how to parse this PDF correctly?
Thank you!