VikParuchuri / surya

OCR, layout analysis, reading order, table recognition in 90+ languages
https://www.datalab.to
GNU General Public License v3.0
14.29k stars 889 forks

Problem identifying images in PDF files #239

Open Wyzanezan opened 2 weeks ago

Wyzanezan commented 2 weeks ago

[Image: 20241107-163033]

The picture above is part of a PDF file. Layout recognition only recognizes a few numbers (190, 394, 266, 1670, 1000, 1200) and does not recognize the graphic itself. The recognized label is "Figure", but I think these graphics should be recognized as "Picture". How can I get these graphics recognized as the "Picture" type and have their shapes recognized?