Is it possible to get bounding boxes and the corresponding text from the OCR results?

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model


Is it possible to get bounding boxes and the corresponding text from the OCR results? #24

Open YFCodeDream opened 2 months ago

YFCodeDream commented 2 months ago

[Image: v1_1_0]

For example, for the image above, I would like to obtain:

[
    {"text": "Solve the following equations:", "bbox": "[<x1>, <y1>, <x2>, <y2>]"},
    {"text": "1) 8x + 11 = 4x + 14", "bbox": "[<x1>, <y1>, <x2>, <y2>]"},
    {"text": "2) 7d - 4 = 11d - 9", "bbox": "[<x1>, <y1>, <x2>, <y2>]"}
]

Can this tool produce JSON data like this? So far I have not found such a feature in GOT/demo/run_ocr_2.0.py. Looking forward to your reply!

Ucas-HaoranWei commented 2 months ago

The current version has no detection capability; you would need to post-train the model on data of this kind to get it.
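
As an illustration only (not taken from the repository), "data of this kind" could mean image annotations serialized into the JSON layout requested above. The sketch below is a guess at how such a target string might be built from line-level annotations. The `LineAnnotation` class, the `to_target` helper, the 0 to 1000 coordinate normalization, and the example pixel coordinates are all hypothetical; the format that GOT-OCR2.0 post-training actually expects is not stated in this thread.

```python
import json
from dataclasses import dataclass


@dataclass
class LineAnnotation:
    """Hypothetical record: one text line and its pixel-space box."""
    text: str
    x1: float
    y1: float
    x2: float
    y2: float


def to_target(lines, width, height, scale=1000):
    """Serialize annotations as a JSON string of {"text", "bbox"} records.

    Boxes are normalized to integers in [0, scale] so the target does not
    depend on image resolution. This convention is an assumption made for
    the sketch, not something GOT-OCR2.0 is known to require.
    """
    records = []
    for ln in lines:
        bbox = [
            round(ln.x1 * scale / width),
            round(ln.y1 * scale / height),
            round(ln.x2 * scale / width),
            round(ln.y2 * scale / height),
        ]
        records.append({"text": ln.text, "bbox": bbox})
    return json.dumps(records, ensure_ascii=False)


# Made-up coordinates for a 400x200 image, purely for demonstration.
example = [
    LineAnnotation("Solve the following equations:", 40, 20, 360, 50),
    LineAnnotation("1) 8x + 11 = 4x + 14", 40, 70, 300, 100),
]
target = to_target(example, width=400, height=200)
print(target)
```

Each image would then be paired with its `target` string as the text the model is trained to emit; how that pair is wired into the training pipeline depends on the project's own data loader.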

YFCodeDream commented 2 months ago

Thank you for the prompt reply! If the developers could consider adding a detection feature, this tool would be even better :)

cherish24 commented 1 month ago

Indeed, a feature for obtaining coordinate positions is missing.

huacong commented 4 weeks ago

For example, for the image above, I would like to obtain:
[
    {"text": "Solve the following equations:", "bbox": "[<x1>, <y1>, <x2>, <y2>]"},
    {"text": "1) 8x + 11 = 4x + 14", "bbox": "[<x1>, <y1>, <x2>, <y2>]"},
    {"text": "2) 7d - 4 = 11d - 9", "bbox": "[<x1>, <y1>, <x2>, <y2>]"}
]
Can this tool produce JSON data like this? So far I have not found such a feature in GOT/demo/run_ocr_2.0.py. Looking forward to your reply!

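Hello, the project I am working on has the same requirement: JSON data like this, with each text and its corresponding box coordinates. Have you found a way to implement it? Many thanks!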