Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
6.18k stars, 534 forks

Is it possible to get the bounding boxes and corresponding text of the OCR results? #24

Open YFCodeDream opened 2 months ago

YFCodeDream commented 2 months ago

[Image: v1_1_0]

For example, for the image above, I would like to obtain:

[
    {"text": "Solve the following equations:", "bbox": "[<x1>, <y1>, <x2>, <y2>]"},
    {"text": "1) 8x + 11 = 4x + 14", "bbox": "[<x1>, <y1>, <x2>, <y2>]"},
    {"text": "2) 7d - 4 = 11d - 9", "bbox": "[<x1>, <y1>, <x2>, <y2>]"}
]

Can this tool produce JSON data like this? So far I have not found this functionality in GOT/demo/run_ocr_2.0.py. Looking forward to your reply!
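To make the request concrete, the desired structure could be written down roughly as below. This is only a minimal sketch of the output format, not something GOT-OCR2.0 provides: `OcrLine` and `to_json` are hypothetical names, and it assumes `bbox` holds four integer pixel coordinates `(x1, y1, x2, y2)` with the origin at the top-left corner, rather than the string placeholders shown above.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass
class OcrLine:
    """One recognized text line with its box (hypothetical schema, not a GOT-OCR2.0 type)."""

    text: str
    # Assumed convention: (x1, y1, x2, y2) in pixels, top-left origin.
    bbox: Tuple[int, int, int, int]

    def __post_init__(self) -> None:
        x1, y1, x2, y2 = self.bbox
        # Reject boxes whose corners are in the wrong order.
        if x1 > x2 or y1 > y2:
            raise ValueError(f"invalid bbox: {self.bbox}")


def to_json(lines: List[OcrLine]) -> str:
    """Serialize the lines to the JSON list-of-objects shape requested above."""
    return json.dumps([asdict(line) for line in lines], ensure_ascii=False, indent=4)
```

Calling `to_json([OcrLine("1) 8x + 11 = 4x + 14", (x1, y1, x2, y2))])` with real coordinates would give one object per text line, with `bbox` serialized as a JSON array of four numbers.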

Ucas-HaoranWei commented 2 months ago

The current version has no detection capability. To get it, the model would need to be post-trained on similar data.

YFCodeDream commented 2 months ago

Thank you for the prompt reply! If the developers could consider adding detection, this tool would be even better :)

cherish24 commented 1 month ago

It does indeed lack a way to get coordinate positions.

huacong commented 4 weeks ago

[Image: v1_1_0]

For example, for the image above, I would like to obtain:

[
    {"text": "Solve the following equations:", "bbox": "[<x1>, <y1>, <x2>, <y2>]"},
    {"text": "1) 8x + 11 = 4x + 14", "bbox": "[<x1>, <y1>, <x2>, <y2>]"},
    {"text": "2) 7d - 4 = 11d - 9", "bbox": "[<x1>, <y1>, <x2>, <y2>]"}
]

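Can this tool produce JSON data like this? So far I have not found this functionality in GOT/demo/run_ocr_2.0.py. Looking forward to your reply!

Hello, the project I am working on has the same requirement: I would like to get JSON data like this, with the text and the corresponding box coordinates. Have you found an implementation? Many thanks!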