Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
4.8k stars, 394 forks

Markdown Output #55

Open rferrazd opened 2 weeks ago

rferrazd commented 2 weeks ago

Hello,

In your paper, the model appears to extract text and output it in Markdown format (with subtitles, headings, bold, etc.). I am using your model from Hugging Face and I am not sure how to get the output in Markdown format. I have tried the following:

res = model.chat(tokenizer, images[0], ocr_type='format with Markdown')

What is the appropriate syntax to obtain the output in Markdown format? And where can I read more about 'ocr_type', 'ocr_box', 'ocr_color', and 'render'? They are not documented in the GitHub repo.

Thank you for the help!

Ucas-HaoranWei commented 1 week ago

Hi, see app.py on Hugging Face for details on how to output formatted results.
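As a rough sketch of what such a call can look like: as far as the Hugging Face model card examples show, ocr_type='ocr' requests plain text and ocr_type='format' requests formatted output, optionally with render=True and save_render_file to write an HTML rendering. The helper names build_chat_kwargs and run_formatted_ocr are hypothetical, and the checkpoint name 'ucaslcl/GOT-OCR2_0' and the set of accepted values are assumptions taken from those examples, so they may differ in your version. Note that the examples pass an image file path rather than an image object.

```python
from typing import Any, Dict, Optional

# Assumption: these are the ocr_type values shown in the model card examples.
KNOWN_OCR_TYPES = {"ocr", "format"}


def build_chat_kwargs(ocr_type: str = "format",
                      render: bool = False,
                      save_render_file: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for model.chat (hypothetical helper)."""
    if ocr_type not in KNOWN_OCR_TYPES:
        raise ValueError(
            f"unknown ocr_type {ocr_type!r}; expected one of {sorted(KNOWN_OCR_TYPES)}"
        )
    kwargs: Dict[str, Any] = {"ocr_type": ocr_type}
    if render:
        if not save_render_file:
            raise ValueError("render=True needs save_render_file, e.g. './demo.html'")
        kwargs["render"] = True
        kwargs["save_render_file"] = save_render_file
    return kwargs


def run_formatted_ocr(image_path: str, **options: Any) -> str:
    """Load the model and run one OCR call. Needs a GPU and a model download."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    name = "ucaslcl/GOT-OCR2_0"  # assumption: checkpoint name from the model card
    tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        name,
        trust_remote_code=True,
        low_cpu_mem_usage=True,
        device_map="cuda",
        use_safetensors=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    model = model.eval().cuda()
    return model.chat(tokenizer, image_path, **build_chat_kwargs(**options))
```

With this sketch, run_formatted_ocr('page.jpg') would request formatted output, while a value such as 'format with Markdown' is rejected before the model is loaded.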

rferrazd commented 1 week ago

Hello, thanks for helping me with this. I believe that there is no app.py on Hugging Face.

No360201 commented 1 week ago

Hello, thanks for helping me with this. I believe that there is no app.py on Hugging Face.

hhhh

Ucas-HaoranWei commented 1 week ago

In the demo space.

rferrazd commented 1 week ago

Hello @Ucas-HaoranWei, I am really sorry, but I cannot find it in the demo space. I see no explanation of how to get the output in Markdown format. I have only been able to get the output in LaTeX format. Could you kindly provide a screenshot showing where the syntax for getting Markdown output is explained? I really appreciate the help, and congratulations on this amazing work!

ericg108 commented 1 week ago

@rferrazd the demo space code is at https://huggingface.co/spaces/stepfun-ai/GOT_official_online_demo/blob/main/app.py, but the output format is still Mathpix Markdown.
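If plain Markdown headings are needed, one possible workaround is to post-process the formatted output. The sketch below assumes the output uses Mathpix Markdown's LaTeX-style \title{...} and \section*{...} commands for headings; whether GOT emits exactly these commands is an assumption, and the function name mmd_headings_to_markdown is hypothetical. Math delimiters and everything else are left untouched.

```python
import re

# Assumed mapping from LaTeX-style heading commands to Markdown heading levels.
_HEADING_LEVELS = {"title": 1, "section": 2, "subsection": 3, "subsubsection": 4}

# Matches e.g. \title{...}, \section{...}, \section*{...} with no nested braces.
_HEADING_RE = re.compile(
    r"\\(title|section|subsection|subsubsection)\*?\{([^{}]*)\}"
)


def mmd_headings_to_markdown(text: str) -> str:
    """Rewrite LaTeX-style heading commands as '#' Markdown headings."""

    def _replace(match: "re.Match[str]") -> str:
        level = _HEADING_LEVELS[match.group(1)]
        return "#" * level + " " + match.group(2).strip()

    return _HEADING_RE.sub(_replace, text)


if __name__ == "__main__":
    sample = "\\title{Report}\n\\section*{Intro}\nSome text with \\(x^2\\)."
    print(mmd_headings_to_markdown(sample))
```

This only covers headings whose text contains no nested braces; tables and other LaTeX-style constructs would need their own handling.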