Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Markdown Output #55

Open rferrazd opened 1 month ago

rferrazd commented 1 month ago

Hello,

In your paper, it seemed that the model could extract text and output it in Markdown format (with subtitles, headings, bold, etc.). I am using your model from Hugging Face and I am not sure how to get the output in Markdown format. I have tried the following:

res = model.chat(tokenizer, images[0], ocr_type='format with Markdown')

What is the appropriate syntax to obtain the output in Markdown format? And where can I read more about 'ocr_type', 'ocr_box', 'ocr_color', and 'render'? They are not documented in the GitHub repo.

Thank you for the help!

Ucas-HaoranWei commented 1 month ago

Hi, see app.py on Hugging Face for details on how to output formatted results.

rferrazd commented 1 month ago

Hello, thanks for helping me with this. I believe that there is no app.py on Hugging Face.

No360201 commented 1 month ago

Hello, thanks for helping me with this. I believe that there is no app.py on huggingface

hhhh

Ucas-HaoranWei commented 1 month ago

In the demo space.

rferrazd commented 1 month ago

Hello @Ucas-HaoranWei, I am really sorry, but I cannot find it in the demo space. I see no explanation of how to get the output in Markdown format. I have only been able to get the output in LaTeX format. Could you kindly provide a screenshot showing where the syntax for getting Markdown output is explained? I really appreciate the help, and congratulations on this amazing work!

ericg108 commented 1 month ago

@rferrazd the demo space code is here: https://huggingface.co/spaces/stepfun-ai/GOT_official_online_demo/blob/main/app.py. Note that the output format is still Mathpix Markdown.
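For readers following along, here is a minimal sketch of how the formatted mode might be requested through `model.chat`. It is not taken from this thread: the `ocr_type` values `'ocr'` and `'format'` and the `render` / `save_render_file` arguments are assumptions based on the Hugging Face model card, and `run_ocr` is a hypothetical wrapper, not part of the library.

```python
from typing import Any, Optional

# Assumed ocr_type values (from the model card, not confirmed in this thread).
PLAIN = "ocr"
FORMATTED = "format"


def run_ocr(model: Any, tokenizer: Any, image_path: str,
            formatted: bool = True,
            render_html_to: Optional[str] = None) -> str:
    """Hypothetical wrapper: call model.chat and return the recognized text."""
    kwargs = {"ocr_type": FORMATTED if formatted else PLAIN}
    if render_html_to is not None:
        # Assumed rendering options; this sketch forces formatted mode
        # whenever an output HTML path is given.
        kwargs["ocr_type"] = FORMATTED
        kwargs["render"] = True
        kwargs["save_render_file"] = render_html_to
    return model.chat(tokenizer, image_path, **kwargs)
```

If these assumptions hold, `run_ocr(model, tokenizer, "page.png")` would return the formatted text, which, as noted above, is Mathpix Markdown rather than plain Markdown.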

plamb-viso commented 1 week ago

In case others find this and still don't get it: the model appears to default to mathpix-markdown. To my untrained eye it looked very similar to LaTeX; in fact, I thought it was LaTeX.

rahulverma7788 commented 5 days ago

Have you got any solution?

plamb-viso commented 5 days ago

I'm not an authority on the subject, but I'm 90% sure that the model outputs, by default, a flavor of Markdown called mathpix-markdown, which is essentially a combination of LaTeX and Markdown.

I do not believe it can output pure Markdown. I've got a separate issue going here about parsing the mathpix-markdown response in Python, which would allow you to convert it to your desired format (e.g. pure Markdown, HTML, etc.). I'm hoping the authors respond with a way to parse the response in Python. If by chance you're working in JavaScript/Node, then it's your lucky day.
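Until a proper Python parser exists, a rough regex-based conversion may be enough for simple documents. The sketch below is an illustration only, not the authors' method or a full mathpix-markdown parser. It assumes the output uses `\title{...}`, `\section{...}`-style headings, `\textbf{...}` / `\textit{...}`, and `\( ... \)` / `\[ ... \]` math delimiters; `mmd_to_markdown` is a hypothetical helper, and tables, nested braces, and other constructs are left untouched.

```python
import re

# Assumed mapping from LaTeX-style heading commands to Markdown heading levels.
_HEADINGS = {
    "title": "#",
    "section": "##",
    "subsection": "###",
    "subsubsection": "####",
}


def mmd_to_markdown(text: str) -> str:
    """Convert a few common mathpix-markdown constructs to plain Markdown."""

    def heading(match: re.Match) -> str:
        return f"{_HEADINGS[match.group(1)]} {match.group(2).strip()}"

    # \title{...}, \section{...}, \section*{...} etc. become Markdown headings.
    text = re.sub(
        r"\\(title|section|subsection|subsubsection)\*?\{([^{}]*)\}",
        heading,
        text,
    )
    # \textbf{...} becomes **...**, \textit{...} becomes *...* (no nested braces).
    text = re.sub(r"\\textbf\{([^{}]*)\}", r"**\1**", text)
    text = re.sub(r"\\textit\{([^{}]*)\}", r"*\1*", text)
    # Display math \[ ... \] becomes $$...$$, inline math \( ... \) becomes $...$.
    text = re.sub(r"\\\[(.*?)\\\]",
                  lambda m: f"$${m.group(1).strip()}$$", text, flags=re.S)
    text = re.sub(r"\\\((.*?)\\\)",
                  lambda m: f"${m.group(1).strip()}$", text, flags=re.S)
    return text
```

The math itself stays as LaTeX inside `$` delimiters, which many Markdown renderers accept; anything the patterns do not match passes through unchanged.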