run-llama / llama_parse

Parse files for optimal RAG
https://www.llamaindex.ai
MIT License
2.44k stars 248 forks source link

Failed to interpret bar charts correctly #272

Open oy321 opened 2 months ago

oy321 commented 2 months ago

image

I am trying to parse this image but the result is very wrong:

image

even though I asked it specifically in the prompt to pay attention to the relationship between text and chart elements.

hewliyang commented 2 months ago

+1 here. Charts / any images for that matter are being hallucinated as Markdown tables. Specifying instructions doesn't help either. Are you guys passing all images to a VLM with conversion to MD table as a instruction?

JSON return type doesn't work now either.

BinaryBrain commented 2 months ago

We currently don't support this very well because understanding blocks of color in a PDF is a difficult problem. You should try the gpt4o mode as it may give you a more accurate result.