run-llama / llama_parse

Parse files for optimal RAG
https://www.llamaindex.ai
MIT License
2.72k stars, 263 forks

Structure not correctly detected with this PDF #166

Open demysc opened 5 months ago

demysc commented 5 months ago

Hi,

The structure is not correctly detected in this PDF: https://www.inail.it/cs/internet/docs/alg-circolare-n-10-16-aprile-2024.pdf

Most sections are not correctly detected in the PDF: the titles appear as inline text next to the paragraph (see screenshots).

In this case the section is detected; however, the list items are not consistently detected, and sometimes they are interpreted as a new section (see screenshot).