run-llama / llama_parse

Parse files for optimal RAG
https://www.llamaindex.ai
MIT License
2.49k stars, 251 forks

Link images in markdown #407

Open spreeni opened 2 days ago

spreeni commented 2 days ago

Disclaimer: I have only experimented with the Web UI at LlamaCloud and have not tried the library.

Currently, I can see that images are extracted from my PDF, but they exist separately from the Markdown. It would be nice to have them embedded in the Markdown as file-path links, to preserve image placement within a page.

BinaryBrain commented 2 days ago

Hi @spreeni, the library gives the same results as the Web UI. I agree this would be good to have, but it's a difficult problem because you can have background images, images that are two columns wide, misalignments, etc.

spreeni commented 2 days ago

Thanks for the reply! Yes, I already imagined that it is not trivial. Just thought I'd open an issue so it can be in the backlog. This would especially help with labeled images within documents.
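
To make the request concrete, here is a minimal sketch of the kind of linking I have in mind, done as a post-processing step. It is not llama_parse code: `TextBlock`, `ImageRef`, `link_images`, and the idea that each extracted item comes with a page number and a vertical position are all assumptions for illustration. It only handles the easy single-column case and ignores the hard cases mentioned above (background images, multi-column layouts, misalignments).

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Hypothetical: one chunk of extracted Markdown with its position."""
    page: int
    top: float  # distance from the top of the page
    markdown: str


@dataclass
class ImageRef:
    """Hypothetical: one extracted image saved to disk, with its position."""
    page: int
    top: float
    path: str
    alt: str = ""


def link_images(blocks: list[TextBlock], images: list[ImageRef]) -> str:
    """Interleave image links with text blocks in reading order.

    Items are ordered by page, then by vertical position. When a text
    block and an image share the same position, the text comes first.
    """
    items = [(b.page, b.top, 0, b.markdown) for b in blocks]
    items += [(i.page, i.top, 1, f"![{i.alt}]({i.path})") for i in images]
    items.sort(key=lambda item: item[:3])
    return "\n\n".join(item[3] for item in items)


if __name__ == "__main__":
    blocks = [
        TextBlock(page=1, top=10, markdown="# Title"),
        TextBlock(page=1, top=200, markdown="Figure 1: An example."),
    ]
    images = [ImageRef(page=1, top=100, path="images/p1_0.png", alt="Figure 1")]
    print(link_images(blocks, images))
```

With the sample data, the image link lands between the title and the caption, which is the placement I would like to keep.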