Aider-AI / aider

aider is AI pair programming in your terminal
https://aider.chat/
Apache License 2.0
20.75k stars 1.91k forks

Can you add pdf file support #444

Open SpeederSpeederSpeder opened 9 months ago

SpeederSpeederSpeder commented 9 months ago

Can you add PDF file support?

gwpl commented 9 months ago

Glad to see growing interest in Aider. I love this tool as well.

PDFs are very difficult. I recommend first trying to use some tools to convert them into text and see how it goes.

You will find that in some PDFs the text is embedded in a way that simple tools extract well, while in others, e.g. multicolumn texts, the extracted lines come out mixed together.

In those cases OCR becomes the best, though computationally intensive, approach. Here are some tools you may like to try:

Then you can include the output of those tools in aider. GPT seems to be great at understanding LaTeX, so I would go with nougat to get richer information; still, results will differ from document to document.
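One way to "see how it goes" after a simple text conversion is a rough sanity check on the extracted text. The sketch below is only an illustration: the function name and both thresholds are assumptions, not anything aider or the tools mentioned here provide, and it will not detect interleaved multicolumn lines, only empty or mostly non-text output.

```python
def looks_like_usable_text(text: str, min_chars: int = 200,
                           min_word_ratio: float = 0.6) -> bool:
    """Rough check on text extracted from a PDF by a simple converter.

    Both thresholds are arbitrary assumptions; tune them per document set.
    """
    stripped = text.strip()
    if len(stripped) < min_chars:
        # Almost nothing came out: possibly a scanned PDF that needs OCR.
        return False
    tokens = stripped.split()
    # A token counts as word-like when at least half its characters are letters.
    wordlike = sum(
        1 for t in tokens if sum(c.isalpha() for c in t) >= 0.5 * len(t)
    )
    return wordlike / len(tokens) >= min_word_ratio
```

If the check fails, that is a hint to try an OCR-based tool instead; if it passes, the text may still need a manual look for mixed-up columns.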

jordillonx commented 9 months ago

I'd love this feature too! It would ease adding documentation to the context, which is a very frequent task while using aider.

On similar lines, the ability to add a url as context would be great too: /add https://example.com/documentation

SgtPooki commented 7 months ago

> On similar lines, the ability to add a url as context would be great too: /add https://example.com/documentation

I'm still browsing through open issues, but I wonder whether /run DOCS=$(curl https://example.com/documentation) && echo "The documentation for \"blah\" is: \"${DOCS}\"" or a similar simple script would do the trick?

paul-gauthier commented 7 months ago

You can do /web <url> to add any url to the chat.

gwpl commented 7 months ago

What I normally do: I make a separate branch with a docs folder and put there a collection of relevant docs converted to markdown, so they work more efficiently, and later I cherry-pick or rebase the CLs that aider creates on top of it. That way I can reliably reuse the created corpus of documentation/examples, and add/remove them to/from the context as needed using /add and /drop. What I find more problematic is that I cannot set those files as read-only (which was the subject of my other feature request... ;) ).
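A minimal sketch of the bookkeeping side of this workflow, assuming the converted markdown lives in a folder named docs: it lists the files and prints one /add line per file to paste into the chat. The helper name and the *.md pattern are hypothetical, and paths containing spaces are not handled.

```python
from pathlib import Path


def aider_add_commands(docs_dir: str = "docs", pattern: str = "*.md") -> list[str]:
    """Return one '/add <path>' line per matching file under docs_dir."""
    root = Path(docs_dir)
    if not root.is_dir():
        return []  # no docs folder on this branch
    files = sorted(p for p in root.rglob(pattern) if p.is_file())
    return [f"/add {p.as_posix()}" for p in files]


if __name__ == "__main__":
    for command in aider_add_commands():
        print(command)
```

Swapping the "/add" prefix for "/drop" would give the matching lines for removing the same files from the context.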

rawwerks commented 7 months ago

if pdf support does become a priority (and to be honest, i'm not sure it should be for this project), i would recommend llamaparse, which is better than anything i've seen (paid or unpaid, API or library).

gwpl commented 5 months ago

Or nougat-ocr (https://facebookresearch.github.io/nougat/ => https://github.com/facebookresearch/nougat). If one wants to run it in the cloud, Replicate has a simple API: https://replicate.com/search?query=nougat ; of course Hugging Face and others work as well.

tdobson commented 1 month ago

I thought about doing /read documentation.pdf today.

Instead I used pdftotext to make it text, but yeah - I can see the utility of something like this one day.
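For anyone wanting to script that step, here is a small sketch that wraps the pdftotext command-line tool. It assumes pdftotext (from Poppler or Xpdf) is installed and on PATH; the function names and the default output name are my own choices, not part of aider.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_cmd(pdf_path, txt_path=None, keep_layout=False) -> list[str]:
    """Build the pdftotext argument list; output defaults to <name>.txt."""
    pdf = Path(pdf_path)
    out = Path(txt_path) if txt_path else pdf.with_suffix(".txt")
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to preserve the physical page layout,
        # which can help or hurt depending on the document.
        cmd.append("-layout")
    cmd += [str(pdf), str(out)]
    return cmd


def pdf_to_text(pdf_path, txt_path=None, keep_layout=False) -> Path:
    """Run pdftotext and return the path of the text file it wrote."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    cmd = build_pdftotext_cmd(pdf_path, txt_path, keep_layout)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(cmd[-1])
```

Calling pdf_to_text("documentation.pdf") would write documentation.txt next to the PDF, which can then be brought into the chat like any other text file.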