simonw / llm-claude-3

LLM plugin for interacting with the Claude 3 family of models
Apache License 2.0
249 stars 23 forks source link

PDF support #22

Open simonw opened 1 week ago

simonw commented 1 week ago

https://docs.anthropic.com/en/docs/build-with-claude/pdf-support

The new Claude 3.5 Sonnet (claude-3-5-sonnet-20241022) model now supports PDF input and understands both text and visual content within documents.

Needs anthropic-beta: pdfs-2024-09-25 request header.

simonw commented 1 week ago

Got it working:

$ llm -m claude-3-5-sonnet-latest 'extract text' -a invoice.pdf
This is an invoice from OpenAI, LLC for various AI model usage between April 30 - May 31, 2023. The total amount due is $1.97 USD, due on June 16, 2023. The charges break down as follows:

1. Instruct models (davinci): 27,127 units at $0.00002 each = $0.54
2. Chat models (gpt-3.5-turbo): 154,109 units at $0.000002 each = $0.31
3. GPT-4 8K-context prompt: 27,808 units at $0.00003 each = $0.83
4. GPT-4 8K-context completion: 4,775 units at $0.00006 each = $0.29

The invoice is issued to an address in Half Moon Bay, California, and includes OpenAI's business address in San Francisco. The invoice number is A070F9EC-0017, and there's an option to pay online.
$ llm -m claude-3-haiku 'extract text' -a invoice.pdf          
Error: This model does not support attachments of type 'application/pdf', only image/png, image/webp, image/gif, image/jpeg
pricci1 commented 1 week ago
Redacted since an issue already existed (and was fixed)

Hi Simon! For some reason, it ignores the prompt, and always gives a variation of the same response for the same PDF file (extracts the details). ```bash $ llm -m claude-3.5-sonnet-latest 'Hello! How are you?' -a file.pdf This appears to be a phone and internet service bill. Here are the key details: Amount Due: ... ```