valentinfrlch / ha-llmvision

Let Home Assistant see!
Apache License 2.0

Anthropic Claude Support #22

Closed WouterJN closed 2 months ago

WouterJN commented 2 months ago

I've been experimenting with Anthropic Claude 3.5, and I am impressed with its performance and results. I noticed that it is possible to upload images through the API. Do you have any plans to support this LLM?

See: https://docs.anthropic.com/en/api/messages-examples
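The linked docs describe sending images as base64 content blocks inside a user message. As a hedged sketch of that request shape (the model name and `max_tokens` value here are illustrative, not something this integration is confirmed to use):

```python
import base64

# Illustrative endpoint; requests need x-api-key and anthropic-version headers.
ANTHROPIC_URL = "https://api.anthropic.com/v1/messages"

def build_vision_request(image_bytes: bytes, prompt: str) -> dict:
    """Build a Messages API payload with one base64 image and a text prompt."""
    return {
        "model": "claude-3-5-sonnet-20240620",  # assumed model id, check the docs
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": base64.b64encode(image_bytes).decode("utf-8"),
                    },
                },
                {"type": "text", "text": prompt},
            ],
        }],
    }
```

The resulting dict would be POSTed as JSON to `ANTHROPIC_URL`.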

valentinfrlch commented 2 months ago

Sure! I'll look into it.

valentinfrlch commented 2 months ago

I have added Anthropic Claude as a new provider. It is now available in the latest beta version. Since I have also made quite a few changes to other things, this version needs to be tested before releasing it on stable.

Could you download the beta and check if your automations still work? The beta shouldn't break anything and I have tested it; I just want to be extra sure.

To download the beta, go to HACS > Integrations > GPT-4 Vision and select "Redownload" in the "..." menu. Then check "Show beta versions", select v0.4.2-dev and restart.

valentinfrlch commented 2 months ago

Btw, I just plotted the performance of different models' vision capabilities (see the attached chart), and it seems Google's Gemini needs to be added next. Edit: Added Claude 3.5 Sonnet, which could be a good (slightly cheaper) alternative to GPT-4o.

WouterJN commented 2 months ago

Thanks @valentinfrlch, I will try tomorrow or Monday; bit busy this weekend.

The Google Gemini image analysis feature is built into Home Assistant by default. Adding it to your plugin only offers the advantage of being able to adjust the resolution of the uploaded images.

See: https://www.home-assistant.io/integrations/google_generative_ai_conversation/

I'm not so impressed with the results myself; for me ChatGPT-4o works better. I run both in parallel to analyze the pictures of my doorbell.

valentinfrlch commented 2 months ago

Thanks for the info, I thought generative AI would - as the name suggests - generate images. I suppose there is indeed no need to have this as a provider, though I still think having one integration with all providers would be appealing. No stress on the testing.

WouterJN commented 2 months ago

Hi @valentinfrlch;

Updating to version 0.4.3 DEV leads to two problems:

  1. When using OpenAI, it appears that no pictures are being uploaded. The response from OpenAI indicates that there are no pictures to analyze. I tried multiple queries that previously worked without any issues.
  2. Adding Anthropic results in the following error.
Screenshot 2024-06-23 at 15 00 50


valentinfrlch commented 2 months ago

Hmm, that's weird. I just redownloaded 0.4.3-dev to check this, and the Anthropic setup works for me, as do the requests. Sometimes you have to refresh the front-end cache (in Chrome you can open developer tools, then right-click the refresh button and select hard reload).

WouterJN commented 2 months ago

I tried using both Safari and Chrome (which I had never used for HA before).

The error message I received for OpenAI is:

```
2024-06-23 15:37:16.958 WARNING (MainThread) [homeassistant.util.loop] Detected blocking call to open inside the event loop by custom integration 'gpt4vision' at custom_components/gpt4vision/__init__.py, line 121: with Image.open(image_path) as img: (offender: /usr/local/lib/python3.12/site-packages/PIL/Image.py, line 3277: fp = builtins.open(filename, "rb")), please create a bug report at https://github.com/valentinfrlch/ha-gpt4vision/issues
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/src/homeassistant/homeassistant/__main__.py", line 223, in <module>
    sys.exit(main())
  File "/usr/src/homeassistant/homeassistant/__main__.py", line 209, in main
    exit_code = runner.run(runtime_conf)
  File "/usr/src/homeassistant/homeassistant/runner.py", line 190, in run
    return loop.run_until_complete(setup_and_run_hass(runtime_config))
  File "/usr/local/lib/python3.12/asyncio/base_events.py", line 672, in run_until_complete
    self.run_forever()
  File "/usr/local/lib/python3.12/asyncio/base_events.py", line 639, in run_forever
    self._run_once()
  File "/usr/local/lib/python3.12/asyncio/base_events.py", line 1988, in _run_once
    handle._run()
  File "/usr/local/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 464, in async_run
    await self._async_step(log_exceptions=False)
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 526, in _async_step
    await getattr(self, handler)()
  File "/usr/src/homeassistant/homeassistant/helpers/script.py", line 764, in _async_call_service_step
    self._hass.async_create_task_internal(
  File "/usr/src/homeassistant/homeassistant/core.py", line 828, in async_create_task_internal
    task = create_eager_task(target, name=name, loop=self.loop)
  File "/usr/src/homeassistant/homeassistant/util/async_.py", line 37, in create_eager_task
    return Task(coro, loop=loop, name=name, eager_start=True)
  File "/usr/src/homeassistant/homeassistant/core.py", line 2741, in async_call
    response_data = await coro
  File "/usr/src/homeassistant/homeassistant/core.py", line 2784, in _execute_service
    return await target(service_call)
  File "/config/custom_components/gpt4vision/__init__.py", line 185, in image_analyzer
    base64_image = encode_image(image_path=image_path)
  File "/config/custom_components/gpt4vision/__init__.py", line 121, in encode_image
    with Image.open(image_path) as img:
```

The error message I received for Anthropic is:

```
2024-06-23 15:35:35.168 DEBUG (MainThread) [custom_components.gpt4vision.config_flow] Selected provider: Anthropic
2024-06-23 15:35:35.168 DEBUG (MainThread) [custom_components.gpt4vision.config_flow] Configured providers: ['OpenAI']
2024-06-23 15:35:37.195 DEBUG (MainThread) [custom_components.gpt4vision.config_flow] Connecting to https://api.anthropic.com/v1/messages
2024-06-23 15:35:37.372 ERROR (MainThread) [custom_components.gpt4vision.config_flow] Handshake failed with status: 400
2024-06-23 15:35:37.373 ERROR (MainThread) [custom_components.gpt4vision.config_flow] Could not connect to Anthropic server.
2024-06-23 15:35:37.373 ERROR (MainThread) [custom_components.gpt4vision.config_flow] Validation failed: handshake_failed
```

valentinfrlch commented 2 months ago

Alright, I found the error: the Image.open() call was blocking the event loop. It now runs asynchronously, so this issue is fixed.
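For context, the general pattern behind this fix is moving blocking file I/O into a worker thread so the event loop stays responsive. A minimal sketch, assuming a helper like the `encode_image` named in the traceback (this stand-in reads and base64-encodes the file; the integration's actual code may differ):

```python
import asyncio
import base64

def encode_image(image_path: str) -> str:
    """Blocking: read an image file and return it base64-encoded."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

async def encode_image_async(image_path: str) -> str:
    """Run the blocking read in a worker thread instead of the event loop."""
    return await asyncio.to_thread(encode_image, image_path)
```

Inside Home Assistant, `hass.async_add_executor_job(encode_image, image_path)` serves the same purpose as `asyncio.to_thread` here.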

However, I cannot reproduce the Anthropic setup issue you're experiencing. Are you sure the API key is valid? Still, it's weird, because status code 400 means "bad request", which would suggest there is something wrong with the way the request is sent.

I'm going to create an updated release, so if you've got some extra time you can check whether it works again. No pressure though.

WouterJN commented 2 months ago

ChatGPT is working again. Thanks a lot :-)

I believe I have also found the issue with the Anthropic API. Last night, I created a small test script to call the API, but it returned the following error. I will add some funds later today, test your plugin again, and let you know the results.

```json
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Your credit balance is too low to access the Claude API. Please go to Plans & Billing to upgrade or purchase credits."
  }
}
```
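This explains the misleading 400 during setup: the API reports a billing problem as an `invalid_request_error`. A hedged sketch of how a config flow could surface the API's own message instead of a generic "handshake failed" (the JSON shape matches the error body above; anything beyond it is an assumption):

```python
import json

def describe_api_error(body: str) -> str:
    """Extract a human-readable reason from an Anthropic-style error body."""
    try:
        payload = json.loads(body)
        err = payload.get("error", {})
        return f"{err.get('type', 'unknown_error')}: {err.get('message', 'no message provided')}"
    except json.JSONDecodeError:
        return "unparseable error response"
```

Logging `describe_api_error(response_text)` on a non-2xx status would have shown the low-credit message directly in the Home Assistant log.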

WouterJN commented 2 months ago

I've added some funds to Anthropic, and it's now working perfectly! Thank you for your effort!

The cost for analyzing three images, each 1280 pixels wide, and returning JSON looks very reasonable. Over the next few days or weeks, I will run all three models (OpenAI, Gemini, and Anthropic) in parallel to determine which one I prefer.

Screenshot 2024-06-24 at 09 34 39

valentinfrlch commented 2 months ago

Glad to hear it's working again. And thank you too, I would not have found that bug without you!