Open slyticoon opened 6 months ago
Hey I thought the exact same and forked the repository: https://github.com/valentinfrlch/ha-gpt4vision
There is no TTS; instead, the service call returns the string directly, so you don't need to check the response.txt file (perfect for automations). It also includes a way to downscale the image to save on cost, and it uses the new GPT-4o model.
Edit: You can install it via HACS too
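For anyone curious what the downscaling amounts to, here is a rough sketch of the size calculation only. It is not the fork's actual code: the function name `downscaled_size` and the `target_width` parameter are made up for illustration, and it assumes the image is shrunk to a maximum width while keeping the aspect ratio.

```python
def downscaled_size(width, height, target_width):
    """Return (w, h) no wider than target_width, keeping the aspect ratio.

    Hypothetical helper for illustration; not taken from ha-gpt4vision.
    """
    if width <= 0 or height <= 0 or target_width <= 0:
        raise ValueError("dimensions must be positive")
    # Never upscale: images already small enough are left alone.
    if width <= target_width:
        return width, height
    # Scale the height by the same factor as the width (at least 1 px).
    new_height = max(1, round(height * target_width / width))
    return target_width, new_height


if __name__ == "__main__":
    print(downscaled_size(1920, 1080, 1280))
```

A smaller image generally means fewer image tokens sent to the model, which is where the cost saving would come from.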
Very nice, thank you. But one question: how do you save the answer and send it to a media player or a notification service? I can't find an example of how to use it after the image description has been created.
Hello,
Thank you for writing this component.
It would be a nice addition to make TTS optional. I am currently using a file-type sensor to read the response and process it that way. That allows me to route it where I need to, perhaps eventually into an assistant.
Thanks!