Closed dev-msp closed 1 month ago
I've been testing vision with the action nodes. Got it working like this:
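(A minimal sketch of what such an action node request might look like, assuming the OpenAI chat format for image content; the function name and wiring are illustrative, not Cannoli's actual code.)

```typescript
// Hypothetical sketch: building an OpenAI-style chat message that carries
// both text and an image, as an action node might. Names are illustrative.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };

function buildVisionMessage(prompt: string, imageUrl: string) {
  const content: (TextPart | ImagePart)[] = [
    { type: "text", text: prompt },
    // The url field accepts either an online URI or a base64 data URI.
    { type: "image_url", image_url: { url: imageUrl } },
  ];
  return { role: "user" as const, content };
}

const msg = buildVisionMessage(
  "Describe this image",
  "https://example.com/image.jpg"
);
console.log(JSON.stringify(msg));
```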
Thanks so much for trying cannoli!
Yes this would be awesome, and with all the new API features out it's high on my list.
Also, awesome work solarbotanist, are you saying that this is working as intended, and the images are being sent and received correctly? That's amazing! I always wanted to write an openai action node for the meme, but hadn't yet.
Yes, everything is working as intended! The action node as described can process both online image URIs and base64-encoded images.
A base64 encoding function could be implemented in the TS so that images in the vault could be used directly.
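(A minimal sketch of that base64 step, assuming Node's `fs` for the file read; inside Obsidian you would read the binary through the Vault API instead. The function name is illustrative.)

```typescript
// Hypothetical sketch: read an image file and produce a base64 data URI
// in the shape the vision API accepts. In an Obsidian plugin the bytes
// would come from the Vault API rather than Node's fs.
import { readFileSync } from "fs";

function imageToDataUri(path: string, mime = "image/jpeg"): string {
  // Buffer.toString("base64") handles the encoding directly.
  const base64 = readFileSync(path).toString("base64");
  return `data:${mime};base64,${base64}`;
}
```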
any updates on this?
Not yet, but I just looked at the vision docs for OpenAI at least, and it's not as bad as I thought it would be. So maybe not that far off after all.
Vision is now implemented! All you have to do is embed an image from your vault or a URL using markdown syntax: `![[your_image.jpg]]` or `![description](imageurl.com)`
We've also added a built-in DALL-E action.
Just discovered this, and I both detest and greatly respect that you actually built (and surpassed) something I merely daydreamed about idly one day. Congrats! :D
As of recently, images serve as both LLM input (Vision API beta) and output (DALL-E). Now I can't help but wonder about the potential for image nodes in Cannoli!
I haven't familiarized myself with the code yet, but in theory I'm very much down to contribute as well.