hdresearch / nolita

Work with web-enabled agents quickly — whether running a quick task or bootstrapping a full-stack product.
https://nolita.ai
MIT License

Add support for vision models #5

Open matildepark opened 8 months ago

matildepark commented 8 months ago

After our attempt at forking llm-api, we still need to write the cases that integrate vision models into the core browse loop.

In llm-api, the entire library is built around user messages being a string. Image data introduces a sub-type where we can return either a string or an array of messages, and this turned into a type mess to untangle.
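
To make the type problem concrete, here is a minimal sketch of one way the string-or-array shape could be contained. Every name in it (`UserContent`, `ContentPart`, `toParts`, `toText`) is hypothetical and chosen for illustration; none of it is taken from llm-api or nolita. The idea is to normalize at the boundary so that only one function ever has to branch on the union.

```typescript
// Hypothetical types for illustration only; not llm-api's or nolita's definitions.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image"; mimeType: string; base64: string };
type ContentPart = TextPart | ImagePart;

// A user message body is either a bare string (text-only models)
// or an array of parts (vision models).
type UserContent = string | ContentPart[];

// Normalize to the array form so downstream code handles a single shape.
function toParts(content: UserContent): ContentPart[] {
  return typeof content === "string"
    ? [{ type: "text", text: content }]
    : content;
}

// Collapse back to plain text for code paths that still expect a string.
// Image parts are dropped here, so this is lossy by design.
function toText(content: UserContent): string {
  return toParts(content)
    .filter((part): part is TextPart => part.type === "text")
    .map((part) => part.text)
    .join("\n");
}

// Usage: a text-only message and a message carrying a (placeholder) image.
const plain: UserContent = "Find the pricing page.";
const withImage: UserContent = [
  { type: "text", text: "What is on this page?" },
  { type: "image", mimeType: "image/png", base64: "iVBORw0KGgo=" },
];
```

With this shape, the browse loop could call `toParts` once when a message enters and never check `typeof content` again, though whether that fits the existing code is an open question.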