Closed: fearnworks closed this 1 month ago
It's awesome! We'll need to link to it from mkdocs.yml
and from the cookbooks' index page :)
Updated!
Thank you so much for your contribution!
Request received in Discord to add an example for the new transformers vision capability.
Vision-Language Models with Outlines
This guide demonstrates how to use Outlines with vision-language models, leveraging the new transformers_vision module. Vision-language models can process both text and images, allowing for tasks like image captioning, visual question answering, and more.
We will use Mistral's Pixtral-12B model to take advantage of its visual reasoning capabilities, building a workflow that generates a multi-stage, atomic caption, as sketched below.
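To make the request concrete, here is a minimal sketch of the kind of workflow the cookbook covers, assuming the `outlines.models.transformers_vision` loader and `outlines.generate.json` API. The `AtomicCaption` schema, the prompt, the image path, and the `<image>` placeholder token are illustrative assumptions, not taken from the PR itself.

```python
# Minimal sketch: structured captioning with a vision-language model in Outlines.
# Assumes an Outlines version that ships the transformers_vision loader.
from PIL import Image
from pydantic import BaseModel, Field
from transformers import LlavaForConditionalGeneration

import outlines

# Load Pixtral-12B through the transformers_vision integration.
# Use device="cpu" if no GPU is available.
model = outlines.models.transformers_vision(
    "mistral-community/pixtral-12b",
    model_class=LlavaForConditionalGeneration,
    device="cuda",
)

# An illustrative schema for one "atomic" caption stage: a short caption
# plus the salient objects the model reports seeing.
class AtomicCaption(BaseModel):
    caption: str = Field(description="One-sentence description of the image")
    objects: list[str] = Field(description="Salient objects visible in the image")

# Constrain generation to valid AtomicCaption JSON.
generator = outlines.generate.json(model, AtomicCaption)

# Any local image works; the path here is a placeholder.
image = Image.open("example.jpg").convert("RGB")

# The image placeholder token depends on the model's processor / chat template;
# "<image>" is an assumption and may need to be adjusted for Pixtral.
prompt = "<image>\nDescribe this image as an atomic caption."
result = generator(prompt, [image])
print(result)
```

Because the output is constrained by the schema, `result` comes back as a validated `AtomicCaption` instance; later caption stages can reuse the same pattern with richer schemas.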