SkunkworksAI / BakLLaVA

Apache License 2.0
680 stars 46 forks

What's this for? #1

Closed (legut2 closed 9 months ago)

legut2 commented 9 months ago

I'm curious about this project and the motivation behind it. I'm a computer vision geek and wanted to know what you two are toying around with here. What are you trying to have it do?

I'm also curious about multimodal models in general when it comes to images and text.

pharaouk commented 9 months ago

This is a fork of LLaVA, made to work with Mistral. The objective is to push the vision LLM even further than what the LLaVA team has accomplished. We'll be taking it in a different direction, with many ideas in store to extend its capabilities. You're more than welcome to contribute if you're interested.