-
### Describe the issue as clearly as possible:
I tried the implementation from the docs:
[https://dottxt-ai.github.io/outlines/latest/reference/models/transformers_vision/#classifying-an-image]
I…
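The issue text is cut off above, but for reference the docs' classifying-an-image pattern looks roughly like the sketch below, written against the outlines 0.x API; the model checkpoint, prompt template, and label set are illustrative assumptions, not the exact docs code:

```python
# Sketch of image classification with outlines' transformers_vision
# (outlines 0.x API; checkpoint, prompt, and labels are assumptions).
from PIL import Image
from transformers import LlavaForConditionalGeneration
import outlines

model = outlines.models.transformers_vision(
    "llava-hf/llava-1.5-7b-hf",
    model_class=LlavaForConditionalGeneration,
)

# Constrain generation to a fixed set of labels.
classifier = outlines.generate.choice(model, ["cat", "dog", "other"])

image = Image.open("photo.jpg")
prompt = "USER: <image>\nWhat animal is in the picture? ASSISTANT:"
print(classifier(prompt, [image]))
```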
-
Currently, we use text embeddings. This works well for textual documents, but it presents obvious drawbacks for documents containing non-textual content (images, graphs, schemes, …).
An alternative is…
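The alternative is cut off above; one common choice for such documents (an assumption on my part, not necessarily the author's proposal) is a multimodal embedding model such as CLIP, which embeds images and text into a shared space:

```python
# Sketch: shared text/image embeddings with CLIP (the use of CLIP here
# is an illustrative assumption; file name and query are placeholders).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("figure_from_document.png")
inputs = processor(
    text=["a chart of quarterly results"],
    images=image,
    return_tensors="pt",
    padding=True,
)

with torch.no_grad():
    text_emb = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )
    image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])

# Cosine similarity between the text query and the document image.
similarity = torch.nn.functional.cosine_similarity(text_emb, image_emb)
```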
-
Could you add our CVPR 2024 paper on vision-language pre-training, "Iterated Learning Improves Compositionality in Large Vision-Language Models", to this repo?
Paper link: https://arxiv.org/abs/…
-
# Interesting papers
## Meta's 'An Introduction to Vision-Language Modeling'
- https://ai.meta.com/research/publications/an-introduction-to-vision-language-modeling/
![image](https://github.c…
-
- While the LaTeX environment is not fully set up, we'll write our thoughts here for now
----
- `ViperGPT` is a framework that leverages pre-trained vision-language models (`GLIP` for image object ground…
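To make that concrete: in ViperGPT, a code LLM writes short Python programs against an `ImagePatch` API whose methods dispatch to pre-trained models (`find` is backed by `GLIP` for grounding). The stub below paraphrases the paper's interface rather than any runnable package:

```python
# Minimal stub of ViperGPT's ImagePatch interface; method names follow
# the paper, and the real implementation dispatches to GLIP for grounding.
from dataclasses import dataclass
from typing import Any, List

@dataclass
class ImagePatch:
    image: Any  # the image region this patch covers

    def find(self, object_name: str) -> List["ImagePatch"]:
        raise NotImplementedError("backed by GLIP grounding in ViperGPT")

# The kind of program the code LLM might generate for
# "How many muffins are in the image?":
def execute_command(image: Any) -> int:
    image_patch = ImagePatch(image)
    return len(image_patch.find("muffin"))
```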
-
Dear Authors,
We'd like to add "GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning", which has been accepted at NeurIPS 2024, to this repository. [**Paper**](https:/…
-
### 📚 The doc issue
Is there any tutorial for integrating the vision model with the language model?
### Suggest a potential alternative/fix
_No response_
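For orientation while such a tutorial is missing: a common LLaVA-style integration trains a small projector that maps vision-encoder patch features into the language model's embedding space, then feeds the projected "visual tokens" to the LLM alongside the text tokens. The dimensions and shapes below are illustrative assumptions:

```python
# Sketch of a LLaVA-style vision-to-language projector
# (dimensions and token counts are illustrative assumptions).
import torch
import torch.nn as nn

VISION_DIM, LLM_DIM = 1024, 4096  # e.g. CLIP ViT-L/14 -> a 7B LLM

class VisionProjector(nn.Module):
    """Two-layer MLP mapping vision features into the LLM embedding space."""
    def __init__(self, vision_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        return self.proj(patch_features)

projector = VisionProjector(VISION_DIM, LLM_DIM)
visual_tokens = projector(torch.randn(1, 576, VISION_DIM))  # 576 image patches
text_tokens = torch.randn(1, 32, LLM_DIM)                   # LLM token embeddings
llm_inputs = torch.cat([visual_tokens, text_tokens], dim=1) # one joint sequence
```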
-
Hey,
I am Zhiqiu Lin, a final-year PhD student at Carnegie Mellon University working with Prof. Deva Ramanan. Your work is very interesting, with great performance gains!
I wanted to share [Natu…
-
I saw you used something like this:
```python
from unsloth import FastVisionModel  # import needed for the call below

model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers = True,      # False if not finetuning the vision part
    finetune_language_layers = True,    # False if not finetuning the language part
)
```
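`model` isn't defined in the excerpt; with unsloth it would typically come from `FastVisionModel.from_pretrained` before the `get_peft_model` call. The checkpoint name below is an illustrative assumption:

```python
from unsloth import FastVisionModel

# Load the base vision-language model first (checkpoint name is an
# illustrative assumption; 4-bit loading is optional).
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Llama-3.2-11B-Vision-Instruct",
    load_in_4bit = True,
)
```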
-
### Your current environment
from vllm import LLM  # import needed for the call below

# Initialize the LLaVA-1.5 model
llm = LLM(model="llava-hf/llava-1.5-7b-hf")
print(llm)
# embed_last_hook = Hook(model.language_model.model.norm)  # for saving the embeddings
…
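`Hook` is not defined in the snippet; assuming it is a standard PyTorch forward hook that captures a module's output, a minimal version might look like:

```python
import torch

class Hook:
    """Minimal forward-hook helper (an assumption about the undefined
    `Hook` above): stores the wrapped module's most recent output."""
    def __init__(self, module: torch.nn.Module):
        self.output = None
        self.handle = module.register_forward_hook(self._capture)

    def _capture(self, module, inputs, output):
        self.output = output.detach() if torch.is_tensor(output) else output

    def remove(self):
        self.handle.remove()
```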