-
TRL's SFTTrainer supports LLaVA (Large Language and Vision Assistant), as described in [Vision Language Models Explained](https://huggingface.co/blog/vlms).
Is there any plan to rele…
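For reference, vision SFT with TRL along the lines of that blog post can be sketched roughly as below. This is a minimal sketch, not the exact recipe from the post: the llava-hf checkpoint, the HuggingFaceH4/llava-instruct-mix-vsft dataset, and the specific SFTConfig flags are assumptions, and the exact arguments vary between TRL releases.

```python
import torch
from datasets import load_dataset
from transformers import AutoProcessor, LlavaForConditionalGeneration
from trl import SFTConfig, SFTTrainer

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed LLaVA checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.float16)

# Assumed dataset with "messages" (chat turns) and "images" columns.
train_dataset = load_dataset("HuggingFaceH4/llava-instruct-mix-vsft", split="train")

def collate_fn(examples):
    # Render each conversation with the chat template and batch it with its image.
    texts = [processor.tokenizer.apply_chat_template(ex["messages"], tokenize=False) for ex in examples]
    images = [ex["images"][0] for ex in examples]
    batch = processor(text=texts, images=images, return_tensors="pt", padding=True)
    labels = batch["input_ids"].clone()
    labels[labels == processor.tokenizer.pad_token_id] = -100  # mask padding in the loss
    batch["labels"] = labels
    return batch

trainer = SFTTrainer(
    model=model,
    args=SFTConfig(
        output_dir="llava-sft",
        remove_unused_columns=False,                    # keep raw columns for the collator
        dataset_kwargs={"skip_prepare_dataset": True},  # the collator handles tokenization
    ),
    train_dataset=train_dataset,
    data_collator=collate_fn,
)
trainer.train()
```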
-
**Details of model being requested**
- Model name: Florence-2
- Source repo link: https://huggingface.co/collections/microsoft/florence-6669f44df0d87d9c3bfb76de
- Research paper link: https://arxiv…
-
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
https://arxiv.org/pdf/2404.16006
CONTEXTUAL: Evaluating Context-Sensitive Text-Ric…
-
https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5
We introduce InternVL 1.5, an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary…
-
Hi, @YaoMarkMu
I think this is a fantastic piece of work.
My question is, when I attempted to use the provided weights [https://drive.google.com/file/d/1sBTy8oXeweJg3STbhzBR_5pLcVs1F20q/view?usp=sha…
-
Thanks for your awesome work in model merging! I'm excited about the improvements you achieved compared to other merging methods. However, I saw the individually fine-tuned models still outperform WEM…
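As background on that comparison, the simplest weight-space merging baseline is element-wise parameter averaging of fine-tuned checkpoints that share an architecture; merged models typically trade some per-task accuracy for multi-task coverage. A minimal sketch of that baseline (the checkpoint file names are hypothetical, and this is not the specific method of this repo):

```python
import torch

# Naive weight-space merging: element-wise average of two fine-tuned checkpoints
# with identical architectures (file names below are hypothetical).
state_a = torch.load("finetuned_task_a.pt", map_location="cpu")
state_b = torch.load("finetuned_task_b.pt", map_location="cpu")

merged = {name: (state_a[name] + state_b[name]) / 2.0 for name in state_a}
torch.save(merged, "merged_model.pt")
```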
-
- [LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day](https://arxiv.org/abs/2306.00890)
- [MEDITRON-70B: Scaling Medical Pretraining for Large Language Models](http…
-
[https://arxiv.org/pdf/2404.06512.pdf](https://arxiv.org/pdf/2404.06512.pdf)
[https://github.com/InternLM/InternLM-XComposer](https://github.com/InternLM/InternLM-XComposer)
### preview
- Health checkup …
-
Thanks for the repo and models! When trying to run demo.sh with the 34b model (after commenting and uncommenting the relevant lines), I am getting nonsense output (with the example video and prompt):
```
##…
-
Hi,
Thanks for sharing the model and code with us.
I am trying to use a Vision Foundation Model for a zero-shot classification problem.
It is possible with **OpenGVLab/InternVL-14B-224px** bu…
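For context, CLIP-style zero-shot classification scores an image against a text prompt per class and takes a softmax over the image-text similarities. Below is a minimal sketch using the generic CLIPModel/CLIPProcessor classes from Transformers; the openai/clip-vit-base-patch32 checkpoint and example.jpg are placeholders, and the loading code for InternVL-14B-224px may differ (e.g. it may require trust_remote_code), so treat these details as assumptions:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Illustrative CLIP-style checkpoint; swap in the vision foundation model under test.
model_id = "openai/clip-vit-base-patch32"
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # hypothetical input image
class_prompts = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

inputs = processor(text=class_prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax gives class probabilities.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(class_prompts, probs[0].tolist())))
```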