-
### Describe the issue
Issue:
We are trying to finetune the model on our dataset.
Currently, we are able to successfully finetune the model `lmsys/vicuna-13b-v1.5` using the projector weights `llava-v…
-
[paper](https://arxiv.org/abs/2304.08485)
## TL;DR
- **I read this because.. :** to read up before LLaVA 1.5
- **task :** chatting VLM
- **problem :** as with ChatGPT, instruction-following in the multi-modal setting…
-
I found `@torch.no_grad()` in `CLIPVisionTower.forward()`, so gradients won't flow to CLIP during training.
https://github.com/haotian-liu/LLaVA/blob/c121f0432da27facab705978f83c4ada465e46fd/llava/mo…
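For context, here is a minimal sketch (not the repo's exact class; the checkpoint name is only an assumed default) of how a `torch.no_grad()`-decorated forward keeps the CLIP encoder frozen:

```python
import torch
from transformers import CLIPVisionModel

class FrozenVisionTower(torch.nn.Module):
    """Illustrative frozen CLIP vision tower, not the repo's CLIPVisionTower."""

    def __init__(self, name="openai/clip-vit-large-patch14-336"):  # assumed checkpoint
        super().__init__()
        self.vision_tower = CLIPVisionModel.from_pretrained(name)
        self.vision_tower.requires_grad_(False)  # freeze the parameters as well

    @torch.no_grad()
    def forward(self, pixel_values):
        # Everything here runs without autograd tracking, so the returned
        # features carry no grad_fn and no gradient can reach CLIP's weights.
        outputs = self.vision_tower(pixel_values, output_hidden_states=True)
        return outputs.hidden_states[-2]  # penultimate-layer features, as LLaVA selects
```

Unfreezing CLIP would therefore require removing the decorator (and the `requires_grad_(False)` call), not just adding its parameters to the optimizer.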
-
[paper](https://arxiv.org/pdf/2310.03744.pdf)
see the LLaVA notes here: https://github.com/long8v/PTIR/issues/128#issue-1749571159
## TL;DR
- **I read this because.. :** aka LLaVA 1.5 / in ShareGPT4V, LL…
-
### Describe the issue
Issue:
I am finetuning llava1.5-7B on 8 × A100 40G, and I modified the batch size & gradient accumulation steps accordingly (see the batch-size sketch below).
The estimated training time is approx. 24h.
What could go wrong?
En…
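As a sanity check on the modified batch size and accumulation steps, a minimal sketch of the effective-batch-size arithmetic; the per-device value and accumulation steps below are assumed numbers, not taken from the issue:

```python
# Keep the effective (global) batch size equal to the reference recipe when
# changing the per-device batch size or GPU count, so the LR schedule still matches.
num_gpus = 8                # single node of A100-40G, as in the issue
per_device_batch_size = 4   # assumed value that fits in 40 GB
grad_accum_steps = 4        # assumed; chosen so the product stays at 128

global_batch_size = per_device_batch_size * grad_accum_steps * num_gpus
assert global_batch_size == 128  # global batch size used by the LLaVA-1.5 finetuning recipe
print(global_batch_size)
```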
-
# URL
- https://arxiv.org/abs/2310.03744
# Affiliations
- Haotian Liu, N/A
- Chunyuan Li, N/A
- Yuheng Li, N/A
- Yong Jae Lee, N/A
# Abstract
- Large multimodal models (LMM) have recently sh…
-
Hi Haotian,
OOM happened when I ran `finetune.sh` from `scripts/v1_5`. I used a single node with 8× A100-40G, **without NVLink**, to fine-tune a 7B LLaVA-1.5.
The estimated training time is ~24 hours whe…
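One common way to work around OOM on 40 GB cards is DeepSpeed ZeRO-3 with CPU offload. The dict below is only a hedged sketch of such a config (field names follow the DeepSpeed schema), not the repo's own ZeRO-3 JSON or the settings used in this run:

```python
import json

# Illustrative DeepSpeed ZeRO-3 config with CPU offload, assumed here as one way
# to fit a 7B model plus optimizer states on A100-40G.
zero3_offload = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

with open("zero3_offload.json", "w") as f:
    json.dump(zero3_offload, f, indent=2)  # pass this file via the --deepspeed flag
```

Offloading trades extra host-device traffic for memory headroom, so expect slower steps, especially without NVLink.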
-
### Question
1. Could you explain why the loss of LLaVA 1.5 is higher than LLaVA's (in both the pretraining and the Visual Instruction Tuning stages, I think), yet it achieves better results?
2. Also, why did the **spike**…
-
### Discussion
### LLaVA-Med V1.6: Training a Large Language-and-Vision Assistant for Biomedicine in Two and Half Hours
#### Abstract
Large Language Models (LLMs) have revolutionized natural la…
-
Hello. Thank you for your excellent work. I have some questions about the statements in the paper and hope to receive your answers. In Table 3, you compared the differences between your method and other…