haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
https://llava.hliu.cc
Apache License 2.0

[Usage] LLaVA-Gemma pretraining + fine-tuning usage issue and missing fine-tuned projector.bin #1548

Open nlpkiddo-2001 opened 3 weeks ago

nlpkiddo-2001 commented 3 weeks ago

Describe the issue

Issue: I first pretrained the projector using the CLIP vision encoder + Gemma model, and then fine-tuned Gemma and the projector together. No matter what I try, the model produces incorrect outputs. The loss hovers around 1–2 during projector pretraining and 0.4–0.7 during fine-tuning. I also tried without LoRA.
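As a sanity check, I verified that the projector checkpoint written by the pretraining stage loads cleanly before handing it to the fine-tuning stage via `--pretrain_mm_mlp_adapter`. A minimal sketch (the checkpoint paths are placeholders for my actual runs):

```python
import torch

# Sanity check (paths are placeholders): confirm the projector checkpoint
# written by the pretraining stage loads and contains the expected weights
# before passing it to fine-tuning via --pretrain_mm_mlp_adapter.
proj = torch.load("checkpoints/llava-gemma-pretrain/mm_projector.bin",
                  map_location="cpu")
print(sorted(proj.keys()))
# I see keys along the lines of:
# ['model.mm_projector.0.bias', 'model.mm_projector.0.weight', ...]
```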

Screenshots:

[Screenshot: 2024-06-07, 9:23 PM]

My Gemma setup is similar to the one in this PR: https://github.com/haotian-liu/LLaVA/pull/1247. Kindly assist me.
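On the missing fine-tuned projector.bin: my understanding is that a separate mm_projector.bin is only written during the pretraining stage (when only the adapter is tuned), and after full fine-tuning the projector weights live inside the model checkpoint. This sketch, assuming a pytorch_model.bin-style save (the path is a placeholder), is how I pull them back out:

```python
import torch

# Sketch, assuming a pytorch_model.bin-style save (path is a placeholder):
# after full fine-tuning, the projector weights sit inside the full model
# checkpoint rather than a separate mm_projector.bin, so filter them by key.
state = torch.load("checkpoints/llava-gemma-finetune/pytorch_model.bin",
                   map_location="cpu")
projector = {k: v for k, v in state.items() if "mm_projector" in k}
torch.save(projector, "checkpoints/llava-gemma-finetune/mm_projector.bin")
```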

Screenshot of the fine-tuning run from wandb:

[Screenshot: 2024-06-07, 9:25 PM]