long8v / PTIR

Paper Today I Read

[140] Improved Baselines with Visual Instruction Tuning #152

Open long8v opened 6 months ago

long8v commented 6 months ago
image

paper

See the earlier LLaVA notes here: https://github.com/long8v/PTIR/issues/128#issue-1749571159

TL;DR

Details

contribution

image

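Strong performance with minimal tuning: training uses about 1.2M publicly available samples and finishes in 8 A100-days (roughly one day on a single 8-A100 node).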

Dataset

Improved baseline of LLaVA

image

Result / Ability

image

LLaVA gives strange answers.

image

long8v commented 4 months ago

LLaVA-NeXT https://llava-vl.github.io/blog/2024-01-30-llava-next/