Adds Llama-2 Chat, Mistral v0.1, Mistral v0.1 Instruct, and Phi-2 LLMs. Note that these model configs match the structure of our paper (one-off changes on top of the `One_Stage` base configuration). All of these models can be improved by training with the "Prism" configuration (extra data, DINO + SigLIP backbones, etc.).
Evaluation Results:
| Model | VQAv2 | GQA | VizWiz | TextVQA (Pure / OCR) | RefCOCO+ | OCID-Ref | VSR | POPE | TallyQA |
|---|---|---|---|---|---|---|---|---|---|
| LLaVa v1.5 7B (Base) | 76.54 | 61.58 | 54.25 | 46.13 / 58.25 | 49.47 | 35.07 | 51.47 | 86.57 | 62.06 |
| Llama-2 Chat 7B | 76.92 | 62.11 | 56.39 | 45.3 / 56.6 | 58.5 | 46.3 | 61.8 | 86.8 | 58.1 |
| Llama-2 Chat 13B | 78.0 | 63.60 | 56.43 | 57.2 / 58.4 | 62.9 | 44.9 | 71.4 | 86.8 | 58.9 |
| Mistral v0.1 7B | 77.30 | 63.30 | 55.32 | 44.4 / 49.3 | 65.1 | 48.8 | 58.5 | 87.1 | 61.7 |
| Mistral Instruct v0.1 7B | 77.13 | 62.71 | 54.35 | 44.1 / 50.5 | 64.9 | 48.0 | 57.8 | 87.5 | 64.5 |
| Phi-2 3B | 41.47 | 33.38 | 12.18 | 6.6 / 31.0 | 5.7 | 1.5 | 48.7 | 48.2 | 20.2 |
| Llama-2 (Best LLM from Paper) | 77.08 | 62.44 | 55.98 | 44.92 / 55.24 | 59.47 | 43.89 | 63.67 | 86.74 | 59.22 |
| Prism DINOSigLIP 7B (Controlled) | 79.05 | 64.16 | 59.82 | 51.78 / 58.69 | 67.85 | 50.56 | 66.28 | 88.28 | 65.07 |
Note that the Phi-2 results are fairly poor; it would be good to dig into this (I may have gotten the prompting scheme wrong).
Hopefully, this PR also serves as a template for folks looking to add their own LLMs to Prismatic -- low-hanging fruit includes adding Gemma, Llama-3, and Phi-3 LLMs!
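For anyone using this PR as a template, the general shape of the change is "describe the new backbone once, then reference it by ID from a training config." The sketch below illustrates that registry pattern in isolation; the class and field names (`LLMConfig`, `LLM_REGISTRY`, `register_llm`, etc.) are illustrative stand-ins, not Prismatic's actual API -- see the files touched in this diff for the real registration points.

```python
# Hypothetical sketch of the "register a new LLM backbone" pattern.
# All names here are illustrative, NOT Prismatic's real API.
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMConfig:
    """Minimal descriptor for an LLM backbone (illustrative fields only)."""

    llm_backbone_id: str  # key that model configs reference, e.g. "phi-2-3b"
    hf_hub_path: str      # HuggingFace Hub repo to pull weights from
    context_length: int   # max sequence length the backbone supports


LLM_REGISTRY: dict[str, LLMConfig] = {}


def register_llm(cfg: LLMConfig) -> None:
    """Add a backbone so training configs can look it up by ID."""
    if cfg.llm_backbone_id in LLM_REGISTRY:
        raise ValueError(f"Duplicate backbone ID: {cfg.llm_backbone_id}")
    LLM_REGISTRY[cfg.llm_backbone_id] = cfg


# Example: registering one of the backbones added in this PR.
register_llm(LLMConfig("phi-2-3b", "microsoft/phi-2", context_length=2048))
print(LLM_REGISTRY["phi-2-3b"].hf_hub_path)
```

Keeping the registry as a flat ID-to-config map is what makes the one-off config changes in this PR small: each new model is a new entry plus a config that names it, with no changes to the training loop itself.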
CC @shikhar-srivastava @zeyuanyin @Hannibal046 @RylanSchaeffer
Resolves #6 Resolves #25