Closed dragen1860 closed 5 months ago
Dear all: why do stage 1 and stage 2 use different --version parameters?
--version plain_guided
--version imgsp_v1
Thank you.
Hi, because in stage 1 we do not append instructions before (or after) the image tokens passed to the LLM, following LLaVA. In stage 2, we append the user instructions in each conversation turn.
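To make the difference concrete, here is a minimal Python sketch of the two prompt styles described above. It is not the repository's actual template code: the function names, the `<image>` placeholder string, and the `USER:` / `ASSISTANT:` role markers are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical placeholder that stands in for the image tokens.
IMAGE_TOKEN = "<image>"


def build_plain_prompt(caption: str) -> str:
    """Stage-1 style (assumed): image tokens followed directly by the
    target text, with no instruction before or after them."""
    return f"{IMAGE_TOKEN}\n{caption}"


def build_conversation_prompt(turns: List[Tuple[str, str]]) -> str:
    """Stage-2 style (assumed): every turn carries a user instruction.
    The image placeholder is attached to the first turn only."""
    parts = []
    for i, (instruction, answer) in enumerate(turns):
        user = f"{IMAGE_TOKEN}\n{instruction}" if i == 0 else instruction
        parts.append(f"USER: {user} ASSISTANT: {answer}")
    return "\n".join(parts)
```

Under these assumptions, the plain prompt contains only the image placeholder and the target text, while the conversation prompt repeats the instruction/response pattern once per turn. That structural difference is why the two stages would need different template versions.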