Special tokens:
Why does modeling_internlm_xcomposer2.py use [UNUSED_TOKEN_146] and [UNUSED_TOKEN_145] when these two are not in the special tokens? The special tokens are <|im_start|> and <|im_end|> instead.
Image 2D Structure Newline Indicator
It seems that the 'Image 2D Structure Newline Indicator' (\n) token and the separate token mentioned in the paper do not appear in the code in modeling_internlm_xcomposer2.py.
LLM
InternLM-XComposer2 VL uses InternLM2-7B-Chat-SFT as the LLM. What is the reason for choosing this model? Have you conducted experiments on InternLM2-Chat-7B?
We map <|im_start|> and <|im_end|> to [UNUSED_TOKEN_146] and [UNUSED_TOKEN_145], which are predefined in the vocabulary. Both formats work equally well in practice.
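To make the mapping concrete, here is a minimal sketch of the substitution described in the reply. It is plain string replacement, not the repository's actual tokenizer code; the names `CHAT_TO_VOCAB` and `to_vocab_markers` are hypothetical and introduced only for illustration.

```python
# Hypothetical illustration: rewrite chat markers as the vocabulary's
# predefined placeholder tokens, following the pairing given in the reply.
CHAT_TO_VOCAB = {
    "<|im_start|>": "[UNUSED_TOKEN_146]",
    "<|im_end|>": "[UNUSED_TOKEN_145]",
}


def to_vocab_markers(prompt: str) -> str:
    """Return `prompt` with each chat marker replaced by its placeholder token."""
    for chat_marker, vocab_token in CHAT_TO_VOCAB.items():
        prompt = prompt.replace(chat_marker, vocab_token)
    return prompt


if __name__ == "__main__":
    print(to_vocab_markers("<|im_start|>user\nDescribe the image.<|im_end|>\n"))
```

Under this reading the two spellings are aliases for the same markers, which matches the statement that both formats behave the same in practice.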
In the code, the separate token is plora_glb_GN and the \n token is plora_sub_GN. We will clarify the names in a following update.
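The role of the two tokens can be sketched as follows. This is only an illustration of the layout the question refers to, under the assumption that the newline indicator is appended after each row of a 2D feature grid before flattening and that the separate token sits between the global and local views. The function names and the string placeholders standing in for feature vectors are hypothetical, not taken from modeling_internlm_xcomposer2.py.

```python
# Hypothetical sketch: strings stand in for feature vectors, and the two
# learnable tokens are represented by their names in the code.
def add_row_newlines(grid, newline):
    """Flatten a 2D grid row by row, appending `newline` after every row."""
    flat = []
    for row in grid:
        flat.extend(row)
        flat.append(newline)
    return flat


def merge_views(global_grid, local_grid,
                newline="plora_sub_GN", separator="plora_glb_GN"):
    """Concatenate the flattened global and local views with a separator between them."""
    return (add_row_newlines(global_grid, newline)
            + [separator]
            + add_row_newlines(local_grid, newline))


if __name__ == "__main__":
    global_view = [["g0", "g1"]]
    local_view = [["a", "b"], ["c", "d"]]
    print(merge_views(global_view, local_view))
```

The row-end marker is what lets a flattened sequence keep track of where each image row stops, which matters when the input resolution and aspect ratio vary.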
The previous XComposer2 used InternLM2-7B-Chat-SFT as the backbone because the PPO version (-Chat) was not ready at that time, and we kept the backbone unchanged for the 4KHD version.
internlm/internlm-xcomposer2-4khd-7b is excellent work, and I have a few questions about it.