llava special tokens - GitHub Issues

KooSung commented 6 months ago

How do I add special tokens when running SFT with llava? Specifically, after calling tokenizer.add_tokens, what needs to change on the model side?

hhaAndroid commented 6 months ago

@KooSung Are you asking how to save the special tokens after adding them so that they take effect, or something else?

KooSung commented 6 months ago

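@hhaAndroid I am continuing training from an already-trained xtuner llava checkpoint, and this stage adds some special tokens. Loading the earlier weights then fails with a shape mismatch. How should I change the code so that training can run?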

RuntimeError: Error(s) in loading state_dict for LLaVAModel:
        size mismatch for llm.base_model.model.output.lora_B.default.weight: copying a param with shape torch.Size([92544, 512]) from checkpoint, the shape in current model is torch.Size([92547, 512]).

My current changes are as follows. In dataset/llava.py I added the special tokens:

        for special_token in special_tokens:
            if special_token not in tokenizer.get_vocab():
                tokenizer.add_tokens([special_token], special_tokens=True)

In model/llava.py I resize the token embeddings:

    def __init__(self,
                 llm,
                 visual_encoder,
                 freeze_llm=False,
                 freeze_visual_encoder=False,
                 visual_select_layer=-2,
                 pretrained_pth=None,
                 projector_depth=2,
                 llm_lora=None,
                 visual_encoder_lora=None,
                 use_activation_checkpointing=True):
        super().__init__()
        self.freeze_llm = freeze_llm
        self.freeze_visual_encoder = freeze_visual_encoder
        with LoadWoInit():
            self.llm = self._build_from_cfg_or_module(llm)
            # Resize to the new vocab size: 92544 original + 3 added special tokens.
            self.llm.resize_token_embeddings(92547)
            self.visual_encoder = self._build_from_cfg_or_module(
                visual_encoder)
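
Note that the resize above only changes the model side. The checkpoint still stores 92544 rows, which is what the error message reports. A minimal sketch for listing such mismatches before loading is shown here. The helper name `find_shape_mismatches` is hypothetical (not part of xtuner), and it works on plain name-to-shape dicts so it runs without a real checkpoint:

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """List (name, checkpoint_shape, model_shape) for shared keys whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Keys missing from the model are skipped; only shape conflicts are reported.
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Shapes taken from the error message in this thread.
checkpoint_shapes = {
    "llm.base_model.model.output.lora_B.default.weight": (92544, 512),
}
model_shapes = {
    "llm.base_model.model.output.lora_B.default.weight": (92547, 512),
}

for name, ckpt, model in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

With real tensors, the two dicts can be built with `{k: tuple(v.shape) for k, v in state_dict.items()}` from the loaded checkpoint and from `model.state_dict()`.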

KooSung commented 6 months ago

Solved it by following this: https://github.com/InternLM/xtuner/blob/aac7f578e0a9e64129abb4c9d06659bb04e7eb19/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/convert_xtuner_weights_to_hf.py#L86-L115

Adding the shape adjustment at this point is enough: https://github.com/InternLM/xtuner/blob/aac7f578e0a9e64129abb4c9d06659bb04e7eb19/xtuner/model/llava.py#L87
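
One possible reading of "shape adjustment" is to pad the checkpoint tensors along the vocab dimension before they are passed to `load_state_dict`. The sketch below is an illustration under that assumption, not the exact change made in this thread. The helper `pad_vocab_rows` is hypothetical, it only handles tensors whose first dimension equals the old vocab size, and it fills the new rows with zeros (for a LoRA B matrix this leaves the new tokens with no LoRA update at the start; other initializations, such as the mean of the existing rows, are also plausible):

```python
import torch


def pad_vocab_rows(state_dict, old_vocab, new_vocab):
    """Return a copy of state_dict with vocab-sized tensors padded to new_vocab rows.

    Hypothetical helper: only tensors whose dim 0 equals old_vocab are touched.
    """
    padded = {}
    for name, tensor in state_dict.items():
        if tensor.dim() >= 1 and tensor.shape[0] == old_vocab:
            # New rows are zero-initialized and keep the dtype/device of the original.
            extra = tensor.new_zeros((new_vocab - old_vocab, *tensor.shape[1:]))
            padded[name] = torch.cat([tensor, extra], dim=0)
        else:
            padded[name] = tensor
    return padded


# Small stand-in shapes; the thread's real case is 92544 -> 92547 rows.
old_state = {
    "llm.base_model.model.output.lora_B.default.weight": torch.ones(8, 4),
    "projector.weight": torch.ones(3, 4),
}
new_state = pad_vocab_rows(old_state, old_vocab=8, new_vocab=11)
for name, tensor in new_state.items():
    print(name, tuple(tensor.shape))
```

The padded dict would then be loaded in place of the original checkpoint dict, after the model itself has been resized to the same vocab size.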

InternLM / xtuner

llava special tokens #630