Closed · KooSung closed this issue 6 months ago
@KooSung Do you mean how to save the model so the added special tokens take effect, or something else?
@hhaAndroid I am training further on top of an already-trained xtuner LLaVA, and this step adds some special tokens. But loading the previous weights raises a dimension mismatch. What do I need to change so training can proceed?
RuntimeError: Error(s) in loading state_dict for LLaVAModel:
size mismatch for llm.base_model.model.output.lora_B.default.weight: copying a param with shape torch.Size([92544, 512]) from checkpoint, the shape in current model is torch.Size([92547, 512]).
Current changes: added the special tokens in dataset/llava.py
for special_token in special_tokens:
    if special_token not in tokenizer.get_vocab():
        tokenizer.add_tokens([special_token], special_tokens=True)
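The loop above can also be wrapped in a small helper that batches the missing tokens into a single `add_tokens` call; the helper name and token strings below are our own illustration, not xtuner API:

```python
def add_missing_special_tokens(tokenizer, special_tokens):
    """Add only the special tokens not already in the vocab.

    Works with any tokenizer exposing get_vocab() and add_tokens(),
    e.g. a Hugging Face tokenizer. Returns how many tokens were added,
    which is exactly the amount the embedding table must grow by.
    """
    new = [t for t in special_tokens if t not in tokenizer.get_vocab()]
    if new:
        tokenizer.add_tokens(new, special_tokens=True)
    return len(new)
```

The return value is useful later: `len(tokenizer)` after this call is the size to pass to `resize_token_embeddings`.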
Resized the embeddings in model/llava.py:
def __init__(self,
             llm,
             visual_encoder,
             freeze_llm=False,
             freeze_visual_encoder=False,
             visual_select_layer=-2,
             pretrained_pth=None,
             projector_depth=2,
             llm_lora=None,
             visual_encoder_lora=None,
             use_activation_checkpointing=True):
    super().__init__()
    self.freeze_llm = freeze_llm
    self.freeze_visual_encoder = freeze_visual_encoder
    with LoadWoInit():
        self.llm = self._build_from_cfg_or_module(llm)
        self.llm.resize_token_embeddings(92547)  # resize: 92544 base vocab + 3 new special tokens
        self.visual_encoder = self._build_from_cfg_or_module(
            visual_encoder)
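What `resize_token_embeddings(92547)` does under the hood can be sketched in plain PyTorch: the old rows are copied over, and the extra rows start from fresh initialization. This is a minimal illustration (our own function name), not the transformers implementation:

```python
import torch
import torch.nn as nn

def resize_embedding(old_emb: nn.Embedding, new_num_tokens: int) -> nn.Embedding:
    """Sketch of what resize_token_embeddings does to the input embeddings:
    allocate a larger table, copy the existing rows, and leave the rows for
    the newly added tokens at their default (random) initialization."""
    new_emb = nn.Embedding(new_num_tokens, old_emb.embedding_dim)
    n = min(old_emb.num_embeddings, new_num_tokens)
    with torch.no_grad():
        new_emb.weight[:n] = old_emb.weight[:n]
    return new_emb
```

This is also why hard-coding 92547 is fragile; `self.llm.resize_token_embeddings(len(tokenizer))` after `add_tokens` keeps the two sizes in sync automatically.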
Solved it by referring to this: https://github.com/InternLM/xtuner/blob/aac7f578e0a9e64129abb4c9d06659bb04e7eb19/xtuner/configs/llava/llama3_8b_instruct_clip_vit_large_p14_336/convert_xtuner_weights_to_hf.py#L86-L115
Adding the dimension transformation here is enough: https://github.com/InternLM/xtuner/blob/aac7f578e0a9e64129abb4c9d06659bb04e7eb19/xtuner/model/llava.py#L87
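The dimension transformation can be sketched as zero-padding the vocab dimension of the checkpoint tensor before `load_state_dict`, so that a `[92544, 512]` `lora_B` matches the resized `[92547, 512]` model. Zero rows are a natural choice for `lora_B`, since LoRA's B matrix is zero-initialized anyway. The helper name below is our own:

```python
import torch

def pad_vocab_rows(state_dict, key, new_rows):
    """Zero-pad the first (vocab) dimension of a checkpoint tensor so it
    matches the resized model, e.g. lora_B [92544, 512] -> [92547, 512].
    Existing rows are kept untouched; only new rows are appended."""
    w = state_dict[key]
    old_rows = w.shape[0]
    if old_rows < new_rows:
        pad = torch.zeros(new_rows - old_rows, *w.shape[1:], dtype=w.dtype)
        state_dict[key] = torch.cat([w, pad], dim=0)
    return state_dict
```

The same treatment applies to any other vocab-sized tensors in the checkpoint (input embeddings, lm_head / output weights) that trip the size-mismatch error.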
When doing SFT with llava, how do you add special tokens? Mainly: after tokenizer.add_tokens, what needs to be done on the model side?