OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
https://internvl.readthedocs.io/en/latest/
MIT License
6.02k stars 467 forks

Weights initialized with nn.init in InternVLChatModel are incorrect #574

Closed galactic123 closed 2 months ago

galactic123 commented 2 months ago

In internvl_chat/internvl/model/internvl_chat/modeling_internvl_chat.py, initializing a weight with an nn.init function produces incorrect values. For example:

```
val = torch.empty(config.llm_config.hidden_size, config.llm_config.hidden_size)
nn.init.normal_(val)
```

The result is:

tensor([[-2.0000e+00, 1.1200e+04, -5.1200e+02, 1.1200e+04],
        [-1.3107e+05, 1.1200e+04, -3.3554e+07, 1.1200e+04],
        [-8.5899e+09, 1.1200e+04, -2.1990e+12, 1.1200e+04],
        ...,
        [-7.1054e-15, 1.5296e+04, -1.8190e-12, 1.5296e+04],
        [-4.6566e-10, 1.5296e+04, -1.1921e-07, 1.5296e+04],
        [-3.0518e-05, 1.5296e+04, -7.8125e-03, 1.5296e+04]], requires_grad=True)

Weiyun1025 commented 2 months ago

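This happens because from_pretrained disables the initialization functions while it constructs the model and only reads parameters from the checkpoint. A tensor created with torch.empty therefore keeps whatever values were already in memory. You can re-initialize this parameter from outside the model once the model has finished initializing.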
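A minimal sketch of the workaround suggested above, assuming the extra weight is registered as a parameter on the model. `ToyModel`, `extra_weight`, and `reinit_normal_` are hypothetical names used only for illustration and are not part of InternVL. The toy module stands in for the model returned by from_pretrained, and the defaults mirror those of `nn.init.normal_` (mean 0, std 1), matching the call in the original report.

```python
import torch
from torch import nn


class ToyModel(nn.Module):
    """Hypothetical stand-in for a model that carries a newly added parameter."""

    def __init__(self, hidden_size: int = 64):
        super().__init__()
        # torch.empty allocates memory without writing to it, so the values
        # are arbitrary until an init function actually runs on the tensor.
        self.extra_weight = nn.Parameter(torch.empty(hidden_size, hidden_size))


def reinit_normal_(param: torch.Tensor, mean: float = 0.0, std: float = 1.0) -> torch.Tensor:
    """Re-draw `param` in place from a normal distribution with the given mean and std."""
    with torch.no_grad():
        return nn.init.normal_(param, mean=mean, std=std)


torch.manual_seed(0)

# In practice this would be the model returned by from_pretrained(...).
model = ToyModel()

# Re-initialize from outside the model, after construction has finished.
reinit_normal_(model.extra_weight)
```

Because the call runs after loading, it is not affected by whatever from_pretrained does to the init functions during construction. Only apply it to parameters that are absent from the checkpoint; otherwise it overwrites the loaded weights.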