baichuan-inc / Baichuan-7B

A large-scale 7B pretraining language model developed by BaiChuan-Inc.
https://huggingface.co/baichuan-inc/baichuan-7B
Apache License 2.0

[Question] Model parameter question #103


L-hongbin commented 1 year ago


Questions

Why are the values of lm_head.weight and model.embed_tokens.weight different? In pretraining, model.embed_tokens.weight is usually reused (tied) to project hidden states back onto the vocabulary, so I would expect the two to be identical. Does this mean model.embed_tokens.weight and lm_head.weight were kept as separate, untied parameters during pretraining?
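For reference, here is a minimal sketch of how one can check whether the two matrices are tied, assuming the transformers library and that the remote-code model class for baichuan-inc/baichuan-7B exposes the standard get_input_embeddings / get_output_embeddings accessors (an assumption, not confirmed by the repo):

```python
# Minimal sketch: compare the input embedding and the LM head weights.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/baichuan-7B",   # public checkpoint; requires remote code
    trust_remote_code=True,
    torch_dtype=torch.float16,
)

embed = model.get_input_embeddings().weight   # model.embed_tokens.weight
head = model.get_output_embeddings().weight   # lm_head.weight

# If the weights are tied, both names point to the same storage (or are at
# least numerically equal); if they were trained separately, they differ.
print("same storage:", embed.data_ptr() == head.data_ptr())
print("numerically equal:", torch.equal(embed.float(), head.float()))
```

If both checks print False, the embedding and output projection were indeed trained as independent parameters rather than tied.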
