OpenBMB / MiniCPM

MiniCPM-2B: An end-side LLM outperforming Llama2-13B.
Apache License 2.0

Why are both called MiniCPM when one has a 40-layer decoder and the other has 52 layers? #128

Closed. CHNtentes closed this issue 1 month ago.

CHNtentes commented 2 months ago

The model downloaded from https://huggingface.co/openbmb/MiniCPM-2B-128k has 40 layers, but the model installed via the Android APK shows 52 layers:

```
./ndarray-cache.json: "name": "model.layers.51.input_layernorm.weight",
./ndarray-cache.json: "name": "model.layers.51.self_attn.qkv_proj.q_weight",
./ndarray-cache.json: "name": "model.layers.51.self_attn.qkv_proj.q_scale",
./ndarray-cache.json: "name": "model.layers.51.self_attn.qkv_proj.q_zero",
./ndarray-cache.json: "name": "model.layers.51.self_attn.o_proj.q_weight",
./ndarray-cache.json: "name": "model.layers.51.self_attn.o_proj.q_scale",
./ndarray-cache.json: "name": "model.layers.51.self_attn.o_proj.q_zero",
./ndarray-cache.json: "name": "model.layers.51.post_attention_layernorm.weight",
./ndarray-cache.json: "name": "model.layers.51.mlp.gate_up_proj.q_weight",
./ndarray-cache.json: "name": "model.layers.51.mlp.gate_up_proj.q_scale",
./ndarray-cache.json: "name": "model.layers.51.mlp.gate_up_proj.q_zero",
./ndarray-cache.json: "name": "model.layers.51.mlp.down_proj.q_weight",
./ndarray-cache.json: "name": "model.layers.51.mlp.down_proj.q_scale",
./ndarray-cache.json: "name": "model.layers.51.mlp.down_proj.q_zero",
```
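A quick way to tally the distinct decoder-layer indices in that file, rather than eyeballing grep output, is a minimal Python sketch like the one below (the file path is an assumption; point it at the ndarray-cache.json inside the compiled model directory):

```python
import re

# Scan ndarray-cache.json for parameter names like "model.layers.<N>...."
# and count the distinct layer indices, mirroring the grep above.
# The path "ndarray-cache.json" is an assumption; adjust as needed.
text = open("ndarray-cache.json").read()
layers = {int(i) for i in re.findall(r'"model\.layers\.(\d+)\.', text)}
print(f"decoder layers: {len(layers)} (highest index: {max(layers)})")
```

For the APK model this reports 52 layers (highest index 51), whereas the MiniCPM-2B-128k checkpoint from Hugging Face has 40.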

Achazwl commented 1 month ago

APK 2.0 corresponds to MiniCPM-1B, while APK 1.0 corresponds to MiniCPM-2B (not the 128k version). MiniCPM-1B has 52 decoder layers and MiniCPM-2B has 40, so the 52-layer model in the APK is MiniCPM-1B, not the MiniCPM-2B-128k checkpoint you downloaded.
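The layer counts of the two checkpoints can be confirmed directly from their Hugging Face configs. A minimal sketch, assuming the transformers library is installed and that the 1B checkpoint is the one published as openbmb/MiniCPM-1B-sft-bf16:

```python
from transformers import AutoConfig

# Compare decoder depth of the two checkpoints. The 1B repo ID is an
# assumption based on the public openbmb namespace; trust_remote_code
# is needed because MiniCPM ships a custom model implementation.
for repo in ["openbmb/MiniCPM-2B-128k", "openbmb/MiniCPM-1B-sft-bf16"]:
    cfg = AutoConfig.from_pretrained(repo, trust_remote_code=True)
    print(repo, "->", cfg.num_hidden_layers, "layers")
```

If the repo IDs are right, this should print 40 layers for the 2B-128k model and 52 for the 1B model, matching the counts observed in the issue.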