Open · simonwang517 opened 3 months ago
When I add the following code to train.py (in the init_model function) to count the LLM parameters:
```python
params_llm = 0
for k, v in llm_model.named_parameters():
    print(k, v.shape, v.numel())
    params_llm += v.numel()
print('llm total params: {:.2f}M'.format(params_llm / 1024 / 1024))
```
I always get:
```
transformer.wte.weight torch.Size([0]) 0
transformer.h.0.ln_1.weight torch.Size([0]) 0
transformer.h.0.attn.c_attn.weight torch.Size([0]) 0
transformer.h.0.attn.c_attn.bias torch.Size([0]) 0
transformer.h.0.attn.c_proj.weight torch.Size([0]) 0
transformer.h.0.ln_2.weight torch.Size([0]) 0
transformer.h.0.mlp.w1.weight torch.Size([0]) 0
transformer.h.0.mlp.w2.weight torch.Size([0]) 0
transformer.h.0.mlp.c_proj.weight torch.Size([0]) 0
transformer.h.1.ln_1.weight torch.Size([0]) 0
transformer.h.1.attn.c_attn.weight torch.Size([0]) 0
transformer.h.1.attn.c_attn.bias torch.Size([0]) 0
transformer.h.1.attn.c_proj.weight torch.Size([0]) 0
...
```
Is this normal? Will it affect model training?
I'm not sure. Where did you put the code?
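For what it's worth, parameter shapes of torch.Size([0]) are what DeepSpeed ZeRO stage 3 reports for partitioned weights: each rank holds only a shard locally, and the full element count lives in the `ds_numel` attribute that ZeRO-3 attaches to every parameter. Assuming that is what this training script uses (an assumption; the snippet above doesn't show the launcher), here is a counting sketch that works in both the sharded and unsharded cases:

```python
# Sketch, assuming DeepSpeed ZeRO-3 may be partitioning the model:
# partitioned parameters report a local shape of torch.Size([0]),
# but ZeRO-3 keeps the true element count in `ds_numel`.
def count_llm_params(model):
    total = 0
    for name, param in model.named_parameters():
        # Fall back to the ordinary count when the model is not sharded.
        numel = param.ds_numel if hasattr(param, 'ds_numel') else param.numel()
        print(name, numel)
        total += numel
    return total

print('llm total params: {:.2f}M'.format(count_llm_params(llm_model) / 1e6))
```

Dividing by 1e6 reports true millions of parameters; the original 1024 / 1024 divisor understates the count by about 5%.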