Open shockjiang opened 5 months ago
The issue might be due to the local model not being initialized correctly.
Before loading the checkpoint, check whether the model contains the key `llm.model.layers.0.self_attn.o_proj.weight`.
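A minimal sketch of that check, in case it helps. The helper name `missing_keys` and the demo key list are made up for illustration; in a real run the key names would come from something like `model.state_dict().keys()` on the freshly built model, before the checkpoint is loaded.

```python
from typing import Iterable, List

# The key mentioned above.
KEY = "llm.model.layers.0.self_attn.o_proj.weight"


def missing_keys(model_keys: Iterable[str], wanted: Iterable[str]) -> List[str]:
    """Return the entries of `wanted` that are absent from `model_keys`."""
    have = set(model_keys)
    return [k for k in wanted if k not in have]


# Stand-in for model.state_dict().keys(); these names are illustrative only.
demo_keys = [
    "llm.model.layers.0.self_attn.q_proj.weight",
    KEY,
]

absent = missing_keys(demo_keys, [KEY])
print("missing:", absent)
```

If the list comes back non-empty for the real model, the model was probably not built the way the checkpoint expects.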
@shockjiang Can you try this in https://github.com/InternLM/xtuner/blob/main/xtuner/configs/deepspeed/deepspeed_zero3.json ?
{
  "zero_optimization": {
    "stage3_prefetch_bucket_size": 0
  }
}
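If editing the JSON by hand is inconvenient, the same override can be applied to a config that has already been loaded into a dict. This is only a sketch: the helper name `disable_stage3_prefetch` is hypothetical, and the `example` dict is a stand-in, not the actual contents of `deepspeed_zero3.json`.

```python
import json


def disable_stage3_prefetch(config: dict) -> dict:
    """Return a copy of `config` with stage3_prefetch_bucket_size set to 0."""
    # Round-trip through JSON to get a deep copy and leave the input untouched.
    patched = json.loads(json.dumps(config))
    patched.setdefault("zero_optimization", {})["stage3_prefetch_bucket_size"] = 0
    return patched


# Stand-in config; the real file has more fields than this.
example = {"zero_optimization": {"stage": 3, "overlap_comm": True}}

print(json.dumps(disable_stage3_prefetch(example), indent=2))
```

The existing `zero_optimization` entries are kept, and only the one field is added or overwritten.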
I tried to load the pretrained llava pth, hub/llava-phi-3-mini-pth/model.pth, and I got this strange error:
any clue? thx!