The most convenient way to do this is to clone the model from the Hugging Face model hub and replace the pytorch_model.bin file.
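A minimal sketch of that clone-and-replace step, assuming the hub repo id MAGAer13/mplug-owl-llama-7b-ft and a local checkpoint path; substitute your own:

```python
import shutil
from huggingface_hub import snapshot_download

# Fetch a local, editable copy of the base model repository.
local_dir = snapshot_download(
    "MAGAer13/mplug-owl-llama-7b-ft", local_dir="./mplug-owl-llama-7b-ft"
)

# Overwrite the published weights with the fine-tuned ones. If the hub copy
# is sharded (pytorch_model-0000x-of-0000y.bin plus pytorch_model.bin.index.json),
# delete the shards and the index file so only your pytorch_model.bin is picked up.
shutil.copy("./output/checkpoint-500/pytorch_model.bin",
            f"{local_dir}/pytorch_model.bin")
```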
This doesn't seem to work for me. I fine-tuned mplug-owl-llama-7b-ft with LoRA, deleted pytorch_model-00001-of-00002.bin and pytorch_model-00002-of-00002.bin from mplug-owl-llama-7b-ft, and replaced them with the pytorch_model.bin from checkpoint-500. But it produces this warning and error:
Some weights of the model checkpoint at ./ckpt_500/ were not used when initializing MplugOwlForConditionalGeneration: ['base_model.model.vision_model.encoder.layers.19.mlp.fc1.weight', 'base_model.model.vision_model.encoder.layers.0.input_layernorm.bias', 'base_model.model.language_model.model.layers.10.self_attn.q_proj.weight', 'base_model.model.language_model.model.layers.13.self_attn.v_proj.weight', 'base_model.model.language_model.model.layers.13.input_layernorm.weight', 'base_model.model.language_model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.language_model.model.layers.15.mlp.up_proj.weight', 'base_model.model.abstractor.encoder.layers.2.crossattention.output.norm2.weight', 'base_model.model.vision_model.encoder.layers.13.mlp.fc2.bias', 'base_model.model.vision_model.encoder.layers.17.post_attention_layernorm.weight', 'base_model.model.language_model.model.layers.27.self_attn.q_proj.weight', 'base_model.model.abstractor.encoder.layers.5.crossattention.attention.value.weight', 'base_model.model.abstractor.encoder.layers.5.crossattention.output.mlp.w3.bias', 'base_model.model.language_model.model.layers.17.mlp.gate_proj.weight', 'base_model.model.abstractor.encoder.layers.2.crossattention.output.norm2.bias', 'base_model.model.abstractor.encoder.layers.5.crossattention.output.mlp.ffn_ln.weight', 'base_model.model.language_model.model.layers.3.self_attn.k_proj.weight', 'base_model.model.vision_model.encoder.layers.9.post_attention_layernorm.bias', 'base_model.model.language_model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.language_model.model.layers.31.mlp.down_proj.weight', 'base_model.model.vision_model.encoder.layers.8.post_attention_layernorm.bias', 'base_model.model.vision_model.encoder.layers.5.mlp.fc1.weight', 'base_model.model.vision_model.encoder.layers.21.mlp.fc1.bias', 'base_model.model.vision_model.encoder.layers.19.self_attn.query_key_value.weight', 'base_model.model.abstractor.encoder.layers.1.crossattention.output.out_proj.weight', 'base_model.model.vision_model.encoder.layers.0.self_attn.dense.bias', 'base_model.model.language_model.model.layers.0.self_attn.v_proj.weight', 'base_model.model.language_model.model.layers.14.mlp.up_proj.weight', .......
File "/checkpoint/binary/train_package/inference_mdl.py", line 183, in
I got those warnings too, but it was still able to run inference; the result, however, was random tokens that didn't make any sense.
Why did you close the issue? Did you ever figure out the problem?
Any updates on this?
You can refer here for a solution to the problem.
Same error! Did you solve it?
During training, checkpoint files such as "checkpoint-500" are generated. How should we use them for inference?
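For reference, a common pattern (an assumption based on the repo's `mplug_owl` package layout, not a documented recipe) is to merge or copy the checkpoint weights into a local clone of the base model as described above, then load that directory directly:

```python
import torch
from mplug_owl.modeling_mplug_owl import MplugOwlForConditionalGeneration

# Point from_pretrained at the local clone whose pytorch_model.bin was
# replaced (and, for LoRA runs, merged) as described above.
model = MplugOwlForConditionalGeneration.from_pretrained(
    "./mplug-owl-llama-7b-ft",
    torch_dtype=torch.bfloat16,
)
model.eval()
```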