X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl

how to use the ckpt for inference? #80

Closed · yuki9965 closed this issue 1 year ago

yuki9965 commented 1 year ago

During training, checkpoint files such as "checkpoint-500" are generated. How should we use them for inference?

MAGAer13 commented 1 year ago

The most convenient way to do this is to clone the model from the Hugging Face model hub and replace the pytorch_model.bin file.
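
Concretely, that workflow might look like the following sketch (the repo id and checkpoint path here are assumptions; adjust them to your setup). Note that the released weights are sharded into two files plus an index, and the index has to go too, or transformers will keep looking for the shards:

```python
import os
import shutil

from huggingface_hub import snapshot_download

# Fetch the released model into an editable local directory.
local_dir = snapshot_download(
    repo_id="MAGAer13/mplug-owl-llama-7b-ft",  # assumed base model repo
    local_dir="./mplug-owl-llama-7b-ft",
)

# Remove the sharded weights and their index so transformers falls
# back to the single pytorch_model.bin we are about to drop in.
for name in (
    "pytorch_model-00001-of-00002.bin",
    "pytorch_model-00002-of-00002.bin",
    "pytorch_model.bin.index.json",
):
    path = os.path.join(local_dir, name)
    if os.path.exists(path):
        os.remove(path)

# Replace with the fine-tuned weights from the trainer checkpoint
# (checkpoint path is a placeholder).
shutil.copy("output/checkpoint-500/pytorch_model.bin",
            os.path.join(local_dir, "pytorch_model.bin"))
```

After this, pointing from_pretrained at ./mplug-owl-llama-7b-ft should pick up the fine-tuned weights alongside the original config and tokenizer.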

yuki9965 commented 1 year ago

> The most convenient way to do this is to clone the model from the Hugging Face model hub and replace the pytorch_model.bin file.

This doesn't seem to work for me. I fine-tuned mplug-owl-llama-7b-ft with LoRA, deleted pytorch_model-00001-of-00002.bin and pytorch_model-00002-of-00002.bin from mplug-owl-llama-7b-ft, and replaced them with the pytorch_model.bin from checkpoint-500. But it produces this warning and error:

Some weights of the model checkpoint at ./ckpt_500/ were not used when initializing MplugOwlForConditionalGeneration: ['base_model.model.vision_model.encoder.layers.19.mlp.fc1.weight', 'base_model.model.vision_model.encoder.layers.0.input_layernorm.bias', 'base_model.model.language_model.model.layers.10.self_attn.q_proj.weight', 'base_model.model.language_model.model.layers.13.self_attn.v_proj.weight', 'base_model.model.language_model.model.layers.13.input_layernorm.weight', 'base_model.model.language_model.model.layers.29.self_attn.v_proj.lora_B.default.weight', 'base_model.model.language_model.model.layers.15.mlp.up_proj.weight', 'base_model.model.abstractor.encoder.layers.2.crossattention.output.norm2.weight', 'base_model.model.vision_model.encoder.layers.13.mlp.fc2.bias', 'base_model.model.vision_model.encoder.layers.17.post_attention_layernorm.weight', 'base_model.model.language_model.model.layers.27.self_attn.q_proj.weight', 'base_model.model.abstractor.encoder.layers.5.crossattention.attention.value.weight', 'base_model.model.abstractor.encoder.layers.5.crossattention.output.mlp.w3.bias', 'base_model.model.language_model.model.layers.17.mlp.gate_proj.weight', 'base_model.model.abstractor.encoder.layers.2.crossattention.output.norm2.bias', 'base_model.model.abstractor.encoder.layers.5.crossattention.output.mlp.ffn_ln.weight', 'base_model.model.language_model.model.layers.3.self_attn.k_proj.weight', 'base_model.model.vision_model.encoder.layers.9.post_attention_layernorm.bias', 'base_model.model.language_model.model.layers.17.self_attn.q_proj.lora_A.default.weight', 'base_model.model.language_model.model.layers.31.mlp.down_proj.weight', 'base_model.model.vision_model.encoder.layers.8.post_attention_layernorm.bias', 'base_model.model.vision_model.encoder.layers.5.mlp.fc1.weight', 'base_model.model.vision_model.encoder.layers.21.mlp.fc1.bias', 'base_model.model.vision_model.encoder.layers.19.self_attn.query_key_value.weight', 'base_model.model.abstractor.encoder.layers.1.crossattention.output.out_proj.weight', 'base_model.model.vision_model.encoder.layers.0.self_attn.dense.bias', 'base_model.model.language_model.model.layers.0.self_attn.v_proj.weight', 'base_model.model.language_model.model.layers.14.mlp.up_proj.weight', .......

File "/checkpoint/binary/train_package/inference_mdl.py", line 183, in run(args) File "/checkpoint/binary/train_package/inference_mdl.py", line 98, in run model = model.to(DEVICE) File "/root/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1896, in to return super().to(*args, **kwargs) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1145, in to return self._apply(convert) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 797, in _apply module._apply(fn) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 797, in _apply module._apply(fn) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 820, in _apply param_applied = fn(param) File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1143, in convert return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking) NotImplementedError: Cannot copy out of meta tensor; no data!

lambertjf commented 1 year ago

I got those warnings too, but it was still able to run inference; however, the output was random tokens that didn't make any sense.

Why did you close the issue? Did you ever figure out the problem?

FuxiaoLiu commented 1 year ago

Any updates on this?

FuxiaoLiu commented 1 year ago

> I got those warnings too, but it was still able to run inference; however, the output was random tokens that didn't make any sense.
>
> Why did you close the issue? Did you ever figure out the problem?

You can refer to here to solve the problem.

LianghuiGuo commented 11 months ago

> here

Same error! Did you solve it?