2023luckyboy opened 1 year ago
```python
from transformers import AutoTokenizer
from mplug_owl.modeling_mplug_owl import MplugOwlForConditionalGeneration
from mplug_owl.processing_mplug_owl import MplugOwlImageProcessor, MplugOwlProcessor
import torch

pretrained_ckpt = 'MAGAer13/mplug-owl-llama-7b-video'
model = MplugOwlForConditionalGeneration.from_pretrained(
    pretrained_ckpt,
    torch_dtype=torch.bfloat16,
    cache_dir='./',
)
```

Errors as follows:

```
Traceback (most recent call last):
  File "/remote-home/share/VideoBenchmark/Video_Benchmark/VLLM-3metrics/mPLUG-Owl/mplug-owl_infer.py", line 9, in <module>
    model = MplugOwlForConditionalGeneration.from_pretrained(
  File "/root/anaconda3/envs/mplug_owl/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2795, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/root/anaconda3/envs/mplug_owl/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3173, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for MplugOwlForConditionalGeneration:
    size mismatch for abstractor.encoder.layers.0.crossattention.output.mlp.w1.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.0.crossattention.output.mlp.w1.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.0.crossattention.output.mlp.w2.weight: copying a param with shape torch.Size([1024, 2816]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
    size mismatch for abstractor.encoder.layers.0.crossattention.output.mlp.w3.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.0.crossattention.output.mlp.w3.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.0.crossattention.output.mlp.ffn_ln.weight: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.0.crossattention.output.mlp.ffn_ln.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.1.crossattention.output.mlp.w1.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.1.crossattention.output.mlp.w1.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.1.crossattention.output.mlp.w2.weight: copying a param with shape torch.Size([1024, 2816]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
    size mismatch for abstractor.encoder.layers.1.crossattention.output.mlp.w3.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.1.crossattention.output.mlp.w3.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.1.crossattention.output.mlp.ffn_ln.weight: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.1.crossattention.output.mlp.ffn_ln.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.2.crossattention.output.mlp.w1.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.2.crossattention.output.mlp.w1.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.2.crossattention.output.mlp.w2.weight: copying a param with shape torch.Size([1024, 2816]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
    size mismatch for abstractor.encoder.layers.2.crossattention.output.mlp.w3.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.2.crossattention.output.mlp.w3.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.2.crossattention.output.mlp.ffn_ln.weight: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.2.crossattention.output.mlp.ffn_ln.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.3.crossattention.output.mlp.w1.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.3.crossattention.output.mlp.w1.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.3.crossattention.output.mlp.w2.weight: copying a param with shape torch.Size([1024, 2816]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
    size mismatch for abstractor.encoder.layers.3.crossattention.output.mlp.w3.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.3.crossattention.output.mlp.w3.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.3.crossattention.output.mlp.ffn_ln.weight: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.3.crossattention.output.mlp.ffn_ln.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.4.crossattention.output.mlp.w1.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.4.crossattention.output.mlp.w1.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.4.crossattention.output.mlp.w2.weight: copying a param with shape torch.Size([1024, 2816]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
    size mismatch for abstractor.encoder.layers.4.crossattention.output.mlp.w3.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.4.crossattention.output.mlp.w3.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.4.crossattention.output.mlp.ffn_ln.weight: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.4.crossattention.output.mlp.ffn_ln.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.5.crossattention.output.mlp.w1.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.5.crossattention.output.mlp.w1.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.5.crossattention.output.mlp.w2.weight: copying a param with shape torch.Size([1024, 2816]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
    size mismatch for abstractor.encoder.layers.5.crossattention.output.mlp.w3.weight: copying a param with shape torch.Size([2816, 1024]) from checkpoint, the shape in current model is torch.Size([4096, 1024]).
    size mismatch for abstractor.encoder.layers.5.crossattention.output.mlp.w3.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.5.crossattention.output.mlp.ffn_ln.weight: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
    size mismatch for abstractor.encoder.layers.5.crossattention.output.mlp.ffn_ln.bias: copying a param with shape torch.Size([2816]) from checkpoint, the shape in current model is torch.Size([4096]).
```
Have you solved this problem? I have the same question.
@2023luckyboy and @Shame-fight, this checkpoint is the video variant, so it has to be loaded with the video model code: imported from `mplug_owl`, the model builds the visual abstractor's FFN at width 4096, while the video checkpoint stores weights of width 2816, which is exactly the size mismatch in the traceback. Change
`from mplug_owl.modeling_mplug_owl import MplugOwlForConditionalGeneration`
to
`from mplug_owl_video.modeling_mplug_owl import MplugOwlForConditionalGeneration`.
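For anyone hitting the same error, here is a minimal sketch of the corrected snippet. It assumes the `mplug_owl_video` package mirrors the layout of `mplug_owl`, so the processing import moves with it; that processing path is extrapolated from the original snippet, not confirmed above.

```python
# Minimal sketch of the corrected loading code (video variant).
# Assumption: mplug_owl_video mirrors mplug_owl's module layout, so the
# processing import below is extrapolated, not confirmed in this thread.
import torch
from transformers import AutoTokenizer
from mplug_owl_video.modeling_mplug_owl import MplugOwlForConditionalGeneration
from mplug_owl_video.processing_mplug_owl import MplugOwlImageProcessor, MplugOwlProcessor

pretrained_ckpt = 'MAGAer13/mplug-owl-llama-7b-video'

# The video model class builds the abstractor FFN at the checkpoint's width
# (2816 rather than 4096), so the state_dict copies without size mismatches.
model = MplugOwlForConditionalGeneration.from_pretrained(
    pretrained_ckpt,
    torch_dtype=torch.bfloat16,
    cache_dir='./',
)
tokenizer = AutoTokenizer.from_pretrained(pretrained_ckpt)
```

The only required change relative to the original snippet is the package the model class is imported from; the checkpoint name, dtype, and cache directory stay the same.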