mbzuai-oryx / GeoChat

[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
https://mbzuai-oryx.github.io/GeoChat

Trying to set a tensor of shape torch.Size([577, 1024]) in "weight" (which has shape torch.Size([1297, 1024])), this look incorrect. #53

Open zzingok opened 3 months ago

zzingok commented 3 months ago

Loading checkpoint shards: 50%|█████████████████████████████████████████████████████████████████████████████████████████ | 1/2 [00:03<00:03, 3.53s/it]
Traceback (most recent call last):
  File "/home/zzj/GeoChat/geochat_demo.py", line 54, in <module>
    tokenizer, model, image_processor, context_len = load_pretrained_model(args.model_path, args.model_base, model_name, args.load_8bit, args.load_4bit, device=args.device)
  File "/home/zzj/GeoChat/geochat/model/builder.py", line 104, in load_pretrained_model
    model = GeoChatLlamaForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
  File "/home/ps/anaconda3/envs/geochat/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/ps/anaconda3/envs/geochat/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3260, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/ps/anaconda3/envs/geochat/lib/python3.10/site-packages/transformers/modeling_utils.py", line 717, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/home/ps/anaconda3/envs/geochat/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 358, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([577, 1024]) in "weight" (which has shape torch.Size([1297, 1024])), this look incorrect.
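The two sizes in the error are consistent with CLIP-style ViT position-embedding tables built for different input resolutions. This assumes the unnamed "weight" is that table; the traceback does not say which module it belongs to. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def clip_num_positions(image_size: int, patch_size: int) -> int:
    """Rows in a CLIP-style ViT position-embedding table.

    One row per image patch plus one row for the class token.
    Hypothetical helper for illustration; not part of GeoChat or transformers.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + 1


print(clip_num_positions(336, 14))  # 577, the shape found in the checkpoint
print(clip_num_positions(504, 14))  # 1297, the shape the model expects
```

Under that assumption, the checkpoint being loaded carries a table sized for 336-pixel inputs with patch size 14, while the model being built expects one sized for 504-pixel inputs.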

JimmyMa99 commented 2 months ago

Same here.

JimmyMa99 commented 2 months ago

I have solved the problem, though I don't think this is the best way. You may need to add ignore_mismatched_sizes=True in geochat/train/train.py#L802.

Then you will get the following error:

RuntimeError: Error(s) in loading state_dict for Sequential:
        Missing key(s) in state_dict: "0.weight", "0.bias", "2.weight", "2.bias".

hshdbbxjwi commented 2 months ago

Hello, did you fix this error?

JimmyMa99 commented 2 months ago

Hey, I think you need to use the same checkpoint to solve this problem. I extracted the projector from the final checkpoint, and then it worked. However, I found that the authors say they use a different ViT; you can see here.

Hope that helps.
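A rough sketch of what "extracting the projector" could look like. It assumes the projector weights sit in the full checkpoint under keys containing "mm_projector." (the naming used in LLaVA-style code; not confirmed for this checkpoint). The helper name and the placeholder values are hypothetical:

```python
from typing import Any, Dict, Mapping


def extract_projector(state_dict: Mapping[str, Any],
                      marker: str = "mm_projector.") -> Dict[str, Any]:
    """Keep only projector entries and strip everything up to the marker.

    "model.mm_projector.0.weight" becomes "0.weight", which is the key
    format a bare Sequential expects when loading a state dict.
    """
    projector = {}
    for key, value in state_dict.items():
        _, found, suffix = key.partition(marker)
        if found:  # marker is present in this key
            projector[suffix] = value
    return projector


# Placeholder strings stand in for tensors so the sketch runs without a checkpoint.
full_checkpoint = {
    "model.mm_projector.0.weight": "w0",
    "model.mm_projector.0.bias": "b0",
    "model.mm_projector.2.weight": "w2",
    "model.mm_projector.2.bias": "b2",
    "model.layers.0.self_attn.q_proj.weight": "q",
}
print(sorted(extract_projector(full_checkpoint)))
# ['0.bias', '0.weight', '2.bias', '2.weight']
```

The resulting keys match the ones reported as missing in the RuntimeError above. With a real checkpoint, the input mapping would be the loaded state dict rather than this toy dictionary.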

ZhanYang-nwpu commented 2 months ago

Same here. Do you have a good solution?

hshdbbxjwi commented 2 months ago

Thank you

hshdbbxjwi commented 2 months ago

I suggest you do as the readme.md says: run git pull first, then put the weight files downloaded from Hugging Face into a folder structured like weights/geochat. That should work.

ZhanYang-nwpu commented 2 months ago

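That's odd; I did follow exactly these steps. The only difference is that my weights are stored under the GeoChat-7B/ path.

One more thing: I downloaded the clip-vit-large-patch14-336 weights offline, redefined self.vision_tower_name to point to that weight path, and loaded them offline directly. That is the only step that differs; the source code presumably loads them automatically from HF online. Could that be what causes the problem?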
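One way to check whether the offline CLIP copy itself is the odd one out is to read the image and patch size from its configuration and compare the implied table size with the numbers in the error. The sketch below assumes the configuration follows the Hugging Face CLIP layout, with image_size and patch_size under a vision_config entry; the function name is hypothetical and the dictionary is an illustrative stand-in for the parsed config.json:

```python
from typing import Any, Mapping


def expected_positions(config: Mapping[str, Any]) -> int:
    """Position-embedding rows implied by a CLIP-style config.

    Falls back to the top level if there is no "vision_config" entry.
    """
    vision = config.get("vision_config", config)
    side = vision["image_size"] // vision["patch_size"]
    return side * side + 1  # patches plus one class token


# Stand-in for json.load() on the local model's config.json (assumed layout).
local_config = {"vision_config": {"image_size": 336, "patch_size": 14}}
print(expected_positions(local_config))  # 577
```

If the local copy reports 577, it matches the stock 336-pixel model, so the mismatch with 1297 would come from the model definition expecting a larger table rather than from loading offline.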

bubblebrow commented 1 month ago

Same here.

Hello, could you tell me how to specifically resolve this error?