mbzuai-oryx / GeoChat

[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing
https://mbzuai-oryx.github.io/GeoChat
356 stars 23 forks

python geochat_demo.py --model-path *** error #16

GH-W5 closed 3 months ago

GH-W5 commented 3 months ago

Initializing Chat
Traceback (most recent call last):
  File "/disk_sda/**/llava_project/GeoChat/geochat_demo.py", line 53, in <module>
    tokenizer, model, image_processor, context_len = load_pretrained_model(args.model_path, args.model_base, model_name, args.load_8bit, args.load_4bit, device=args.device)
  File "/disk_sda/**/llava_project/GeoChat/geochat/model/builder.py", line 124, in load_pretrained_model
    model = AutoModelForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, **kwargs)
  File "/home/**/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
    return model_class.from_pretrained(
  File "/home/**/anaconda3/envs/llava/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2700, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/disk_sda/**/llava_project/GeoChat/geochat/model/language_model/geochat_llama.py", line 46, in __init__
    self.model = GeoChatLlamaModel(config)
  File "/disk_sda/**/llava_project/GeoChat/geochat/model/language_model/geochat_llama.py", line 38, in __init__
    super(GeoChatLlamaModel, self).__init__(config)
  File "/disk_sda/**/llava_project/GeoChat/geochat/model/geochat_arch.py", line 33, in __init__
    self.vision_tower = build_vision_tower(config, delay_load=True)
  File "/disk_sda/**/llava_project/GeoChat/geochat/model/multimodal_encoder/builder.py", line 9, in build_vision_tower
    return CLIPVisionTower(vision_tower, args=vision_tower_cfg, **kwargs)
  File "/disk_sda/**/llava_project/GeoChat/geochat/model/multimodal_encoder/clip_encoder.py", line 88, in __init__
    self.clip_interpolate_embeddings(image_size=504, patch_size=14)
  File "/disk_sda/**/llava_project/GeoChat/geochat/model/multimodal_encoder/clip_encoder.py", line 25, in clip_interpolate_embeddings
    state_dict = self.vision_tower.vision_model.embeddings.position_embedding.state_dict()
  File "/home/**/anaconda3/envs/llava/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'CLIPVisionTower' object has no attribute 'vision_tower'. Did you mean: 'vision_tower_name'?

trongthuan205 commented 3 months ago

How do you fix this problem?

GH-W5 commented 3 months ago

13
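
For anyone reading the traceback above: it shows `build_vision_tower` being called with `delay_load=True`, and `CLIPVisionTower.__init__` then calling `clip_interpolate_embeddings`, which reads `self.vision_tower`. One plausible reading, and it is only an assumption since the body of `clip_encoder.py` is not shown in this thread, is that the attribute is never assigned when loading is deferred. The toy class below is a hypothetical stand-in (`ToyVisionTower`, `load_model`, `interpolate_embeddings`, and `try_build` are illustrative names, not GeoChat code) that reproduces the same failure pattern in plain Python:

```python
class ToyVisionTower:
    """Hypothetical stand-in for CLIPVisionTower; NOT the GeoChat implementation."""

    def __init__(self, name, delay_load=False):
        self.vision_tower_name = name
        self.is_loaded = False
        if not delay_load:
            self.load_model()
        # Assumption: the interpolation step runs even when loading was
        # deferred, mirroring the call at clip_encoder.py line 88.
        self.interpolate_embeddings()

    def load_model(self):
        # A real tower would load CLIP weights here; a dict is enough
        # to show when the attribute comes into existence.
        self.vision_tower = {"position_embedding": [0.0] * 4}
        self.is_loaded = True

    def interpolate_embeddings(self):
        # Reads self.vision_tower, which only exists after load_model().
        return len(self.vision_tower["position_embedding"])


def try_build(delay_load):
    """Return "ok" on success, or the AttributeError message on failure."""
    try:
        ToyVisionTower("toy-clip", delay_load=delay_load)
        return "ok"
    except AttributeError as exc:
        return str(exc)


if __name__ == "__main__":
    print(try_build(delay_load=False))
    print(try_build(delay_load=True))
```

If that reading holds, the usual remedies would be either to make sure the tower is loaded before the interpolation step runs, or to skip that step while loading is deferred. Which one applies to GeoChat depends on code not shown here.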