showlab / Image2Paragraph

[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
Apache License 2.0
794 stars 54 forks

Unable to load weights from checkpoint file #29

Open andyoung009 opened 1 year ago

andyoung009 commented 1 year ago

Hi, nice work. I followed install.md to build the virtual env with scapy==3.0.0. But when I run the example with python main.py --image_src "examples/3.jpg" --out_image_name "output/3_result.jpg", I get an OSError as follows:

------This is time-consuming, please wait...------

Traceback (most recent call last):

/miniconda3/envs/i2p/lib/python3.8/site-packages/diffusers/models/modeling_utils.py:109 in load_state_dict

  106          if os.path.basename(checkpoint_file) == _add_variant(WEIGHTS_NAME, variant):
  107              return torch.load(checkpoint_file, map_location="cpu")
  108          else:
❱ 109              return safetensors.torch.load_file(checkpoint_file, device="cpu")
  110      except Exception as e:
  111          try:
  112              with open(checkpoint_file) as f:

/miniconda3/envs/i2p/lib/python3.8/site-packages/safetensors/torch.py:261 in load_file

  258      result = {}
  259      with safe_open(filename, framework="pt", device=device) as f:
  260          for k in f.keys():
❱ 261              result[k] = f.get_tensor(k)
  262      return result
  263
  264

AttributeError: module 'torch' has no attribute 'frombuffer'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

/miniconda3/envs/i2p/lib/python3.8/site-packages/diffusers/models/modeling_utils.py:113 in load_state_dict

  110      except Exception as e:
  111          try:
  112              with open(checkpoint_file) as f:
❱ 113                  if f.read().startswith("version"):
  114                      raise OSError(
  115                          "You seem to have cloned a repository without having git-lfs ins
  116                          "git-lfs and run git lfs install followed by git lfs pull in

/miniconda3/envs/i2p/lib/python3.8/codecs.py:322 in decode

  319      def decode(self, input, final=False):
  320          # decode input (taking the buffer into account)
  321          data = self.buffer + input
❱ 322          (result, consumed) = self._buffer_decode(data, self.errors, final)
  323          # keep undecoded input until the next call
  324          self.buffer = data[consumed:]
  325          return result

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 0: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

/LOG/realman/LLM/Image2Paragraph/main.py:23 in <module>

  20
  21      args = parser.parse_args()
  22
❱ 23      processor = ImageTextTransformation(args)
  24      generated_text = processor.image_to_text(args.image_src)
  25      generated_image = processor.text_to_image(generated_text)
  26      ## then text to image

/LOG/realman/LLM/Image2Paragraph/models/image_text_transformation.py:24 in __init__

  21      def __init__(self, args):
  22          # Load your big model here
  23          self.args = args
❱ 24          self.init_models()
  25          self.ref_image = None
  26
  27      def init_models(self):

/LOG/realman/LLM/Image2Paragraph/models/image_text_transformation.py:38 in init_models

  35          self.image_caption_model = ImageCaptioning(device=self.args.image_caption_device
  36          self.dense_caption_model = DenseCaptioning(device=self.args.dense_caption_device
  37          self.gpt_model = ImageToText(openai_key)
❱ 38          self.controlnet_model = TextToImage(device=self.args.contolnet_device)
  39          self.region_semantic_model = RegionSemantic(device=self.args.semantic_segment_de
  40          print('\033[1;32m' + "Model initialization finished!".center(50, '-') + '\033[0m
  41

/LOG/realman/LLM/Image2Paragraph/models/controlnet_model.py:15 in __init__

  12  class TextToImage:
  13      def __init__(self, device):
  14          self.device = device
❱ 15          self.model = self.initialize_model()
  16
  17      def initialize_model(self):
  18          if self.device == 'cpu':

/LOG/realman/LLM/Image2Paragraph/models/controlnet_model.py:22 in initialize_model

  19              self.data_type = torch.float32
  20          else:
  21              self.data_type = torch.float16
❱ 22          controlnet = ControlNetModel.from_pretrained(
  23              "fusing/stable-diffusion-v1-5-controlnet-canny",
  24              torch_dtype=self.data_type,
  25              map_location=self.device, # Add this line

/miniconda3/envs/i2p/lib/python3.8/site-packages/diffusers/models/modeling_utils.py:602 in from_pretrained

  599                  # if device_map is None, load the state dict and move the params from me
  600                  if device_map is None:
  601                      param_device = "cpu"
❱ 602                      state_dict = load_state_dict(model_file, variant=variant)
  603                      model._convert_deprecated_attention_blocks(state_dict)
  604                      # move the params from meta device to cpu
  605                      missing_keys = set(model.state_dict().keys()) - set(state_dict.keys(

/miniconda3/envs/i2p/lib/python3.8/site-packages/diffusers/models/modeling_utils.py:125 in load_state_dict

  122                          "model. Make sure you have saved the model properly."
  123                      ) from e
  124          except (UnicodeDecodeError, ValueError):
❱ 125              raise OSError(
  126                  f"Unable to load weights from checkpoint file for '{checkpoint_file}' "
  127                  f"at '{checkpoint_file}'. "
  128                  "If you tried to load a PyTorch model from a TF 2.0 checkpoint, please s

OSError: Unable to load weights from checkpoint file for '/.cache/huggingface/hub/models--fusing--stable-diffusion-v1-5-controlnet-canny/snapshots/7f2f69197050967007f6bbd23ab5e52f0384162a/diffusion_pytorch_model.safetensors' at '/.cache/huggingface/hub/models--fusing--stable-diffusion-v1-5-controlnet-canny/snapshots/7f2f69197050967007f6bbd23ab5e52f0384162a/diffusion_pytorch_model.safetensors'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

To debug, I built a new virtual env following install.sh, deleted the cached model files, and re-downloaded them by running main.py again, but the error still occurs. How can I fix this? My torch versions are as follows:

torch 1.9.0+cu111
torchaudio 0.9.0
torchvision 0.10.0+cu111

BanyaoSaiikou commented 1 year ago

Hello, I have hit the same error. Did you find out what causes it? I'm running this on a UTF-8 Ubuntu system and suspect that might be the reason.
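On the UTF-8 question: judging from the traceback, the UnicodeDecodeError looks like a side effect rather than the root cause. After safetensors fails, the fallback branch opens the checkpoint in text mode, and a binary file will generally not decode as UTF-8. A minimal sketch of that effect, assuming a made-up four-byte file as a stand-in for the real checkpoint (the helper names `read_as_text` and `demo` are hypothetical, not part of diffusers):

```python
import os
import tempfile


def read_as_text(path):
    """Open a file in text mode as UTF-8, similar to the fallback seen in the traceback."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def demo():
    """Write a few binary bytes to a temp file and try to read them back as text."""
    # Hypothetical stand-in for a binary checkpoint; 0xbc is the byte named in the error.
    fd, path = tempfile.mkstemp(suffix=".safetensors")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\xbc\x00\x01\x02")
        try:
            read_as_text(path)
        except UnicodeDecodeError as err:
            # The decode failure says nothing about why the binary loader failed earlier.
            return str(err)
        return None
    finally:
        os.remove(path)


message = demo()
print(message)
```

If this reading is right, the system locale is not the problem, and the first exception in the chain (the AttributeError about torch.frombuffer) is the one to chase.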

BanyaoSaiikou commented 1 year ago

I upgraded my torch version to 1.10.0+cu111 and it worked:

pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
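The first error in the chain was AttributeError: module 'torch' has no attribute 'frombuffer'. As far as I know, torch.frombuffer first appeared in PyTorch 1.10, which would explain why the upgrade from 1.9.0 helps. A small sketch for checking an environment before running main.py (the helper names `has_frombuffer` and `diagnose` are hypothetical):

```python
import importlib


def has_frombuffer(torch_module):
    """Return True if the given module exposes the frombuffer attribute."""
    return hasattr(torch_module, "frombuffer")


def diagnose(torch_module):
    """Build a one-line report about frombuffer support for the given module."""
    version = getattr(torch_module, "__version__", "unknown")
    if has_frombuffer(torch_module):
        return f"torch {version}: torch.frombuffer is available"
    return (
        f"torch {version}: torch.frombuffer is missing; "
        "loading .safetensors weights will likely fail, consider upgrading torch"
    )


if __name__ == "__main__":
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print(diagnose(torch))
```

Run it inside the same conda env (i2p in the traceback) that main.py uses, so the report reflects the torch build that actually gets imported.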
andyoung009 commented 1 year ago

Thank you. I will try to solve it following your advice. @BanyaoSaiikou

DaMing-sudo commented 11 months ago

@andyoung009 Have you solved this problem?