tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Apache License 2.0

How to get "ip_adapter" in pytorch_model.bin after training (fine-tuning) #246

Closed GoFuuu closed 8 months ago

GoFuuu commented 8 months ago

I used a custom model for fine-tuning (tutorial_train_faceid). The saved checkpoint contains only four files (model.safetensors, optimizer.bin, random_states.pkl, scaler.pt) and no pytorch_model.bin. How can I convert the weights to {"image_proj": image_proj_sd, "ip_adapter": ip_sd}? I tried loading model.safetensors, but it does not contain any keys starting with "adapter_modules".

Bluewind001 commented 8 months ago

same issue

xiaohu2015 commented 8 months ago

refer to https://github.com/tencent-ailab/IP-Adapter/issues/172
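For SD1.5 training scripts, the approach in that issue boils down to splitting the flat training state dict into the two sub-dicts the loader expects. A minimal sketch (the helper name is mine; the "image_proj_model." / "adapter_modules." prefixes are assumptions based on how the tutorial training scripts wrap the modules):

```python
def split_training_state_dict(sd):
    """Split a flat training state dict into the {"image_proj", "ip_adapter"}
    layout that IPAdapter.load_ip_adapter() expects."""
    image_proj_sd, ip_sd = {}, {}
    for key, value in sd.items():
        if key.startswith("image_proj_model."):
            image_proj_sd[key[len("image_proj_model."):]] = value
        elif key.startswith("adapter_modules."):
            ip_sd[key[len("adapter_modules."):]] = value
        # anything else (e.g. "unet.*") is intentionally dropped
    return {"image_proj": image_proj_sd, "ip_adapter": ip_sd}

# Usage (paths are illustrative):
# import torch
# sd = torch.load("checkpoint-50000/pytorch_model.bin", map_location="cpu")
# torch.save(split_training_state_dict(sd), "ip_adapter.bin")
```

This only works if the checkpoint still has the module-prefixed key names, i.e. it was saved as pytorch_model.bin rather than model.safetensors.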

Bluewind001 commented 8 months ago

Thanks, I tried scarbain's code and got a new 'adapter.bin'. But when I use it with the IPAdapterFaceID class, I get missing-key and size-mismatch errors. The difference is that I use FaceID while they used SDXL.

Traceback (most recent call last):
  File "script.py", line 372, in <module>
    gen_t2i_img()
  File "script.py", line 209, in gen_t2i_img
    ip_model = IPAdapterFaceID(pipe, ip_ckpt, device)
  File "/cfs_train_data/qingfeng/sd_face_project/IP-Adapter/ip_adapter/ip_adapter_faceid.py", line 134, in __init__
    self.load_ip_adapter()
  File "/cfs_train_data/qingfeng/sd_face_project/IP-Adapter/ip_adapter/ip_adapter_faceid.py", line 180, in load_ip_adapter
    ip_layers.load_state_dict(state_dict["ip_adapter"])
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ModuleList:
        Missing key(s) in state_dict: "0.to_q_lora.down.weight", "0.to_q_lora.up.weight", "0.to_k_lora.down.weight", "0.to_k_lora.up.weight", "0.to_v_lora.down.weight", "0.to_v_lora.up.weight", "0.to_out_lora.down.weight", "0.to_out_lora.up.weight", ..., "31.to_v_lora.down.weight", "31.to_v_lora.up.weight", "31.to_out_lora.down.weight", "31.to_out_lora.up.weight".
        size mismatch for 1.to_k_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
        size mismatch for 1.to_v_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
        size mismatch for 3.to_k_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
        size mismatch for 3.to_v_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
        size mismatch for 5.to_k_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).
        size mismatch for 5.to_v_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).
        size mismatch for 9.to_k_ip.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 9.to_v_ip.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 11.to_k_ip.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 11.to_v_ip.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 13.to_k_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 13.to_v_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 15.to_k_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 15.to_v_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 17.to_k_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 17.to_v_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([1280, 768]).
        size mismatch for 19.to_k_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).
        size mismatch for 19.to_v_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).
        size mismatch for 21.to_k_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).
        size mismatch for 21.to_v_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).
        size mismatch for 23.to_k_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).
        size mismatch for 23.to_v_ip.weight: copying a param with shape torch.Size([320, 768]) from checkpoint, the shape in current model is torch.Size([640, 768]).
        size mismatch for 25.to_k_ip.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
        size mismatch for 25.to_v_ip.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
        size mismatch for 27.to_k_ip.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
        size mismatch for 27.to_v_ip.weight: copying a param with shape torch.Size([640, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
        size mismatch for 29.to_k_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
        size mismatch for 29.to_v_ip.weight: copying a param with shape torch.Size([1280, 768]) from checkpoint, the shape in current model is torch.Size([320, 768]).
GoFuuu commented 8 months ago

[Quotes Bluewind001's comment above, including the same traceback.]

Same issue. I also found that the "ip_adapter" dict obtained from the safetensors file by this code has only 32 keys, while the "ip_adapter" dict in ip-adapter-faceid_sd15.bin from Hugging Face has 288 keys.
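Those counts are consistent with the LoRA tensors being dropped. Assuming the SD1.5 FaceID layout (32 attention processors, matching the ModuleList indices 0-31 in the traceback: 16 self-attention processors with 8 LoRA tensors each, and 16 cross-attention processors that add to_k_ip/to_v_ip on top), the arithmetic works out:

```python
# Hypothetical accounting for the key counts mentioned above, assuming the
# SD1.5 FaceID layout: 16 self-attn processors (8 LoRA tensors each) and
# 16 cross-attn processors (8 LoRA tensors + to_k_ip + to_v_ip each).
lora_tensors = 8           # q/k/v/out projections x (up, down)
ip_tensors = 2             # to_k_ip, to_v_ip
full = 16 * lora_tensors + 16 * (lora_tensors + ip_tensors)
ip_only = 16 * ip_tensors  # what survives if only the *_ip weights are kept
print(full, ip_only)       # 288 32
```

So a 32-key checkpoint carries only the to_k_ip/to_v_ip weights, which matches the traceback: every missing key is a *_lora.* tensor.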

Bluewind001 commented 8 months ago

This error happens because of the accelerate package. I found this solution from GabPrato in https://github.com/huggingface/transformers/issues/27293:

Calling accelerator.save_state(dir, safe_serialization=False) works

GoFuuu commented 8 months ago

This error happens because of the accelerate package. I found this solution from GabPrato in huggingface/transformers#27293:

Calling accelerator.save_state(dir, safe_serialization=False) works

Thanks a lot, it works

hepytobecool commented 2 months ago

This error happens because of the accelerate package. I found this solution from GabPrato in huggingface/transformers#27293:

Calling accelerator.save_state(dir, safe_serialization=False) works

I did this after I trained an sdxl_faceid version, but I had to change ip_layers.load_state_dict(state_dict["ip_adapter"]) to ip_layers.load_state_dict(state_dict["ip_adapter"], False); otherwise it raises: `RuntimeError: Error(s) in loading state_dict for ModuleList: Missing key(s) in state_dict: "0.to_q_lora.down.weight", "0.to_q_lora.up.weight", "0.to_k_lora.down.weight", ........ "31.to_v_lora.down.weight", "31.to_v_lora.up.weight", "31.to_out_lora.down.weight", "31.to_out_lora.up.weight".`

Do you have the same issue? I am not sure whether ip_layers.load_state_dict(state_dict["ip_adapter"], False) is right.
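For what it's worth, passing strict=False (the second positional argument) only suppresses the error: any key missing from the checkpoint keeps its current, randomly initialized value, so the LoRA weights would not actually be loaded. A toy sketch of the behavior:

```python
import torch
import torch.nn as nn

# Toy module standing in for ip_layers: "bias" plays the role of the
# missing *_lora.* weights.
layer = nn.Linear(4, 4)
checkpoint = {"weight": torch.ones(4, 4)}  # no "bias" entry

# strict=False skips the missing key instead of raising a RuntimeError;
# load_state_dict reports what it skipped.
result = layer.load_state_dict(checkpoint, strict=False)
print(result.missing_keys)  # ['bias'] -- skipped, keeps its current value
```

So if the checkpoint really should contain the LoRA tensors, strict=False hides the problem rather than fixing it.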

hosjiu1702 commented 2 months ago

I think accelerator.save_state() should be called with the argument safe_serialization=False by default to avoid this issue, at least until the Accelerate team fixes it. @xiaohu2015