tencent-ailab / IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
Apache License 2.0

Resampler fail to load checkpoint (tutorial_train_plus) #233

Open JackAILab opened 6 months ago

JackAILab commented 6 months ago

It seems that the pre-trained models ip-adapter-faceid-plus_sd15.bin and ip-adapter-faceid-plusv2_sd15.bin released on Hugging Face do not use the new Resampler structure (defined in IP-Adapter/tutorial_train_plus.py, Line 308). I say this because: (1) I re-downloaded the plusv2_sd15 model and tried to load it many times, and both image_proj and ip_adapter failed to load, and some of the weight structures are not defined in the plusv2_sd15 pre-trained model; (2) I checked the inference code (ipa_faceID_Plus_demo.py), which uses ProjPlusModel (ip_adapter_faceid.py, Line 279) instead of the new Resampler structure. The ProjPlusModel used by the inference code differs somewhat from both Resampler and the earlier MLPProjModel (IP-Adapter/ip_adapter/ip_adapter_faceid.py, Line 64).

I would like to ask: does the new Resampler structure serve any special purpose? Will you update the Hugging Face release with a model pre-trained on this structure? Thank you so much for your great work!

Or maybe I am using the wrong code? Could you please correct me? Why does this model fail to load with the tutorial_train_plus script, and why does it not pass the check "if "latents" in state_dict["image_proj"] and "latents" in self.image_proj_model.state_dict():"? Thank you very much!

xiaohu2015 commented 6 months ago

You should refer to https://github.com/tencent-ailab/IP-Adapter/blob/main/ip_adapter/ip_adapter_faceid.py
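
One way to see which projection module a checkpoint was saved from is to compare its parameter names with the names the module expects. The following is only a minimal sketch, not code from the repository: diff_keys and has_latents are hypothetical helpers, and the key names in the example are made up for illustration. It assumes the .bin file loads into a dict with "image_proj" and "ip_adapter" entries, as the check quoted in the question suggests.

```python
from typing import Any, Dict, Iterable, List, Mapping


def diff_keys(checkpoint_keys: Iterable[str], model_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Compare parameter names in a checkpoint with those a module expects.

    Hypothetical helper: it only works on names, so it needs no tensors.
    """
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        "missing_in_checkpoint": sorted(model - ckpt),
        "unexpected_in_checkpoint": sorted(ckpt - model),
        "shared": sorted(ckpt & model),
    }


def has_latents(image_proj_state: Mapping[str, Any], model_state: Mapping[str, Any]) -> bool:
    """Mirror the condition quoted in the issue: both sides must hold "latents"."""
    return "latents" in image_proj_state and "latents" in model_state


if __name__ == "__main__":
    # With PyTorch installed, the real inputs would come from something like:
    #   state_dict = torch.load("ip-adapter-faceid-plusv2_sd15.bin", map_location="cpu")
    #   ckpt_keys = state_dict["image_proj"].keys()
    #   model_keys = image_proj_model.state_dict().keys()
    # The names below are illustrative placeholders, not the real parameter names.
    ckpt_keys = ["proj.0.weight", "norm.weight"]
    model_keys = ["latents", "proj_in.weight", "norm.weight"]
    for name, keys in diff_keys(ckpt_keys, model_keys).items():
        print(name, keys)
```

If "latents" appears only on the model side, the checkpoint was most likely not saved from a Resampler-style module, and the projection class in ip_adapter_faceid.py is the one to instantiate before loading.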