Kwai-Kolors / Kolors

Kolors Team
Apache License 2.0
FaceID Error: RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTensor) should be the same #111

Open zhihui96 opened 3 months ago

zhihui96 commented 3 months ago

Has anyone run into the following problem when running the FaceID Adapter?

Traceback (most recent call last):
  File "/lustre/wzh/git_repo/Kolors/ipadapter_FaceID/sample_ipadapter_faceid_plus.py", line 121, in <module>
    fire.Fire(infer)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 143, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 477, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/usr/local/lib/python3.10/dist-packages/fire/core.py", line 693, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/lustre/wzh/git_repo/Kolors/ipadapter_FaceID/sample_ipadapter_faceid_plus.py", line 105, in infer
    image = pipe(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/kolors-0.1-py3.10.egg/kolors/pipelines/pipeline_stable_diffusion_xl_chatglm_256_ipadapter_FaceID.py", line 827, in __call__
    image_prompt_embeds, uncond_image_prompt_embeds = self.get_fused_face_embedds(
  File "/usr/local/lib/python3.10/dist-packages/kolors-0.1-py3.10.egg/kolors/pipelines/pipeline_stable_diffusion_xl_chatglm_256_ipadapter_FaceID.py", line 630, in get_fused_face_embedds
    face_clip_embeds = self.get_clip_feat(face_crop_image, device)
  File "/usr/local/lib/python3.10/dist-packages/kolors-0.1-py3.10.egg/kolors/pipelines/pipeline_stable_diffusion_xl_chatglm_256_ipadapter_FaceID.py", line 622, in get_clip_feat
    face_clip_embeddings = self.face_clip_encoder(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/clip/modeling_clip.py", line 1304, in forward
    vision_outputs = self.vision_model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/clip/modeling_clip.py", line 859, in forward
    hidden_states = self.embeddings(pixel_values)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/clip/modeling_clip.py", line 195, in forward
    patch_embeds = self.patch_embedding(pixel_values)  # shape = [*, width, grid, grid]
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.cuda.FloatTensor) should be the same
zhihui96 commented 3 months ago

In ipadapter_FaceID/sample_ipadapter_faceid_plus.py, line 68, the original code is:

clip_image_encoder = CLIPVisionModelWithProjection.from_pretrained(f'{ip_model_dir}/clip-vit-large-patch14-336', ignore_mismatched_sizes=True)

I changed it to:

clip_image_encoder = CLIPVisionModelWithProjection.from_pretrained(f'{ip_model_dir}/clip-vit-large-patch14-336', ignore_mismatched_sizes=True).to(dtype=torch.float16)
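
For anyone hitting the same error, here is a minimal sketch of what the change above does. The traceback shows a convolution receiving a half-precision input while its weights are still float32, and casting the encoder with .to(dtype=...) makes the two match. This is an illustration under assumptions, not the Kolors code: the small Conv2d stands in for the CLIP patch embedding, cast_module_to_input_dtype is a hypothetical helper, and float64 is used in place of float16 so the snippet runs on CPU without a GPU.

```python
import torch
from torch import nn


def cast_module_to_input_dtype(module: nn.Module, example_input: torch.Tensor) -> nn.Module:
    """Hypothetical helper: cast a module's floating-point weights to the input's dtype."""
    return module.to(dtype=example_input.dtype)


# Stand-in for the CLIP patch-embedding convolution (assumption, not the real model).
# Freshly created modules hold float32 weights, as does a model loaded without a dtype.
patch_embedding = nn.Conv2d(3, 8, kernel_size=2, stride=2)

# Stand-in for the image tensor the pipeline passes in. The issue involves float16 on
# CUDA; float64 is used here only so the mismatch can be reproduced on CPU.
pixel_values = torch.randn(1, 3, 4, 4, dtype=torch.float64)

# Input dtype and weight dtype differ, so the convolution refuses to run.
try:
    patch_embedding(pixel_values)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

# Same idea as appending .to(dtype=torch.float16) to from_pretrained(...):
# cast the weights so they match the input dtype, then call the module again.
patch_embedding = cast_module_to_input_dtype(patch_embedding, pixel_values)
patch_embeds = patch_embedding(pixel_values)
```

In the real script the cast goes the other way round in spirit (the encoder is moved to float16 because the pipeline feeds it float16 images), but the rule is the same: the input tensor and the module weights must share one dtype.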