OpenBMB / MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
Apache License 2.0

Problem encountered when finetuning with LoRA: Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same #318

Open XuanRen4470 opened 2 days ago

XuanRen4470 commented 2 days ago

是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?

该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?

当前行为 | Current Behavior

When running LoRA finetuning,

the error `Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same` is raised while computing `vision_embedding = self.vpm.forward_features(pixel_value.unsqueeze(0).type(dtype))` inside `get_vision_embedding`:

def get_vision_embedding(self, pixel_values):
    res = []
    dtype = self.vpm.pos_embed.data.dtype
    for pixel_value in pixel_values:
        H, W = pixel_value.shape[-2:]
        tgt_size = (
            math.ceil(H / self.vpm.patch_embed.patch_size[0]),
            math.ceil(W / self.vpm.patch_embed.patch_size[0]),
        )
        vision_embedding = self.vpm.forward_features(pixel_value.unsqueeze(0).type(dtype))
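Note that the snippet casts `pixel_value` to the vision tower's dtype but not to its device, while the error complains about a CUDA input hitting a CPU weight. A hedged sketch of a cast that matches both dtype and device at once (an illustration only, not the repository's actual fix; `to_module_like` is a hypothetical helper):

```python
import torch
import torch.nn as nn

def to_module_like(x: torch.Tensor, module: nn.Module) -> torch.Tensor:
    """Cast a tensor to the dtype *and* device of a module's parameters."""
    p = next(module.parameters())
    return x.to(device=p.device, dtype=p.dtype)

# Stand-in for self.vpm: a layer whose parameters are double precision,
# so the input's fp32 dtype would not match without the cast.
vpm = nn.Linear(16, 8).double()
pixel = torch.randn(2, 16)            # fp32, CPU
out = vpm(to_module_like(pixel, vpm))
print(out.dtype)                      # torch.float64
```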

The full traceback follows:


Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same
  File "/workspace/LLaMA-Efficient-Tuning/Llama-2-13b-chat-hf/modules/transformers_modules/openbmb/MiniCPM-V-2/3d0e971f6ef6e2029f71202f98f9ba886925f2e4/modeling_minicpmv.py", line 82, in get_vision_embedding
    vision_embedding = self.vpm.forward_features(pixel_value.unsqueeze(0).type(dtype))
  File "/workspace/LLaMA-Efficient-Tuning/Llama-2-13b-chat-hf/modules/transformers_modules/openbmb/MiniCPM-V-2/3d0e971f6ef6e2029f71202f98f9ba886925f2e4/modeling_minicpmv.py", line 94, in get_vllm_embedding
    vision_hidden_states.append(self.get_vision_embedding(pixel_values))
  File "/workspace/LLaMA-Efficient-Tuning/Llama-2-13b-chat-hf/modules/transformers_modules/openbmb/MiniCPM-V-2/3d0e971f6ef6e2029f71202f98f9ba886925f2e4/modeling_minicpmv.py", line 141, in forward
    vllm_embedding, vision_hidden_states = self.get_vllm_embedding(data)
  File "/workspace/MiniCPM-V-main/finetune/trainer.py", line 30, in compute_loss
    outputs = self.model.base_model(data = inputs, use_cache=False)
  File "/workspace/MiniCPM-V-main/finetune/trainer.py", line 206, in training_step
    loss = self.compute_loss(model, inputs)
  File "/workspace/MiniCPM-V-main/finetune/finetune.py", line 318, in train
    trainer.train()
  File "/workspace/MiniCPM-V-main/finetune/finetune.py", line 328, in <module>
    train()
RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same

Does anyone know what is going on here?
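For what it's worth, this particular message usually indicates a device mismatch rather than a dtype one: `torch.cuda.HalfTensor` is fp16 on the GPU, while `torch.HalfTensor` (no `cuda`) is fp16 on the CPU, i.e. the vision tower's conv weights never made it onto the GPU. A minimal stand-alone sketch of the mismatch and the fix (an assumption about the cause, not MiniCPM-V code):

```python
import torch
import torch.nn as nn

# Hypothetical minimal reproduction: both tensors in the message are fp16;
# the difference is the device. A CUDA input fed into a conv whose weights
# are still on the CPU raises exactly this RuntimeError.
conv = nn.Conv2d(3, 8, kernel_size=3)     # weights start on the CPU
x = torch.randn(1, 3, 16, 16)

if torch.cuda.is_available():
    x = x.cuda().half()                   # input: torch.cuda.HalfTensor
    conv = conv.half()                    # weights: torch.HalfTensor (CPU!)
    try:
        conv(x)
    except RuntimeError as e:
        print(e)                          # the error from the issue
    conv = conv.cuda()                    # fix: move the weights to the GPU
    print(tuple(conv(x).shape))           # (1, 8, 14, 14)
else:
    # Without a GPU the mismatch cannot occur; everything stays on the CPU.
    print(tuple(conv(x).shape))           # (1, 8, 14, 14)
```

If this is indeed the cause, moving the vision tower (e.g. `model.vpm.cuda()`, or the whole model via `model.to("cuda")`) before training should resolve it; inspecting `next(model.vpm.parameters()).device` would confirm where the weights actually live.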

期望行为 | Expected Behavior

No response

复现方法 | Steps To Reproduce

No response

运行环境 | Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):

备注 | Anything else?

No response