OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Apache License 2.0
12.75k stars 893 forks source link

[BUG] <在funetune/dataset.py中报告image start token != image end tokens的错误> #587

Closed KeepFaithMe closed 1 month ago

KeepFaithMe commented 2 months ago

是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?

该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?

当前行为 | Current Behavior

在执行funetune_lora.py时,出现image start token != image end tokens的错误,我将与报错相关的变量打印出来,其结果如下:image_start_tokens: tensor([], dtype=torch.int64) image_end_tokens: tensor([66]) image_start_tokens是空,它的结果来源于这一段代码:“image_start_tokens = torch.where(ids == tokenizer.im_start_id)[0]”,也就是说ids中不存在与tokenizer.im_start_id相等的值。 ids,image_start_tokens ,image_end_tokens三个变量的结果如下图: 1726658289123 数据的组织方式如下: 1726658383833

期望行为 | Expected Behavior

解决上述bug

复现方法 | Steps To Reproduce

配置好基本环境,按照图2的数据组织方式即可复现,注意,测试数据确实只包含了一张图片,但我也测试过多张图片,也不对。

运行环境 | Environment

- OS:Ubuntu 22.04
- Python:3.10
- Transformers: 4.40.0
- PyTorch:2.1.2
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):11.8

备注 | Anything else?

No response

KeepFaithMe commented 2 months ago

补充一下,微调使用的模型是MiniCPM-V-int4

XHB-ZMM commented 2 months ago

如果你的对话中,除了第一个user有\n,同一个对话的某些user也有\n占位符,就会出现这个错误

natsoe7 commented 1 month ago

lapyae

On Wed, Oct 9, 2024, 9:23 AM qianyu chen @.***> wrote:

Closed #587 https://github.com/OpenBMB/MiniCPM-V/issues/587 as completed.

— Reply to this email directly, view it on GitHub https://github.com/OpenBMB/MiniCPM-V/issues/587#event-14565044161, or unsubscribe https://github.com/notifications/unsubscribe-auth/AMIIOEH7TKVJCIDP56WNTATZ2SLBHAVCNFSM6AAAAABONOURHWVHI2DSMVQWIX3LMV45UABCJFZXG5LFIV3GK3TUJZXXI2LGNFRWC5DJN5XDWMJUGU3DKMBUGQYTMMI . You are receiving this because you are subscribed to this thread.Message ID: @.***>