Coobiw / MPP-LLaVA

Personal Project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-training-like MLLM on RTX3090/4090 24GB.

The AI cannot understand images #24

Closed: liuling19941216 closed this 2 months ago

liuling19941216 commented 2 months ago

1. Step one: upload an image and ask a question; the page reports an error: [image failed to upload: 页面报错.png]

2. Checking the backend, it shows: [image failed to upload: cmd.png]

Coobiw commented 2 months ago

Hi, your images didn't get uploaded.

liuling19941216 commented 2 months ago

cmd: [image failed to upload: 页面报错.png]

liuling19941216 commented 2 months ago

[image failed to upload: 页面报错.png]

liuling19941216 commented 2 months ago

[image failed to upload: 效果图2.png]

Coobiw commented 2 months ago

I have two questions to confirm:

1. Your run command is:

       python webui_demo.py --model-type qwen7b_chat -c lavis/output/pp_7b_video/sft_video/global_step2005/unfreeze_llm_model.pth --llm_device_map auto

   What did you use for this checkpoint, `lavis/output/pp_7b_video/sft_video/global_step2005/unfreeze_llm_model.pth`? I haven't gotten around to uploading that checkpoint yet, so did you download some other checkpoint and use it here? (A quick way to inspect what the file actually contains is sketched below.)

2. What image did you upload? I can try it on my side and see what comes out.
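
A minimal way to sanity-check such a `.pth` file is to load it and list a few parameter names. This is only a sketch: the idea that the state dict might be nested under a `"model"` key is an assumption, not something pipemodel2pth.py guarantees.

```python
import torch

# Load the converted checkpoint on CPU so no GPU is required.
ckpt = torch.load(
    "lavis/output/pp_7b_video/sft_video/global_step2005/unfreeze_llm_model.pth",
    map_location="cpu",
)

# The file may hold the state dict directly or nest it under "model"
# (assumption; adjust to whatever the conversion script actually writes).
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt

# Print the first few parameter names and shapes to confirm the LLM and
# projection-layer weights are actually present.
for i, (name, value) in enumerate(state_dict.items()):
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value)
    print(name, shape)
    if i >= 9:
        break
```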

Coobiw commented 2 months ago

Almost all of the images failed to upload... When uploading, wait a moment and submit only after the image actually shows up.

liuling19941216 commented 2 months ago

1. I generated it with the command you wrote; isn't that how it's produced? For the SFT stage (which requires converting the projection layer and all of the LLM's parameters):

       python pipemodel2pth.py --ckpt-dir lavis/output/pp_7b_video/sft_video/global_step2005

   After conversion, the model file is stored under ckpt_dir, named unfreeze_llm_model.pth.

2. The image is exactly your MiniGPT4Qwen-master\examples\minigpt4_image_3.jpg.
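
(For context: conceptually, this conversion gathers DeepSpeed's per-layer pipeline checkpoint files into a single state dict. The sketch below is illustrative only, not the repo's actual pipemodel2pth.py; the `layer_XX-model_states.pt` naming follows DeepSpeed's pipeline-checkpoint convention, and the re-keying of parameters is an assumption.)

```python
import glob
import os
import re
import torch

# Directory produced by DeepSpeed during SFT (from the command above).
ckpt_dir = "lavis/output/pp_7b_video/sft_video/global_step2005"
merged = {}

# DeepSpeed's PipelineModule saves one file per pipeline layer,
# named layer_00-model_states.pt, layer_01-model_states.pt, ...
for path in sorted(glob.glob(os.path.join(ckpt_dir, "layer_*-model_states.pt"))):
    layer_idx = int(re.search(r"layer_(\d+)", os.path.basename(path)).group(1))
    layer_state = torch.load(path, map_location="cpu")
    # Re-key by layer index; the real script maps these back onto the
    # full model's parameter names (the mapping here is illustrative only).
    for name, tensor in layer_state.items():
        merged[f"layer_{layer_idx}.{name}"] = tensor

torch.save(merged, os.path.join(ckpt_dir, "unfreeze_llm_model.pth"))
```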

liuling19941216 commented 2 months ago

Page error; a screenshot of the MiniGPT4 + Qwen (通义千问) result.

Coobiw commented 2 months ago

Did you run the weight conversion after training the model yourself? This script converts DeepSpeed pipeline-parallel training weights into a .pth file after training; calling it directly without having trained first is incorrect.
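
As a sketch of a simple guard, one could verify that the step directory actually contains DeepSpeed's per-layer checkpoint files before converting (the file pattern follows DeepSpeed's pipeline-checkpoint naming; the path is the one from this thread):

```python
import glob
import os
import sys

# Path from the conversion command discussed above.
ckpt_dir = "lavis/output/pp_7b_video/sft_video/global_step2005"

# DeepSpeed pipeline training leaves per-layer files in the step directory;
# if none exist, there is nothing valid to convert.
layer_files = glob.glob(os.path.join(ckpt_dir, "layer_*-model_states.pt"))
if not layer_files:
    sys.exit(f"No DeepSpeed pipeline checkpoints in {ckpt_dir}: "
             "train first, then run pipemodel2pth.py.")
print(f"Found {len(layer_files)} layer checkpoint files; conversion should be safe.")
```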

Here is the case I just tested:

[screenshots of the test case attached]

Coobiw commented 2 months ago

Solved in WeChat. I'll close this issue.