Deployment succeeded, but why do the image recognition results look so absurd?

Vision-CAIR / MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

BSD 3-Clause "New" or "Revised" License

Deployment succeeded, but why do the image recognition results look so absurd? #146

Open kimiller opened 1 year ago

kimiller commented 1 year ago

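Why are the recognition results so absurd? I'm a beginner who deployed this by following a tutorial, and I don't know what went wrong.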

SnakeHacker commented 1 year ago

silent780 commented 1 year ago

Same issue. The web demo did a good job, but my local model's output looks like garbage. My card is a 4090 and the model is 13B. Is this normal, or is something wrong with my setup?

Korner83 commented 1 year ago

You might have loaded the wrong weights. I would advise double-checking that the settings in the yml files point to the proper models. Have you followed the steps in the description?

kimiller commented 1 year ago

I didn't follow the steps in the description. I used the commands below to deploy it on Google Colab, and I don't really know what they do. I think I need to redeploy it using another method. These are the commands I used:

!git clone -b dev https://github.com/camenduru/minigpt4
!wget https://huggingface.co/ckpt/minigpt4/resolve/main/minigpt4.pth -O /content/minigpt4/checkpoint.pth
!wget https://huggingface.co/ckpt/minigpt4/resolve/main/blip2_pretrained_flant5xxl.pth -O /content/minigpt4/blip2_pretrained_flant5xxl.pth

!pip install -q salesforce-lavis
!pip install -q bitsandbytes
!pip install -q accelerate
!pip install -q gradio==3.27.0
!pip install -q git+https://github.com/huggingface/transformers.git -U

%cd /content/minigpt4
!python app.py

kimiller commented 1 year ago

@SnakeHacker Hahaha, that cracked me up.

kimiller commented 1 year ago

@silent780 I'm not sure whether that is normal. Did you deploy it step by step as the tutorial describes? I used another method to deploy it, so I am going to redeploy it step by step following the tutorial.

Korner83 commented 1 year ago

@kimiller Follow the steps. I think you should call demo.py rather than app.py, with these parameters: --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0

On Windows: python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0

kimiller commented 1 year ago

@Korner83 Thank you!!! Let me try it.

SnakeHacker commented 1 year ago

The model started swearing at people. It has picked up the "national essence" (slang for Chinese profanity)...

thiner commented 1 year ago

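My demo also produced this kind of nonsense the first time. I later found that when editing minigpt4/configs/models/minigpt4.yaml, I had mistakenly pointed llama_model at the Vicuna delta files. Your situation may not be the same as mine, but I suggest carefully checking that your config files are correct.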
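
A small pre-flight check can catch this kind of misconfiguration before the demo is launched. The sketch below is only an illustration. `extract_llama_model` and `check_llama_model` are hypothetical helpers, not part of MiniGPT-4. The rule that a path containing "delta" points at delta weights is an assumption based on how the Vicuna delta releases are commonly named (for example vicuna-13b-delta-v0). It uses a regular expression instead of a YAML parser only to stay dependency-free.

```python
import re
from pathlib import Path
from typing import List, Optional


def extract_llama_model(yaml_text: str) -> Optional[str]:
    """Return the value of the first `llama_model:` key, or None if absent."""
    match = re.search(
        r"^[ \t]*llama_model[ \t]*:[ \t]*(.+?)[ \t]*$",
        yaml_text,
        flags=re.MULTILINE,
    )
    if match is None:
        return None
    # Drop a trailing "# comment" and surrounding quotes, if any.
    value = match.group(1).split(" #", 1)[0].strip()
    return value.strip("\"'")


def check_llama_model(yaml_text: str) -> List[str]:
    """Return a list of human-readable problems (empty means nothing suspicious)."""
    path = extract_llama_model(yaml_text)
    if path is None:
        return ["no llama_model key found"]
    problems = []
    # Assumption: delta releases carry "delta" somewhere in the path.
    if "delta" in path.lower():
        problems.append(f"{path!r} looks like delta weights, not merged weights")
    if not Path(path).is_dir():
        problems.append(f"{path!r} is not an existing directory")
    return problems


if __name__ == "__main__":
    # Config path taken from the comment above; adjust to your checkout.
    config_file = Path("minigpt4/configs/models/minigpt4.yaml")
    if config_file.is_file():
        for problem in check_llama_model(config_file.read_text(encoding="utf-8")):
            print("WARNING:", problem)
    else:
        print(f"{config_file} not found; run this from the repository root")
```

An empty result does not prove the weights are correct. It only rules out the two mistakes checked here.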

kimiller commented 1 year ago

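@thiner I tested the MiniGPT-4 online demo, and it also talks nonsense at times. The image recognition capability is probably not that mature yet, but it is already pretty good.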

chenchuntan commented 1 year ago

Is there a minimum hardware requirement? My computer isn't up to it: the CPU is an i5 and the GPU is a 1060.

youyuanrsq commented 1 year ago

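Is there a minimum hardware requirement? My computer isn't up to it: the CPU is an i5 and the GPU is a 1060.

Running the 7B model requires at least 12 GB of VRAM.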
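
As a rough sanity check on that figure, the memory taken by the language model weights alone can be estimated from the parameter count and the precision. This is a back-of-the-envelope sketch, not a measurement. `weight_memory_gib` is a hypothetical helper, and the estimate ignores the vision encoder, activations, and framework overhead, which presumably account for the rest of the budget.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1024**3


if __name__ == "__main__":
    # 7e9 parameters stands in for the "7B" model mentioned above.
    for bits in (16, 8):
        print(f"7B weights at {bits}-bit: {weight_memory_gib(7e9, bits):.2f} GiB")
```

By this estimate the weights alone are already larger than the memory of a GTX 1060, which shipped with 3 GB or 6 GB.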

YiyangZhou commented 1 year ago

I have tried many models and found that mPLUG-Owl https://github.com/X-PLUG/mPLUG-Owl seems to have much stronger visual capabilities than the others, with more accurate image recognition

kimiller commented 1 year ago

@YiyangZhou thank you, tested this one, but it seems to be a copy of MiniGPT-4.

vateye commented 1 year ago

@YiyangZhou thank you, tested this one, but it seems to be a copy of MiniGPT-4.

No, I have tested it. It seems to perform better than miniGPT4 with the smaller model (ViT-L and LLaMA 7B only).

kimiller commented 1 year ago

@YiyangZhou thank you, tested this one, but it seems to be a copy of MiniGPT-4.

No, I have tested it. It seems to perform better than miniGPT4 with the smaller model (ViT-L and LLaMA 7B only).

@vateye Yes, I agree that it performs better than MiniGPT-4. It looks like they just made some optimizations, with no essential changes.

YuzhouPeng commented 1 year ago

https://github.com/X-PLUG/mPLUG-Owl

Could you please tell me which models you have tried? I may want to try them as well. Thank you so much!

YiyangZhou commented 1 year ago

I have tried many models and found that mPLUG-Owl https://github.com/X-PLUG/mPLUG-Owl seems to have much stronger visual capabilities than the others, with more accurate image recognition

Could you please tell me which models you have tried? I may want to try them as well. Thank you so much!

minigpt4: https://github.com/Vision-CAIR/MiniGPT-4
llava: https://github.com/haotian-liu/LLaVA
open_flamingo: https://github.com/mlfoundations/open_flamingo
mPLUG-Owl: https://github.com/X-PLUG/mPLUG-Owl
BLIP2:
(1) Try it with the transformers package: https://huggingface.co/docs/transformers/index
(2) lavis: https://github.com/salesforce/LAVIS