bug: Problems with image uploads under a multimodal model.

janhq / jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)

https://jan.ai/

GNU Affero General Public License v3.0

20.7k stars 1.19k forks

bug: Problems with image uploads under a multimodal model. #2973

Open Danielyy opened 1 month ago

Danielyy commented 1 month ago

Describe the bug
When selecting the GPT-4 multimodal model, it is not possible to upload images in the input box.

Screenshots

Environment details
Mac M2

pkirilin commented 1 month ago

You can try the following:

Open model.json and set settings.vision_model to true for GPT-4o (you can do this directly from the UI, see the screenshot below)
Restart the application
Change the model to something else, then select GPT-4o again

It worked for me
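For anyone who prefers to script the model.json edit from the steps above, here is a minimal sketch. It assumes model.json is plain JSON with a top-level "settings" object; the function name `enable_vision` is hypothetical, and the location of the file depends on your Jan installation.

```python
import json
from pathlib import Path


def enable_vision(model_json: Path) -> dict:
    """Set settings.vision_model to true in a model.json file.

    Assumption: the file is plain JSON and "settings" is a top-level
    object, as suggested by the settings.vision_model key above.
    """
    config = json.loads(model_json.read_text(encoding="utf-8"))
    # Create the "settings" object if the file does not have one yet.
    settings = config.setdefault("settings", {})
    settings["vision_model"] = True
    # Write the file back, leaving every other key untouched.
    model_json.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Call it as `enable_vision(Path("path/to/model.json"))`, replacing the placeholder path with the real one, then restart Jan and reselect the model as described above. Back up the file first, since the script rewrites it in place.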

pkirilin commented 1 month ago

However, there's still something wrong. When I send an image with a text prompt, Jan says that my API key is invalid, which is not true: the same API key works fine without images. It seems that sending images is not handled correctly at the moment (v0.4.14).

Van-QA commented 1 month ago

Sorry for the inconvenience. The GPT-4o API that is included in the Jan app does not support vision. We will add another GPT-4o API that supports vision to the Jan app soon. Stay tuned 🙏

kalle07 commented 2 weeks ago

Only the internal llava v1.5 (v1) works with images. All the models I've downloaded from Hugging Face (llava 1.5, llava 1.6, or llama3_llava) don't work: the image icon is greyed out, like above. How can I enable it?

These are only some of the models; there are many more: https://huggingface.co/cjpais/llava-1.6-mistral-7b-gguf https://huggingface.co/PsiPi/liuhaotian_llava-v1.5-13b-GGUF https://huggingface.co/xtuner/llava-llama-3-8b-v1_1-gguf

Van-QA commented 2 weeks ago

Hi @kalle07, if you are certain that the model supports images, you can try modifying the model.json to enable the setting. Please check out this guideline: https://github.com/janhq/jan/issues/2973#issuecomment-2142889621 🙏

kalle07 commented 2 weeks ago

The answer is only "OK", with no image description.

What should the whole file look like for the normal, standard llava 1.5?