lobehub / lobe-chat

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge management / RAG), Multi-Modals (Vision / TTS) and plugin system. One-click FREE deployment of your private ChatGPT / Claude application.
https://chat-preview.lobehub.com
Other
40.93k stars 9.33k forks

gpt4-vision-preview + built-in DALL·E 3 or Pollinations drawing: neither can generate images #849

Closed liliwen365 closed 7 months ago

liliwen365 commented 8 months ago

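💻 Operating System

Windows

🌐 Browser

Chrome

🐛 Bug Description

With the gpt4-vision-preview model, neither the built-in DALL·E 3 plugin nor the Pollinations drawing plugin can generate images. The result is blank and no image is returned, as shown below: image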

🚦 Expected Behavior

No response

📷 Recurrence Steps

No response

📝 Additional Information

No response

lobehubbot commented 8 months ago

👀 @liliwen365

Thank you for raising an issue. We will look into the matter and get back to you as soon as possible. Please make sure you have given us as much context as possible.

arvinxx commented 8 months ago

Because 4V (gpt4-vision-preview) does not support tool calls.

mushan0x0 commented 8 months ago

@arvinxx I think we could add a map that records, for each model, fields such as functions and upload. When the user switches to a model, the features that model does not support would be hidden.
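A minimal sketch of the per-model map suggested above, assuming two boolean fields named after the ones mentioned in the comment (`functions` and `upload`). The type, the map entries, and the helper names are hypothetical and are not taken from the lobe-chat codebase.

```typescript
// Hypothetical capability flags; not lobe-chat's actual schema.
interface ModelCapabilities {
  functions: boolean; // model accepts tool / function calls
  upload: boolean; // model accepts image uploads
}

// Illustrative entries only. Per this thread, the vision preview model
// does not support tool calls.
const MODEL_CAPABILITIES = new Map<string, ModelCapabilities>([
  ['gpt-3.5-turbo', { functions: true, upload: false }],
  ['gpt-4-vision-preview', { functions: false, upload: true }],
]);

// Unknown models fall back to the most conservative defaults.
const DEFAULT_CAPABILITIES: ModelCapabilities = { functions: false, upload: false };

function getCapabilities(model: string): ModelCapabilities {
  return MODEL_CAPABILITIES.get(model) ?? DEFAULT_CAPABILITIES;
}

// Returns the UI features to show for the selected model;
// anything not listed would be hidden.
function visibleFeatures(model: string): string[] {
  const caps = getCapabilities(model);
  const features: string[] = [];
  if (caps.functions) features.push('plugins');
  if (caps.upload) features.push('imageUpload');
  return features;
}
```

With a lookup like this, the UI could call `visibleFeatures` whenever the selected model changes and hide the plugin entry for models that cannot make tool calls, instead of showing a blank result.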

arvinxx commented 8 months ago

My current idea is to offer the vision model as an additional capability.

In the message-processing pipeline, first check whether the message contains an image. If it does, run one round through the vision model to recognize the image, then trigger a normal AI message.

That would give an effect similar to what ChatGPT does now: recognize the image first, then call plugins in the conversation.
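The two-pass flow described above can be sketched as follows. This is only an illustration under stated assumptions: the message shape, the `Models` interface, and the function names are hypothetical, and the two model calls are injected by the caller rather than tied to any real provider API.

```typescript
// All types and names below are hypothetical, for illustration only.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image'; url: string };

interface ChatMessage {
  role: 'user' | 'assistant' | 'system';
  content: ContentPart[];
}

// The caller supplies both model calls, so the sketch stays provider-agnostic.
interface Models {
  describeImages: (imageUrls: string[]) => Promise<string>; // vision pass
  chatWithTools: (messages: ChatMessage[]) => Promise<string>; // normal pass
}

// Collects the URLs of all image parts in a message.
function imageUrls(message: ChatMessage): string[] {
  const urls: string[] = [];
  for (const part of message.content) {
    if (part.type === 'image') urls.push(part.url);
  }
  return urls;
}

async function handleMessage(message: ChatMessage, models: Models): Promise<string> {
  const urls = imageUrls(message);

  // No images: send the normal, tool-capable request directly.
  if (urls.length === 0) return models.chatWithTools([message]);

  // Pass 1: the vision model turns the images into text.
  const description = await models.describeImages(urls);

  // Pass 2: replace the image parts with that text and send a normal request,
  // so a model that supports tool calls can still use plugins.
  const textOnly: ChatMessage = {
    role: message.role,
    content: [
      ...message.content.filter((part) => part.type === 'text'),
      { type: 'text', text: `[Image description] ${description}` },
    ],
  };
  return models.chatWithTools([textOnly]);
}
```

Because the second request contains only text, it does not need a vision-capable model, which is what lets the plugin call go through a model that supports tool calls.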

mushan0x0 commented 8 months ago

Then that is effectively equivalent to implementing GPT-4 All Tools, and it should work with 3.5 as well 👍

lobehubbot commented 7 months ago

✅ @liliwen365

This issue is closed. If you have any questions, you can comment and reply.

lobehubbot commented 7 months ago

:tada: This issue has been resolved in version 0.123.0 :tada:

The release is available on:

Your semantic-release bot :package::rocket: