ChatGPTNextWeb / ChatGPT-Next-Web

A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini application with one click.
https://app.nextchat.dev/
MIT License
76.96k stars 59.35k forks

[Feature Request] More flexible vision model detection #5843

Open QAbot-zh opened 1 week ago

QAbot-zh commented 1 week ago

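🥰 Description

The project currently decides whether a model supports vision (isVisionModel) by matching fixed keywords and exclusion keywords. Because model vendors do not follow a consistent naming scheme, this detection lags behind new releases and needs frequent changes. For example, the latest gemini-exp-1114 also supports vision, but the current detection does not recognize it. A more flexible way to identify vision models is urgently needed.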

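🧐 Solution

Possible solutions:

  1. Allow vision capability to be added to specific models through an environment variable, e.g. VisionModel=model_1,model_2,model_3
  2. Allow the vision-capable models to be configured in the front-end web page and parsed by the back end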
  3. ...
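
A minimal sketch of option 1, assuming the list arrives as a comma-separated string named as in the example above. The keyword lists here are placeholders for illustration and are not the project's actual lists; the function signature with an extra `envValue` parameter is also hypothetical.

```typescript
// Placeholder defaults for illustration only; the project's real lists may differ.
const DEFAULT_VISION_KEYWORDS: string[] = ["vision", "gpt-4o"];
const DEFAULT_EXCLUDE_KEYWORDS: string[] = ["claude-3-5-haiku"];

// Parse a comma-separated value such as "model_1,model_2,model_3",
// trimming whitespace and dropping empty entries.
function parseModelList(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Hypothetical variant of isVisionModel: models listed explicitly in
// envValue are treated as vision models before any keyword rule runs.
function isVisionModel(model: string, envValue?: string): boolean {
  if (parseModelList(envValue).includes(model)) {
    return true;
  }
  if (DEFAULT_EXCLUDE_KEYWORDS.some((keyword) => model.includes(keyword))) {
    return false;
  }
  return DEFAULT_VISION_KEYWORDS.some((keyword) => model.includes(keyword));
}
```

With something like this, a deployment could pass the value of the VisionModel environment variable as `envValue`, so a model such as gemini-exp-1114 can be enabled without a code change. Exact-name matching is used for the explicit list so that an override never enables unrelated models by accident.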

📝 Supplementary information

No response

Issues-translate-bot commented 1 week ago

Bot detected the issue body's language is not English, translate it automatically.


Title: [Feature Request] More flexible vision model detection

🥰 Description of requirements

The project currently uses fixed keywords and exclusion keywords to identify vision models (isVisionModel). Because model vendors do not adopt a consistent naming scheme, vision detection lags behind new releases and needs frequent modification. For example, the latest gemini-exp-1114 also supports vision, but the current detection does not recognize it, so a more flexible vision model detection method is urgently needed.

🧐 Solution

Possible solutions:

  1. Allow vision capability to be added to specific models through an environment variable, such as: VisionModel=model_1,model_2,model_3
  2. Allow the vision-capable models to be configured in the front-end web page and parsed by the back end
  3. ...

📝 Supplementary information

No response