[Feature Request] More flexible vision model detection

ChatGPTNextWeb / ChatGPT-Next-Web

A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini app with one click.

MIT License

76.96k stars, 59.35k forks

Bot detected the issue body's language is not English, translate it automatically.

Title: [Feature Request] More flexible vision model detection

🥰 Description of requirements

The project currently identifies vision models (isVisionModel) by matching model names against a fixed list of keywords and a fixed list of exclusion keywords. Because model vendors do not follow a consistent naming scheme, this check lags behind new releases and has to be modified frequently. For example, the recent gemini-exp-1114 also supports vision, but the current check does not recognize it. A more flexible way of detecting vision models is urgently needed.

🧐 Solution

Possible solutions:

Allow vision capability to be added to specific models through an environment variable, for example: VisionModel=model_1,model_2,model_3
Allow the list of vision-capable models to be configured in the web front end and parsed by the backend
...
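
The first option could look roughly like the sketch below. It is only an illustration under stated assumptions, not the project's actual implementation: the keyword lists are placeholders, the helper names parseVisionModels and isVisionModelWithOverrides are hypothetical, and the raw string is assumed to come from the proposed VisionModel environment variable.

```typescript
// Placeholder keyword lists; the project's real lists may differ.
const VISION_KEYWORDS: string[] = ["vision"];
const EXCLUDE_KEYWORDS: string[] = ["no-vision"];

// Parse a comma-separated value such as "model_1,model_2,model_3".
// Names are trimmed and lowercased; empty entries are dropped.
function parseVisionModels(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((name) => name.trim().toLowerCase())
      .filter((name) => name.length > 0),
  );
}

// Hypothetical replacement for the keyword-only check:
// explicitly listed models win, otherwise fall back to keywords.
function isVisionModelWithOverrides(
  model: string,
  extraVisionModels: Set<string>,
): boolean {
  const name = model.trim().toLowerCase();
  if (extraVisionModels.has(name)) return true;
  if (EXCLUDE_KEYWORDS.some((keyword) => name.includes(keyword))) return false;
  return VISION_KEYWORDS.some((keyword) => name.includes(keyword));
}

// Usage: the string would be read from the environment in a real deployment.
const extraModels = parseVisionModels("gemini-exp-1114, model_2");
const recognized = isVisionModelWithOverrides("gemini-exp-1114", extraModels);
```

With this approach a newly released model can be marked as vision-capable by changing deployment configuration only, without touching the keyword lists in code.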

📝 Supplementary information

No response
