lobehub / lobe-chat

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), knowledge base (file upload / knowledge management / RAG), multi-modal (vision / TTS), and a plugin system. One-click FREE deployment of your private ChatGPT / Claude application.
https://chat-preview.lobehub.com

[Bug] Another issue with the ZhiPu API #3909

Open BiFangKNT opened 1 month ago

BiFangKNT commented 1 month ago

🥰 Description of requirements

I now strongly suspect that the earlier glm-4v-plus API call problem #3863 was caused by the model not being given the <file> tag, similar to the following: (screenshot). The current glm-4v-plus entry does not have this tag: (screenshot)

Suggestion: add a ZHIPU_MODEL_LIST environment variable so that tags can be customized, and update the default tags as soon as possible.

Reference: documentation (screenshot)

🧐 Proposed solution

Suggestion: add a ZHIPU_MODEL_LIST environment variable so that tags can be customized, and update the default tags as soon as possible.

Reference: documentation (screenshot)
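
For reference, a rough sketch of what such a variable could look like, assuming it reused the +id=DisplayName<tokens:tag> syntax that LobeChat's other *_MODEL_LIST variables (such as OPENAI_MODEL_LIST) already use; the tag names shown here are assumptions for the ZhiPu provider, not confirmed values:

# Hypothetical .env entry; the vision/file tag names are assumed, not verified
ZHIPU_MODEL_LIST=+glm-4v-plus=GLM-4V-Plus<8192:vision:file>,+glm-4v=GLM-4V<2048:vision:file>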

📝 Additional information

No response


lobehubbot commented 1 month ago

👀 @BiFangKNT

Thank you for raising an issue. We will investigate the matter and get back to you as soon as possible. Please make sure you have given us as much context as possible.

BiFangKNT commented 1 month ago

After I set LLM_VISION_IMAGE_USE_BASE64=1 (screenshot), models that previously recognized images correctly stopped working and reported strange errors such as DNS resolution failures (screenshot). When I removed this environment variable and restarted the project, recognition worked normally again. (screenshot)


dosubot[bot] commented 1 month ago

A similar issue was discussed in a closed thread: 自定义模型gpt-4-turbo-2024-04-09,不支持上传图片 (custom model gpt-4-turbo-2024-04-09 does not support image upload). The solution involved ensuring the model supports image recognition and configuring it accordingly in the settings.

To address your request, you can add an environment variable ZHIPU_MODEL_LIST to allow custom tags for models like GLM-4V-Plus and update the default tags to include file upload support. Here is how you can modify the ZhiPu configuration:

  1. Add the environment variable handling.
  2. Update the default tags to include file upload support.

Here is the updated code:

import { ModelProviderCard } from '@/types/llm';

// Function to get custom models from environment variable
const getCustomModels = () => {
  const customModels = process.env.ZHIPU_MODEL_LIST;
  return customModels ? JSON.parse(customModels) : [];
};

// Default models
const defaultModels = [
  {
    description:
      'GLM-4-AllTools 是专门为支持智能体和相关任务而进一步优化的模型版本。它能够自主理解用户的意图,规划复杂的指令,并能够调用一个或多个工具(例如网络浏览器、代码解释器和文本生图像)以完成复杂的任务。',
    displayName: 'GLM-4-AllTools',
    enabled: true,
    functionCall: true,
    id: 'glm-4-alltools',
    tokens: 128_000,
  },
  {
    description: '高智能旗舰:性能全面提升,长文本和复杂任务能力显著增强',
    displayName: 'GLM-4-Plus',
    enabled: true,
    functionCall: true,
    id: 'glm-4-plus',
    tokens: 128_000,
  },
  {
    description: '高智能模型:适用于处理高度复杂和多样化的任务',
    displayName: 'GLM-4-0520',
    enabled: true,
    functionCall: true,
    id: 'glm-4-0520',
    tokens: 128_000,
  },
  {
    description: '旧版旗舰:发布于2024年1月16日,目前已被 GLM-4-0520 取代', // deprecated on 2025-06
    displayName: 'GLM-4',
    functionCall: true,
    id: 'glm-4',
    tokens: 128_000,
  },
  {
    description: '性价比最高的版本,综合性能接近GLM-4,速度快,价格实惠',
    displayName: 'GLM-4-Air',
    enabled: true,
    functionCall: true,
    id: 'glm-4-air',
    tokens: 128_000,
  },
  {
    description: 'GLM-4-Air 的高性能版本,效果不变,推理速度达到其2.6倍',
    displayName: 'GLM-4-AirX',
    functionCall: true,
    id: 'glm-4-airx',
    tokens: 8192,
  },
  {
    description: '超长输入:专为处理超长文本和记忆型任务设计',
    displayName: 'GLM-4-Long',
    enabled: true,
    functionCall: true,
    id: 'glm-4-long',
    tokens: 1_024_000,
  },
  {
    description: '适用简单任务,速度最快,价格最实惠的版本',
    displayName: 'GLM-4-Flash',
    enabled: true,
    functionCall: true,
    id: 'glm-4-flash',
    tokens: 128_000,
  },
  {
    description:
      '视频和图像理解:具备视频内容和多图片的理解能力',
    displayName: 'GLM-4V-Plus',
    enabled: true,
    id: 'glm-4v-plus',
    tokens: 8192,
    vision: true,
    fileUpload: true, // Added file upload support
  },
  {
    description:
      '图像理解:具备图像理解能力和推理能力',
    displayName: 'GLM-4V',
    enabled: true,
    id: 'glm-4v',
    tokens: 2048,
    vision: true,
  },
  {
    description:
      'CodeGeeX是一款强大的AI编程助手,提供智能问答和代码补全功能,支持多种编程语言,帮助开发者提高编程效率。',
    displayName: 'CodeGeeX-4',
    id: 'codegeex-4',
    tokens: 128_000,
  },
  {
    description:
      '支持基于人设的角色扮演、超长多轮的记忆、千人千面的角色对话,广泛应用于情感陪伴、游戏智能NPC、网红/明星/影视剧IP分身、数字人/虚拟主播、文字冒险游戏等拟人对话或游戏场景。',
    displayName: 'CharGLM-3',
    id: 'charglm-3',
    tokens: 4096,
  },
  {
    description: '心理模型:具备专业咨询能力,帮助用户理解情感并应对情绪问题',
    displayName: 'Emohaa',
    id: 'emohaa',
    tokens: 8192,
  },
];

// Merge default models with custom models from environment variable
const chatModels = [...defaultModels, ...getCustomModels()];

const ZhiPu: ModelProviderCard = {
  chatModels,
  checkModel: 'glm-4-flash',
  id: 'zhipu',
  name: 'ZhiPu',
};

export default ZhiPu;

This code will allow you to add custom models via the ZHIPU_MODEL_LIST environment variable and update the default tags to include file upload support [1][2][3].
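
Note that the getCustomModels sketch above reads the variable with JSON.parse, so ZHIPU_MODEL_LIST would need to hold a JSON array. A hypothetical value, using the same fields as the default entries, might look like:

ZHIPU_MODEL_LIST='[{"id":"glm-4v-plus","displayName":"GLM-4V-Plus","tokens":8192,"vision":true,"fileUpload":true,"enabled":true}]'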

To continue talking to Dosu, mention @dosu.

BiFangKNT commented 1 month ago

As for why I don't use One API and configure it through OPENAI_MODEL_LIST: as I mentioned in #3884, I need to configure the available models of different accounts separately. With the approach I described there, I can distinguish the models available to my primary and secondary accounts; it isn't multi-account management, but it's usable. If I merged all of ZhiPu's models into One API, I would no longer be able to tell which models are available to which account.


BiFangKNT commented 1 month ago

(Quoting Dosu's reply above in full.)

It's not only glm-4v-plus that needs this tag; glm-4v needs it too, and perhaps all vision models do.

BiFangKNT commented 1 month ago

@dosu, please change this issue's Feature Request label to the Bug label.


dosubot[bot] commented 1 month ago

You cannot use One API and configure it in OPENAI_MODEL_LIST to distinguish the models available to different accounts, because the model-list configuration is specific to each provider and its models. Each provider has its own set of models and configurations, and they are not interchangeable.

To set the ZHIPU_MODEL_LIST environment variable so that custom tags can be defined for models such as GLM-4V-Plus and GLM-4V, you can configure it as follows:

const ZhiPu: ModelProviderCard = {
  chatModels: [
    {
      description: 'GLM-4V-Plus 具备对视频内容及多图片的理解能力,适合多模态任务。',
      displayName: 'GLM-4V-Plus',
      enabled: true,
      id: 'glm-4v-plus',
      pricing: {
        currency: 'CNY',
        input: 10,
        output: 10,
      },
      tokens: 8192,
      vision: true,
      fileUpload: true, // Added file upload support
    },
    {
      description: 'GLM-4V 提供强大的图像理解与推理能力,支持多种视觉任务。',
      displayName: 'GLM-4V',
      id: 'glm-4v',
      pricing: {
        currency: 'CNY',
        input: 50,
        output: 50,
      },
      tokens: 2048,
      vision: true,
      fileUpload: true, // Added file upload support
    },
    // ... other models
  ],
  checkModel: 'glm-4-flash',
  description: '智谱 AI 提供多模态与语言模型的开放平台,支持广泛的AI应用场景,包括文本处理、图像理解与编程辅助等。',
  id: 'zhipu',
  modelsUrl: 'https://open.bigmodel.cn/dev/howuse/model',
  name: 'ZhiPu',
  url: 'https://zhipuai.cn',
};
export default ZhiPu;

In addition, make sure the ZHIPU_MODEL_LIST environment variable is set correctly in the configuration:

export const llmEnv = getLLMConfig({
  ENABLED_ZHIPU: !!process.env.ZHIPU_API_KEY,
  ZHIPU_API_KEY: process.env.ZHIPU_API_KEY,
  ZHIPU_MODEL_LIST: process.env.ZHIPU_MODEL_LIST,
});

This configuration includes the GLM-4V-Plus and GLM-4V models with their respective tags and settings [1](https://github.com/lobehub/lobe-chat/blob/main/src/config/modelProviders/zhipu.ts)[2].

The label has been changed to Bug. If you have any further questions, please let us know.

To continue talking to Dosu, mention @dosu.


BiFangKNT commented 1 month ago

@arvinxx Could you take a look? I'm afraid this issue will get pushed down the list and you won't see it.


arvinxx commented 1 month ago

@BiFangKNT ZHIPU_MODEL_LIST should already be supported, right?


BiFangKNT commented 1 month ago

@BiFangKNT ZHIPU_MODEL_LIST should already be supported, right?

It isn't documented. And that is only one point; the other is that the official default tags should be updated. So far ZhiPu's vision models still lack the upload capability by default, and the multi-modal model glm-4-alltools doesn't even support vision by default.


BiFangKNT commented 1 month ago

@BiFangKNT ZHIPU_MODEL_LIST should already be supported, right?

I've tried it, and it isn't. Setting this environment variable has no effect.


BiFangKNT commented 1 month ago

@arvinxx I've figured it out: the ZhiPu API does not support IPv6, so images cannot be passed to it via an IPv6-only URL. (screenshot)

However, the image can be passed after converting it to base64. A check could be added in the relevant logic: if the model belongs to the ZhiPu family, convert the image to base64 first. Alternatively, add a toggle in the ZhiPu model settings to control whether images are converted to base64. (screenshot)
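
For illustration, a minimal sketch of that check in standalone TypeScript; the function names and the 'zhipu' provider string are assumptions for this example, not LobeChat's actual internals:

// Sketch: inline an image URL as a base64 data URL before it reaches the
// ZhiPu API, since the API cannot fetch from IPv6-only hosts.
// Requires Node 18+ (global fetch); all names here are hypothetical.
const imageUrlToBase64 = async (url: string): Promise<string> => {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Failed to fetch image: ${res.status}`);
  const mime = res.headers.get('content-type') ?? 'image/png';
  const base64 = Buffer.from(await res.arrayBuffer()).toString('base64');
  return `data:${mime};base64,${base64}`;
};

// Only rewrite URLs for ZhiPu-family models; other providers keep the raw URL.
const resolveImageForProvider = async (provider: string, imageUrl: string) =>
  provider === 'zhipu' ? imageUrlToBase64(imageUrl) : imageUrl;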

The test script is as follows:

import base64
import os
import requests
from zhipuai import ZhipuAI

# Fetch an image from a URL and convert it to base64
def fetch_image_as_base64(url):
    """
    Fetches an image from the given URL and returns it as a base64-encoded string.
    """
    try:
        response = requests.get(url)
        response.raise_for_status()  # Check that the request succeeded
        image_data = base64.b64encode(response.content).decode('utf-8')
        return image_data
    except Exception as e:
        print(f"Error fetching image from {url}: {e}")
        return None

# Example image URL
image_url = "https://minio.example.top/lobe/files/479414/9294edbb-e5ca-4a4d-84b1-52685fb738bc.png"
img_base = fetch_image_as_base64(image_url)

if img_base:
    token = os.environ["ZHIPU_TOKEN"]
    client = ZhipuAI(api_key=token)  # Fill in your own API key

    response = client.chat.completions.create(
        model="glm-4v-plus",  # 填写需要调用的模型名称
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": img_base  # 使用 base64 编码后的图片
                        }
                    },
                    {
                        "type": "text",
                        "text": "请描述这个图片"
                    }
                ]
            }
        ]
    )

    print(response.choices[0].message.content)
else:
    print("Failed to retrieve and encode image.")

The returned result is as follows:

This image is a logo design sample, usually referred to as "Logo Ipsum". It consists of a colorful graphic mark with text next to it. The mark is an abstract, gradient-colored shape that looks like an inverted rainbow-colored triangle or mountain peak. The text to the right of the mark reads "Logo Ipsum" in a clean, modern typeface that gives a professional impression.

"Logo Ipsum" is a fictitious logo name, commonly used in graphic design samples or templates to demonstrate logo design and layout. In actual use, the logo and name would be replaced with a specific brand's or company's own.

Process finished with exit code 0

And I still suggest adding the ZHIPU_MODEL_LIST environment variable.

I also have one more question: besides function call, glm-4-alltools also supports code_interpreter, drawing_tool, and web_browser. Is there currently a way to invoke those three?