Answer quality is very poor when using chatglm2

labring / FastGPT

FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you develop and deploy complex question-answering systems without extensive setup or configuration.

https://tryfastgpt.ai

Answer quality is very poor when using chatglm2 #184

Closed bruceye777 closed 1 year ago

bruceye777 commented 1 year ago

I followed this document, 《3分钟在Fastgpt上用上GLM》 (Use GLM on FastGPT in 3 minutes): https://doc.fastgpt.run/docs/other/ChatGLM2/

However, it raised an error:

File "/usr/local/lib/python3.10/dist-packages/pydantic/main.py", line 945, in json
    raise TypeError('dumps_kwargs keyword arguments are no longer supported.')
TypeError: dumps_kwargs keyword arguments are no longer supported.

I changed the code from

yield "{}".format(chunk.json(exclude_unset=True, ensure_ascii=False))

to

yield "{}".format(chunk.json())
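The change above avoids the TypeError but also drops both exclude_unset=True and ensure_ascii=False. A minimal sketch of one way to keep both behaviours follows, assuming the error comes from running under pydantic v2, where json() no longer accepts json.dumps keyword arguments. The helper name chunk_to_json and the Delta model in the demo are hypothetical and not part of openai_api.py.

```python
import json


def chunk_to_json(chunk) -> str:
    """Serialize a pydantic model while keeping exclude_unset and ensure_ascii=False.

    Hypothetical helper: it converts the model to a plain dict first and then
    lets the standard library do the JSON encoding, so no json.dumps keyword
    arguments are passed to pydantic itself.
    """
    if hasattr(chunk, "model_dump"):
        # pydantic v2 API
        data = chunk.model_dump(exclude_unset=True)
    else:
        # pydantic v1 API
        data = chunk.dict(exclude_unset=True)
    # ensure_ascii=False keeps Chinese text readable instead of \uXXXX escapes
    return json.dumps(data, ensure_ascii=False)


if __name__ == "__main__":
    # Demo only; requires pydantic to be installed.
    from typing import Optional

    from pydantic import BaseModel

    class Delta(BaseModel):  # hypothetical stand-in for a streamed chunk
        role: Optional[str] = None
        content: Optional[str] = None

    print(chunk_to_json(Delta(content="你好")))  # {"content": "你好"}
```

With this helper the yield line would read yield "{}".format(chunk_to_json(chunk)). This only affects how the chat response is serialized; it is unlikely to be related to the retrieval problem described in this issue.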

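I then tried both import methods, QA splitting and CSV import. The original QA data is just 20 questions.

In testing, the answer quality is very poor: the answers are completely off the mark and match nothing in the dataset.

What is going wrong here?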

c121914yu commented 1 year ago

GLM is a language model; it has nothing to do with matching.

bruceye777 commented 1 year ago

The embedding model used by that script is m3e-large. I see it ranks near the top by downloads on huggingface, so which model should I use to get better results? @c121914yu

c121914yu commented 1 year ago

m3e embeddings must be normalized and raised to 1536 dimensions before they can be used.

bruceye777 commented 1 year ago

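I checked the original script and it does raise the dimension to 1536. As for normalization, I added the parameter normalize_embeddings=True to the encode call, but the result is the same: it still fails to match any text.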
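A minimal sketch of what "normalize and raise to 1536 dimensions" could look like, assuming the dimension is raised by zero-padding. This is an assumption for illustration and not necessarily what openai_api.py does; the function name pad_and_normalize is hypothetical. Zero-padding leaves dot products and L2 norms unchanged, so cosine similarity between padded vectors equals that of the original vectors.

```python
import math
from typing import List, Sequence


def pad_and_normalize(vec: Sequence[float], target_dim: int = 1536) -> List[float]:
    """Zero-pad an embedding to target_dim, then scale it to unit L2 norm.

    Assumption: the model's native dimension is at most target_dim.
    """
    if len(vec) > target_dim:
        raise ValueError(f"embedding has {len(vec)} dims, more than {target_dim}")
    # Appending zeros changes neither the norm nor any dot product.
    padded = list(vec) + [0.0] * (target_dim - len(vec))
    norm = math.sqrt(sum(x * x for x in padded))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero embedding")
    return [x / norm for x in padded]


if __name__ == "__main__":
    v = pad_and_normalize([3.0, 4.0], target_dim=4)
    print(v)  # [0.6, 0.8, 0.0, 0.0]
```

If the stored vectors and the query vectors are produced by different transformations (for example, one normalized and the other not), similarity scores can be low even when the embedding model itself is fine, so both paths need to use the same function.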

azoth07 commented 1 year ago

This is a limitation of GLM's own semantic understanding. Compared with GPT, it seems to need more prompting to compensate.

winnie1228 commented 1 year ago

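> I checked the original script and it does raise the dimension to 1536. As for normalization, I added the parameter normalize_embeddings=True to the encode call, but the result is the same: it still fails to match any text.

Where did you get this script? Could you share it? I want to connect m3e-base.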

bruceye777 commented 1 year ago

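> I checked the original script and it does raise the dimension to 1536. As for normalization, I added the parameter normalize_embeddings=True to the encode call, but the result is the same: it still fails to match any text.

> Where did you get this script? Could you share it? I want to connect m3e-base.

It is provided in this project. Search for openai_api.py.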

yaya929 commented 10 months ago

Has this been solved?

yaya929 commented 10 months ago

I have the same problem. Which part of the source code needs to be changed?

lonngxiang commented 6 months ago

In my case the m3e vector-search similarity scores are very low, and none of the answers ever include a citation.
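When similarity scores are this low, one quick check is whether the vectors returned by the embedding endpoint meet the two requirements mentioned earlier in this thread: 1536 dimensions and unit L2 norm. The sketch below is a hypothetical diagnostic, not part of FastGPT; the name diagnose_embedding and the tolerance value are assumptions.

```python
import math
from typing import List, Sequence


def diagnose_embedding(
    vec: Sequence[float], expected_dim: int = 1536, tol: float = 1e-3
) -> List[str]:
    """Return a list of human-readable problems found in one embedding vector."""
    problems = []
    if len(vec) != expected_dim:
        problems.append(f"dimension is {len(vec)}, expected {expected_dim}")
    if any(math.isnan(x) or math.isinf(x) for x in vec):
        # The norm is meaningless in this case, so stop here.
        problems.append("contains NaN or inf")
        return problems
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        problems.append("all-zero vector")
    elif abs(norm - 1.0) > tol:
        problems.append(f"L2 norm is {norm:.4f}, expected 1.0")
    return problems


if __name__ == "__main__":
    print(diagnose_embedding([3.0, 4.0], expected_dim=2))
    # ['L2 norm is 5.0000, expected 1.0']
```

An empty list only means the vector has the expected shape and norm. It does not show that the stored data and the query were embedded by the same model and the same transformation, which is worth confirming separately.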