labring / FastGPT

FastGPT is a knowledge-based platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you develop and deploy complex question-answering systems without extensive setup or configuration.
https://tryfastgpt.ai
Other
17.49k stars 4.69k forks

Answer quality is very poor when using chatglm2 #184

Closed bruceye777 closed 1 year ago

bruceye777 commented 1 year ago

I followed this document, "Use GLM on FastGPT in 3 minutes": https://doc.fastgpt.run/docs/other/ChatGLM2/

However, it raised an error: File "/usr/local/lib/python3.10/dist-packages/pydantic/main.py", line 945, in json raise TypeError('dumps_kwargs keyword arguments are no longer supported.') TypeError: dumps_kwargs keyword arguments are no longer supported.

I changed the code from yield "{}".format(chunk.json(exclude_unset=True, ensure_ascii=False)) to yield "{}".format(chunk.json())
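The change above silences the error, but dropping both arguments also drops exclude_unset, so unset fields get serialized as well. A minimal sketch of an alternative, assuming the error comes from running the script under pydantic v2: dump the model to a dict first and let the standard json module handle ensure_ascii. The helper name chunk_to_json and the FakeChunk stand-in are hypothetical and only there so the example runs without pydantic installed.

```python
import json


def chunk_to_json(chunk) -> str:
    """Serialize a pydantic model to JSON, keeping non-ASCII text readable."""
    if hasattr(chunk, "model_dump"):
        # pydantic v2 exposes model_dump()
        data = chunk.model_dump(exclude_unset=True)
    else:
        # pydantic v1 exposes dict()
        data = chunk.dict(exclude_unset=True)
    # ensure_ascii is an argument of json.dumps, not of the model method
    return json.dumps(data, ensure_ascii=False)


class FakeChunk:
    """Hypothetical stand-in for the script's response model (demo only)."""

    def model_dump(self, exclude_unset=False):
        return {"delta": {"content": "你好"}}


print(chunk_to_json(FakeChunk()))
```

In the script, the yield line would then become yield "{}".format(chunk_to_json(chunk)). This assumes the model only holds JSON-friendly values such as strings, numbers, lists, and dicts.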

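I then tried both import methods, QA splitting and CSV import. The original QA data is just 20 questions.

In testing, the answer quality was very poor, completely off the mark (screenshots). Nothing in the dataset gets matched at all (screenshots).

What went wrong here?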

c121914yu commented 1 year ago

GLM is a language model; it has nothing to do with matching.

bruceye777 commented 1 year ago

The embedding model used by that script is m3e-large, which ranks near the top by downloads on Hugging Face. Which model would give better results, then? @c121914yu

c121914yu commented 1 year ago

m3e embeddings must be normalized and expanded to 1536 dimensions before they can be used.

bruceye777 commented 1 year ago

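I checked the original script, and it does expand the embeddings to 1536 dimensions. As for normalization, I added the parameter normalize_embeddings=True to encode, but the result is the same: it still fails to match the text. (screenshot)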
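For reference, the two steps discussed above can be sketched in plain Python. This is a minimal illustration, not the code in openai_api.py: it assumes the expansion is done by zero-padding, which leaves vector length and cosine similarity unchanged. If the script expands the vector some other way, normalizing after the expansion rather than before may be needed. The function name normalize_and_pad is hypothetical.

```python
import math

TARGET_DIM = 1536  # dimension mentioned in this thread


def normalize_and_pad(vec, target_dim=TARGET_DIM):
    """L2-normalize vec, then zero-pad it to target_dim entries."""
    if len(vec) > target_dim:
        raise ValueError("embedding is longer than the target dimension")
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    unit = [x / norm for x in vec]
    # Appended zeros do not change the norm, so the result stays unit length.
    return unit + [0.0] * (target_dim - len(unit))


padded = normalize_and_pad([3.0, 4.0])
print(len(padded), padded[:2])
```

A quick sanity check on a real embedding is to confirm that the returned vector has exactly 1536 entries and that the sum of its squared entries is close to 1.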

azoth07 commented 1 year ago

This is a limitation of GLM's own semantic understanding. Compared with GPT, it seems to need more prompting to compensate.

winnie1228 commented 1 year ago

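> I checked the original script, and it does expand the embeddings to 1536 dimensions. As for normalization, I added the parameter normalize_embeddings=True to encode, but the result is the same: it still fails to match the text. (screenshot)

Where did you get this script? Could you share it? I want to connect m3e-base.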

bruceye777 commented 1 year ago

> Where did you get this script? Could you share it? I want to connect m3e-base.

It is provided in this project itself. Search for openai_api.py.

yaya929 commented 10 months ago

Has this been resolved?

yaya929 commented 10 months ago

I have the same problem. Which part of the source code needs to be modified?

lonngxiang commented 6 months ago

In my case, the similarity scores from m3e vector retrieval are very low, and none of the answers ever include a citation. (screenshot)
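One way to investigate low scores like these is to compare the raw inner product with the cosine similarity for a pair of vectors. The sketch below is a hedged illustration with made-up vectors: the two measures agree only when both vectors are unit length, so if the vector store ranks by inner product (an assumption, not something confirmed in this thread), unnormalized embeddings will produce misleading scores.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: inner product divided by both vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


query = [1.0, 2.0, 2.0]   # length 3
stored = [2.0, 4.0, 4.0]  # same direction, length 6

print("cosine:", cosine(query, stored))
print("inner product:", dot(query, stored))
```

Here the two vectors point in the same direction, so the cosine similarity is at its maximum while the inner product depends on their lengths. Running the same comparison on a stored embedding and a query embedding shows whether normalization was actually applied on both sides.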