jackqdldd opened this issue 2 days ago
How did you install it? If you used the pip package, please try installing from the main branch source instead; the pip package may lag behind.
After installing from source the issue above was resolved, but execution now fails with:
testset_generation.py:132: LangChainDeprecationWarning: The class `UnstructuredFileLoader` was deprecated in LangChain 0.2.8 and will be removed in 1.0. An updated version of the class exists in the `langchain-unstructured` package and should be used instead. To use it run `pip install -U langchain-unstructured` and import as `from langchain_unstructured import UnstructuredLoader`.
After updating as suggested and running again, it still fails:
File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/pydantic/main.py", line 212, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 2 validation errors for LocalLLM
model_name_or_path
  Field required [type=missing, input_value={'model_name': '/home/alg...': {'temperature': 0.2}}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/missing
model
  Field required [type=missing, input_value={'model_name': '/home/alg...': {'temperature': 0.2}}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/missing
Here is my script:

generate_testset_task_cfg = {
    "eval_backend": "RAGEval",
    "eval_config": {
        "tool": "RAGAS",
        "testset_generation": {
            "docs": [
                "/home/alg/qdl/rags/docs/zhidu_1.txt",
                "/home/alg/qdl/rags/docs/zhidu_2.txt",
                "/home/alg/qdl/rags/docs/zhidu_3.txt",
            ],
            "test_size": 10,
            "output_file": "outputs/testset.json",
            "distribution": {"simple": 0.1, "multi_context": 0.4, "reasoning": 0.5},
            "generator_llm": {
                "model_name_or_path": "/home/alg/qdl/model/Qwen2_5-7B-Instruct",
                "template_type": "qwen",
                "generation_config": {"temperature": 0.2},
            },
            "embeddings": {
                "model_name_or_path": "/home/alg/qdl/model/CompassJudger-1-7B-Instruct",
            },
            "language": "chinese",
        }
    },
}

from evalscope.run import run_task
from evalscope.utils.logger import get_logger

logger = get_logger()
run_task(task_cfg=generate_testset_task_cfg)
The following is a warning, not an error, and does not affect usage:
testset_generation.py:132: LangChainDeprecationWarning: The class `UnstructuredFileLoader` was deprecated in LangChain 0.2.8 and will be removed in 1.0. An updated version of the class exists in the `langchain-unstructured` package and should be used instead. To use it run `pip install -U langchain-unstructured` and import as `from langchain_unstructured import UnstructuredLoader`.
Upgrading may make your environment incompatible: ragas currently requires a langchain version below 0.3. Reference langchain versions are listed below; to install them, run pip install -e '.[rag]' in the evalscope directory containing setup.py.
langchain 0.2.16
langchain-chroma 0.1.4
langchain-community 0.2.16
langchain-core 0.2.40
langchain-openai 0.1.23
langchain-text-splitters 0.2.4
langchain-unstructured 0.1.4
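To compare your environment against the versions above, a quick standard-library check (a generic sketch, not part of evalscope) is:

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(pkg: str) -> str:
    """Return the installed version of a package, or 'not installed'."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return "not installed"

# Print the packages pinned in the reference list above.
for pkg in ("langchain", "langchain-community", "langchain-core",
            "langchain-openai", "langchain-unstructured"):
    print(f"{pkg}: {installed_version(pkg)}")
```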
Also, the embeddings section should use an embedding model rather than CompassJudger-1-7B-Instruct, for example AI-ModelScope/bge-large-zh.
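As a minimal corrected fragment (the model ID follows the suggestion above; a local download path works too):

```python
# Corrected `embeddings` section: a dedicated embedding model,
# not a chat/judge model such as CompassJudger-1-7B-Instruct.
embeddings_cfg = {
    "model_name_or_path": "AI-ModelScope/bge-large-zh",  # or a local path to the model
}
```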
After updating, it runs, but then fails with:
File "/home/alg/qdl/evalscope/package/evalscope/models/model_adapter.py", line 422, in __init__
    self.generation_config.update(**custom_generation_config.to_dict())
AttributeError: 'dict' object has no attribute 'to_dict'
That looks like a bug on our side; we'll fix it.
Continuing, it fails again:
File "/home/alg/qdl/evalscope/package/evalscope/backend/rag_eval/ragas/tasks/testset_generation.py", line 162, in generate_testset
    generator = TestsetGenerator.from_langchain(generator_llm)
TypeError: TestsetGenerator.from_langchain() missing 1 required positional argument: 'embedding_model'
After updating, it gets further, but errors appear during execution:
This looks like fallout from the new ragas version released yesterday; I've fixed that as well.
Please pull the latest code and try again.
Reference settings: "generation_config": {"do_sample": True, "temperature": 0.1, "max_new_tokens": 2048}
Or remove generation_config altogether.
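Put together, a sketch of the generator_llm entry with those reference settings (the model path is a placeholder):

```python
generator_llm_cfg = {
    "model_name_or_path": "/path/to/your/model",  # placeholder: local model path
    "template_type": "qwen",
    "generation_config": {
        "do_sample": True,       # sample instead of greedy decoding
        "temperature": 0.1,      # low temperature for more stable output
        "max_new_tokens": 2048,  # room for long structured answers
    },
}
```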
Applying [SummaryExtractor, HeadlinesExtractor]: 0%| | 0/2 [00:00<?, ?it/s]
2024-10-31 13:48:43,560 - ragas.testset.transforms.engine - ERROR - unable to apply transformation: 'Generation' object has no attribute 'message'
Applying [SummaryExtractor, HeadlinesExtractor]: 50%|█████ | 1/2 [00:08<00:08, 8.96s/it]
2024-10-31 13:49:41,834 - ragas.testset.transforms.engine - ERROR - unable to apply transformation: 'Generation' object has no attribute 'message'
Applying EmbeddingExtractor: 0%| | 0/1 [00:00<?, ?it/s]
2024-10-31 13:49:41,837 - ragas.testset.transforms.engine - ERROR - unable to apply transformation: node.property('summary') must be a string, found '<class 'NoneType'>'
Applying HeadlineSplitter: 0%| | 0/1 [00:00<?, ?it/s]
2024-10-31 13:49:41,837 - ragas.testset.transforms.engine - ERROR - unable to apply transformation: 'headlines' property not found in this node
Batches: 100%|██████████| 1/1 [00:00<00:00, 12.90it/s]
Applying [EmbeddingExtractor, KeyphrasesExtractor, TitleExtractor]: 0%| | 0/3 [00:00<?, ?it/s]
2024-10-31 13:49:49,124 - ragas.testset.transforms.engine - ERROR - unable to apply transformation: 'Generation' object has no attribute 'message'
Applying [EmbeddingExtractor, KeyphrasesExtractor, TitleExtractor]: 67%|██████▋ | 2/3 [00:07<00:03, 3.64s/it]
2024-10-31 13:49:53,289 - ragas.testset.transforms.engine - ERROR - unable to apply transformation: 'Generation' object has no attribute 'message'
Traceback (most recent call last):
File "/home/alg/qdl/rags/evalscope_ragas_generate.py", line 32, in
Please switch generator_llm to a model with stronger instruction-following ability, e.g. a 72B int4-quantized model. Smaller models may produce malformed output on complex instructions, which causes these errors.
Doesn't Qwen2.5-7B-Instruct work? That's the model used in the docs.
I tried it and it doesn't really work, same error 😭. A larger model does work, though, and closed-source models like gpt4o are also an option.
I'll update the docs.
Your environment is missing a package required for GPTQ; try installing it as the error message suggests: pip install optimum
Is GenerationConfig not allowed to be empty?
Traceback (most recent call last):
File "/home/alg/qdl/rags/evalscope_ragas_generate.py", line 32, in
generate_testset_task_cfg = {
    "eval_backend": "RAGEval",
    "eval_config": {
        "tool": "RAGAS",
        "testset_generation": {
            "docs": [
                "/home/alg/qdl/rags/docs/zhidu_1.txt",
                "/home/alg/qdl/rags/docs/zhidu_2.txt",
                "/home/alg/qdl/rags/docs/zhidu_3.txt",
            ],
            "test_size": 10,
            "output_file": "outputs/testset.json",
            "distribution": {"simple": 0.1, "multi_context": 0.4, "reasoning": 0.5},
            "generator_llm": {
                "model_name_or_path": "/home/alg/qdl/model/Qwen2.5-72B-Instruct-GPTQ-Int4",
                "template_type": "qwen",
            },
            "embeddings": {
                "model_name_or_path": "/home/alg/qdl/model/BAAI/bge-large-zh-v1.5",
            },
            "language": "chinese",
        }
    },
}
Please pull the latest code again and retry.
It keeps getting stuck here.
Inference is fairly slow, so please wait a while, or try reducing the content of the txt documents to see whether it completes.
Generating Scenarios: 0%| | 0/3 [00:00<?, ?it/s]
2024-10-31 16:49:32,896 - ragas.testset.synthesizers.abstract_query - INFO - found 0 clusters
2024-10-31 16:49:32,896 - ragas.testset.synthesizers.abstract_query - INFO - generating 4 common_themes
2024-10-31 16:49:32,901 - ragas.testset.synthesizers.abstract_query - INFO - found 3 clusters
2024-10-31 16:49:32,901 - ragas.testset.synthesizers.abstract_query - INFO - generating 2 themes
Traceback (most recent call last):
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/openai/_base_client.py", line 1572, in _request
    response = await self._client.send(
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpx/_client.py", line 1674, in send
    response = await self._send_handling_auth(
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpx/_client.py", line 1702, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpx/_client.py", line 1739, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpx/_client.py", line 1776, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpx/_transports/default.py", line 377, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
    raise exc from None
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
    response = await connection.handle_async_request(
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpcore/_async/connection.py", line 101, in handle_async_request
    return await self._connection.handle_async_request(request)
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpcore/_async/http11.py", line 142, in handle_async_request
    await self._response_closed()
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpcore/_async/http11.py", line 257, in _response_closed
    await self.aclose()
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpcore/_async/http11.py", line 265, in aclose
    await self._network_stream.aclose()
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 55, in aclose
    await self._stream.aclose()
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 1258, in aclose
    self._transport.close()
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/asyncio/selector_events.py", line 706, in close
    self._loop.call_soon(self._call_connection_lost, None)
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/asyncio/base_events.py", line 753, in call_soon
    self._check_closed()
  File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/asyncio/base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/alg/qdl/rags/evalscope_ragas_generate.py", line 56, in
I switched the LLM to an API model. Applying [EmbeddingExtractor, KeyphrasesExtractor, TitleExtractor] now completes, but Generating Scenarios fails.
Which API model are you using? Could there be a network issue?
The earlier calls all went through fine.
Roughly how long does it run before the error appears? Try adjusting the timeout and max_wait values here: https://github.com/modelscope/evalscope/blob/da0d9447aaf8da52231d7a702dfef63163f12510/evalscope/backend/rag_eval/ragas/tasks/testset_generation.py#L164
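For illustration, the kind of values one might raise at that call site. The field names assume ragas's RunConfig as of the 0.2.x line; verify them against your installed version before relying on this:

```python
# Hypothetical retry/timeout settings for the linked call site.
# Field names assume ragas's RunConfig (0.2.x); check your installed version.
run_config_kwargs = {
    "timeout": 600,     # seconds allowed per LLM call
    "max_wait": 120,    # upper bound on retry back-off, in seconds
    "max_retries": 10,  # how many times a failed call is retried
}
```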
No luck; every run fails immediately at the Generating Scenarios: 0%| step, while everything before it works.
Task exception was never retrieved
future: <Task finished name='Task-334' coro=<as_completed.
I can't reproduce this on my side; it may be a ragas issue. You could report it to them: https://github.com/explodinggradients/ragas/issues
What ragas version are you on?
The latest version, 0.2.3.
I have two 46 GB L20 GPUs; is specifying them in the script like this enough?
os.environ['CUDA_VISIBLE_DEVICES'] = '6,7'
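One pitfall worth noting: CUDA_VISIBLE_DEVICES must be set before torch (or anything else that initializes CUDA) is imported, or the setting may have no effect. A minimal sketch:

```python
import os

# Set GPU visibility first; CUDA device enumeration happens when the framework
# initializes, so setting this after `import torch` may be ignored.
os.environ["CUDA_VISIBLE_DEVICES"] = "6,7"

# ...only now import torch / evalscope and call run_task(...)
```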
It's stuck.
I'd suggest serving the model with vllm and pointing the config at the resulting URL; that should be faster.
So: the LLM via a URL, and embeddings loaded locally?
Yes, loading the embedding model locally is fine; embedding computation is fast.
File "/home/alg/anaconda3/envs/evalscope/lib/python3.10/site-packages/evalscope/backend/rag_eval/__init__.py", line 1, in <module>
    from evalscope.backend.rag_eval.utils.embedding import EmbeddingModel
ModuleNotFoundError: No module named 'evalscope.backend.rag_eval.utils'