netease-youdao / QAnything

Question and Answer based on Anything.
https://qanything.ai
GNU Affero General Public License v3.0

[BUG] RuntimeError: generator ignored GeneratorExit #435

Open · Jun2Hou opened 1 month ago

Jun2Hou commented 1 month ago

是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?

该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?

当前行为 | Current Behavior

I have already modified llm_for_openai_api.py to call the OpenAI API and verified the change locally. Starting the service with Docker (command: bash ./run.sh -c cloud -i 0) and streaming=False, the service comes up normally, the frontend renders correctly, and new knowledge bases can be created without issue.

However, after asking a question, the local_doc_chat endpoint in sanic_api.py is called over and over. The frontend shows no error at first. Checking sanic_api.log, I can see that OpenAI returned a normal result (ChatCompletion(id='cmpl-782b8d6783034163b48abec915a07346', ...) containing the expected content). Then the following error is raised:

Exception ignored in: <generator object OpenAILLM._call at 0x7f881c11b610>
Traceback (most recent call last):
  File "/workspace/qanything_local/qanything_kernel/core/local_doc_qa.py", line 265, in get_knowledge_based_answer
    yield response, history
RuntimeError: generator ignored GeneratorExit
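The RuntimeError itself is easy to reproduce outside QAnything. Python raises "generator ignored GeneratorExit" when a generator is closed (for example, because the HTTP client disconnected mid-stream) but the generator catches GeneratorExit and yields again instead of returning. A minimal standalone sketch (illustrative only, not the actual QAnything code):

```python
def leaky_stream():
    """A streaming generator with the bug: it yields after GeneratorExit."""
    try:
        while True:
            yield "chunk"
    except GeneratorExit:
        # BUG: responding to close() with another yield instead of returning.
        yield "one more chunk"

gen = leaky_stream()
next(gen)                # generator is now suspended at a yield
try:
    gen.close()          # what the server does when the client goes away
except RuntimeError as e:
    print(e)             # -> generator ignored GeneratorExit
```

Because close() injects GeneratorExit at the suspended yield, any generator that swallows it and keeps yielding triggers exactly this RuntimeError at the caller's close() site.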

The program then calls the local_doc_chat endpoint again, re-invokes the OpenAI API (which again returns normally), and the RuntimeError: generator ignored GeneratorExit keeps appearing, still with no error shown on the frontend. These calls repeat until the following error occurs: Error code: 429 - {'error': {'message': 'Your account ************* request reached max concurrency: 1, please try again after 1 seconds', 'type': 'rate_limit_reached_error'}}

Does anyone know how to solve this?

期望行为 | Expected Behavior

No response

运行环境 | Environment

- OS:
- NVIDIA Driver:
- CUDA:
- docker:
- docker-compose:
- NVIDIA GPU:
- NVIDIA GPU Memory:

QAnything日志 | QAnything logs

Exception ignored in: <generator object OpenAILLM._call at 0x7f881c11aea0>
Traceback (most recent call last):
  File "/workspace/qanything_local/qanything_kernel/core/local_doc_qa.py", line 265, in get_knowledge_based_answer
    yield response, history
RuntimeError: generator ignored GeneratorExit

复现方法 | Steps To Reproduce

  1. Adapt llm_for_openai_api.py to the OpenAI API
  2. Start with Docker: bash ./run.sh -c cloud -i 0
  3. Ask a question from the frontend

备注 | Anything else?

No response

Jun2Hou commented 1 month ago

The cause turned out to be a timeout: the backend's calls to the OpenAI service exceeded the request limit, which caused a timeout; the frontend then timed out and re-sent the request, trapping the system in an infinite loop. In the Docker version, how many times does the Sanic backend normally call the OpenAI API for one frontend request to the local_qa_chat endpoint? Where can this logic be modified? Alternatively, how can the frontend retry count be changed?
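For reference, the usual way to make a streaming generator tolerate the client disconnecting is to let GeneratorExit terminate it: clean up in the except block and return (or re-raise), but never yield again. A hedged sketch with illustrative names only, not the actual get_knowledge_based_answer code:

```python
def answer_stream(chunks):
    """Stream answer chunks; shut down cleanly if the client disconnects."""
    try:
        for chunk in chunks:
            yield chunk
    except GeneratorExit:
        # Client went away: release any resources here and stop.
        # Returning (or re-raising) is fine; yielding again is not.
        return

gen = answer_stream(["a", "b", "c"])
print(next(gen))   # "a"
gen.close()        # closes cleanly, no RuntimeError
```

With this pattern the server-side generator exits quietly on disconnect; the repeated requests themselves, however, come from the frontend retry bug described above, so the retry loop (and the resulting 429) still needs the frontend fix.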

xixihahaliu commented 1 month ago

@Jun2Hou In theory, one question asked from the frontend triggers exactly one call to the local_qa_chat endpoint, and the backend calls OpenAI exactly once. However, the Docker frontend currently has a bug. The Python-version frontend has already fixed it; there is no schedule yet for the Docker fix, but we expect to release v2.0 next week, merging the Docker and Python versions, so you can try it again then.

Jun2Hou commented 1 month ago

@xixihahaliu Thanks for letting me know; I have switched to the Python version. Looking forward to v2.0!