Alpaca4610 / nonebot_plugin_chatpdf

A minimal ChatPDF implementation for nonebot that analyzes the content of uploaded articles

How do I avoid hitting the token limit after asking just a couple of questions? #1

Open gitxxp opened 1 year ago

gitxxp commented 1 year ago

This model's maximum context length is 4097 tokens. However, your messages resulted in 4279 tokens. Please reduce the length of the messages.

Also, nonebot commands already require a leading `/`, so with the preset command the trigger becomes a double slash: you actually have to type `//start`, which makes the README.md misleading. And could you add support for uploading PDF files?

Alpaca4610 commented 1 year ago

The model token limit is probably related to the article length; the free-tier API seems to impose a length cap. I'm currently looking into how to split the article to work around this. As for the upload feature: the core logic already supports analyzing an uploaded file, but I haven't found a way, within the nonebot framework, to pass an uploaded file through to the business-logic code that handles it. My initial idea was to have users send a direct link to the file and let the bot download and analyze it. As for the command issue: nonebot lets you set a unified command-trigger prefix in the `.env` file. I hadn't set one, so during testing I just sent the command directly, but thanks for the suggestion; I'll update the README.
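For reference, nonebot2 reads its command-trigger prefix from the `COMMAND_START` setting in the `.env` file. A sketch of the two common setups (check the docs for your nonebot version, as the exact key format has changed between releases):

```ini
# .env -- commands trigger on a single "/" (nonebot2's default)
COMMAND_START=["/"]
# or allow commands with no prefix at all, alongside "/":
# COMMAND_START=["/", ""]
```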

gitxxp commented 1 year ago

The nonebot-plugin-backup plugin seems to be able to pass files through; maybe it's worth a look as a reference.

Alpaca4610 commented 1 year ago

> The nonebot-plugin-backup plugin seems to be able to pass files through; maybe it's worth a look as a reference.

OK, I'll take a look.

Alpaca4610 commented 1 year ago

> The nonebot-plugin-backup plugin seems to be able to pass files through; maybe it's worth a look as a reference.

Plugin updated: added the ability to upload a txt file to the group files for analysis. PDF-to-txt conversion may be added in the future.

Alpaca4610 commented 1 year ago

PDF file analysis has been added; analyzing via PDF effectively avoids hitting the token limit.

gitxxp commented 1 year ago

> PDF file analysis has been added; analyzing via PDF effectively avoids hitting the token limit.

Just tried it: after uploading a 900 KB PDF, it still errors out

reading pdf finished
page: 0, part: 0
page: 1, part: 0
page: 2, part: 0
03-12 23:44:10 [INFO] nonebot | Matcher(type='notice', module=nonebot_plugin_chatpdf) running complete
03-12 23:44:10 [ERROR] nonebot | Running Matcher(type='notice', module=nonebot_plugin_chatpdf) failed.
Traceback (most recent call last):
  File "<string>", line 17, in <module>
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot/__init__.py", line 273, in run
    get_driver().run(*args, **kwargs)
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot/drivers/fastapi.py", line 187, in run
    uvicorn.run(
  File "/mybot/.venv/lib/python3.11/site-packages/uvicorn/main.py", line 569, in run
    server.run()
  File "/mybot/.venv/lib/python3.11/site-packages/uvicorn/server.py", line 60, in run
    return asyncio.run(self.serve(sockets=sockets))
  File "/usr/local/python11/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
  File "/usr/local/python11/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot/message.py", line 141, in _check_matcher
    await _run_matcher(Matcher, bot, event, state, stack, dependency_cache)
> File "/mybot/.venv/lib/python3.11/site-packages/nonebot/message.py", line 187, in _run_matcher
    await matcher.run(bot, event, state, stack, dependency_cache)
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot/internal/matcher/matcher.py", line 732, in run
    await self.simple_run(bot, event, state, stack, dependency_cache)
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot/internal/matcher/matcher.py", line 707, in simple_run
    await handler(
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot/dependencies/__init__.py", line 108, in __call__
    return await cast(Callable[..., Awaitable[R]], self.call)(**values)
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot_plugin_chatpdf/__init__.py", line 112, in _
    summary = session.read_pdf_and_summarize(f)
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot_plugin_chatpdf/gpt_reader/pdf_reader.py", line 111, in read_pdf_and_summarize
    summary = self.summarize(paper)
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot_plugin_chatpdf/gpt_reader/pdf_reader.py", line 89, in summarize
    summary = self._chat('now I send you page {}, part {}:{}'.format(page_idx, part_idx, text))
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot_plugin_chatpdf/gpt_reader/pdf_reader.py", line 67, in _chat
    response = self.send_msg(self.messages)
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot_plugin_chatpdf/gpt_reader/pdf_reader.py", line 59, in send_msg
    return self.model.send_msg(msg)
  File "/mybot/.venv/lib/python3.11/site-packages/nonebot_plugin_chatpdf/gpt_reader/model_interface.py", line 23, in send_msg
    response = openai.ChatCompletion.create(
  File "/mybot/.venv/lib/python3.11/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/mybot/.venv/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/mybot/.venv/lib/python3.11/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/mybot/.venv/lib/python3.11/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/mybot/.venv/lib/python3.11/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4319 tokens. Please reduce the length of the messages.
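
The traceback shows the failure happens inside `_chat`, which appends each page to `self.messages` before calling the API, so the conversation history itself can grow past the 4097-token window even when each individual page chunk is small. A minimal sketch of history trimming under that diagnosis (hypothetical function names, not the plugin's actual code; the 4-chars-per-token estimate is a rough heuristic for English and undercounts Chinese badly):

```python
# Hypothetical sketch: keep the system prompt and drop the oldest
# exchanges once the running history exceeds a token budget.

def approx_tokens(messages: list[dict]) -> int:
    """Very rough estimate: ~4 characters per token, plus per-message overhead."""
    return sum(len(m["content"]) // 4 + 4 for m in messages)

def trim_history(messages: list[dict], budget: int = 3000) -> list[dict]:
    """Drop the oldest non-system messages until the history fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and approx_tokens(system + rest) > budget:
        rest.pop(0)  # discard the oldest exchange first
    return system + rest
```

Calling `trim_history` on `self.messages` before each API request would keep requests under the window at the cost of the model forgetting earlier pages; a summarize-then-discard scheme would preserve more context.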
Alpaca4610 commented 1 year ago

> PDF file analysis has been added; analyzing via PDF effectively avoids hitting the token limit.

> Just tried it: after uploading a 900 KB PDF, it still errors out


I've noticed this problem too. Handling capacity differs between Chinese and English PDFs: English PDFs of a dozen or more pages can be analyzed normally, while Chinese PDFs can exceed the limit even at single-digit page counts. I'm currently investigating how the code should be modified.
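The Chinese/English gap comes down to tokenization: a CJK character typically costs 1-2 tokens, while English averages roughly 4 characters per token, so a Chinese page carries several times the token load of an English page of the same length. A sketch of token-budgeted chunking that accounts for this (heuristic estimator and function names are my own, not the plugin's code; a real tokenizer such as tiktoken would be more accurate):

```python
import re

def estimate_tokens(text: str) -> int:
    """Crude token estimate: CJK chars count as ~1.5 tokens each,
    everything else as ~0.25 tokens per character."""
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    other = len(text) - cjk
    return int(cjk * 1.5 + other * 0.25) + 1

def chunk_text(text: str, max_tokens: int = 1500) -> list[str]:
    """Split text into pieces that each fit the token budget,
    breaking on sentence-ish boundaries (CJK or Latin punctuation)."""
    sentences = re.split(r"(?<=[。!?.!?])", text)
    chunks, current = [], ""
    for sent in sentences:
        if current and estimate_tokens(current + sent) > max_tokens:
            chunks.append(current)
            current = sent
        else:
            current += sent
    if current:
        chunks.append(current)
    return chunks
```

Because the budget is measured in estimated tokens rather than characters, a Chinese PDF simply produces more, smaller chunks instead of overflowing the request.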

Alpaca4610 commented 1 year ago

v0.4: replaced the core algorithm; longer PDFs are now supported for analysis.