kaixindelele / ChatPaper

Use ChatGPT to summarize arXiv papers. Accelerate the whole research workflow: use ChatGPT for full-paper summarization + professional translation + polishing + reviewing + review responses
https://chatwithpaper.org
18.42k stars · 1.93k forks

The OpenAI API limits the number of tokens in a request #157

Closed EachSheep closed 1 year ago

EachSheep commented 1 year ago

Suspected cause: the OpenAI API appears to limit the number of tokens in a request.

I get the following error when fetching the latest arXiv papers:

python chat_paper.py --query "all: xx xx 2023" --filter_keys "xx xx" --max_results 100
Traceback (most recent call last):                                                                                                                            
  File "/home/xx/sbin/ChatPaper/chat_paper.py", line 531, in <module>                                                                                
    chat_paper_main(args=paper_args)                                                                                                                          
  File "/home/xx/sbin/ChatPaper/chat_paper.py", line 505, in chat_paper_main                                                                         
    reader1.summary_with_chat(paper_list=paper_list)                                                                                                          
  File "/home/xx/sbin/ChatPaper/chat_paper.py", line 247, in summary_with_chat
    chat_method_text = self.chat_method(text=text, method_prompt_token=method_prompt_token)
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 325, in iter
    raise retry_exc.reraise()
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 158, in reraise
    raise self.last_attempt.result()
  File "/opt/anaconda/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/opt/anaconda/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/home/xx/sbin/ChatPaper/chat_paper.py", line 378, in chat_method
    response = openai.ChatCompletion.create(
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_requestor.py", line 679, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 7726 tokens. Please reduce the length of the messages

Is there a way to work around this?
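One way to avoid the error above is to trim the paper text before sending it, so that prompt plus text stays under the model's context window. The sketch below is not part of ChatPaper; the function and constant names are hypothetical, and the 4-characters-per-token ratio is only a rough heuristic for English text (an exact count would use the tiktoken library):

```python
# Hypothetical sketch: truncate paper text to stay under the context window.
# MAX_CONTEXT_TOKENS comes from the error message above; the 4 chars/token
# ratio is a rough approximation, not an exact tokenizer count.

MAX_CONTEXT_TOKENS = 4097
RESERVED_FOR_REPLY = 1000  # leave room for the model's answer

def truncate_to_budget(text: str, prompt_tokens: int) -> str:
    """Trim `text` so prompt + text approximately fits the context window."""
    budget_tokens = MAX_CONTEXT_TOKENS - RESERVED_FOR_REPLY - prompt_tokens
    budget_chars = budget_tokens * 4  # ~4 characters per token for English
    return text[:budget_chars]
```

Truncation loses the tail of the paper, so it trades completeness for never hitting the hard limit; skipping the paper entirely (discussed below in this thread) is the other option.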

kaixindelele commented 1 year ago

--max_results 100 is generally too much; if one document fails midway, the whole run is lost. For the token-length issue, there is actually a try/except approach, but I don't want to add it for now: it's fiddly and would introduce new bugs, so I've held off. When this happens, I'd suggest just switching to another paper. The current script can handle roughly 90% of documents. The hardest part is that PDF formatting is so messy that fixing problem A can create problem B.

EachSheep commented 1 year ago

> --max_results 100 is generally too much; if one document fails midway, the whole run is lost. For the token-length issue, there is actually a try/except approach, but I don't want to add it for now: it's fiddly and would introduce new bugs, so I've held off. When this happens, I'd suggest just switching to another paper. The current script can handle roughly 90% of documents. The hardest part is that PDF formatting is so messy that fixing problem A can create problem B.

Thanks! I actually saw answers to related questions in the issue tracker; it just feels odd that a batch arXiv search breaks partway through and cannot resume.

Since adding too many try blocks can introduce bugs, how about adding just one try and simply skipping that paper? When browsing arXiv papers, missing a few is usually not a big deal.

EachSheep commented 1 year ago

If you're willing, I can open a PR, but the change is probably only a few lines, so a PR may not be necessary. Would you consider it?

kaixindelele commented 1 year ago

That makes sense. With batch retrieval, it's painful when the run hits an error and stalls, and resuming is also problematic. I'll add a try after all, probably within a day or two.

EachSheep commented 1 year ago

Thanks!
