kaixindelele / ChatPaper

Use ChatGPT to summarize arXiv papers. Accelerate the whole research workflow: use ChatGPT for full-paper summarization + professional translation + polishing + reviewing + review responses
https://chatwithpaper.org
18.42k stars · 1.93k forks

The OpenAI API limits the number of tokens in a request #157

Closed EachSheep closed 1 year ago

EachSheep commented 1 year ago

Suspected cause: the OpenAI API appears to limit the number of tokens in a request.

I get the following error when fetching the latest arXiv papers:

python chat_paper.py --query "all: xx xx 2023" --filter_keys "xx xx" --max_results 100
Traceback (most recent call last):                                                                                                                            
  File "/home/xx/sbin/ChatPaper/chat_paper.py", line 531, in <module>                                                                                
    chat_paper_main(args=paper_args)                                                                                                                          
  File "/home/xx/sbin/ChatPaper/chat_paper.py", line 505, in chat_paper_main                                                                         
    reader1.summary_with_chat(paper_list=paper_list)                                                                                                          
  File "/home/xx/sbin/ChatPaper/chat_paper.py", line 247, in summary_with_chat
    chat_method_text = self.chat_method(text=text, method_prompt_token=method_prompt_token)
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 325, in iter
    raise retry_exc.reraise()
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 158, in reraise
    raise self.last_attempt.result()
  File "/opt/anaconda/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/opt/anaconda/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/opt/anaconda/lib/python3.9/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/home/xx/sbin/ChatPaper/chat_paper.py", line 378, in chat_method
    response = openai.ChatCompletion.create(
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_resources/chat_completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_requestor.py", line 226, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_requestor.py", line 619, in _interpret_response
    self._interpret_response_line(
  File "/opt/anaconda/lib/python3.9/site-packages/openai/api_requestor.py", line 679, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 7726 tokens. Please reduce the length of the messages

Is there a way to work around this?
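One way to avoid the error above is to trim the paper text before sending it, so that prompt plus text stays under the model's context window. The sketch below is not part of ChatPaper; the function and constant names are hypothetical, and the 4-characters-per-token ratio is only a rough heuristic for English text (an exact count would use the tiktoken library):

```python
# Hypothetical sketch: truncate paper text to stay under the context window.
# MAX_CONTEXT_TOKENS comes from the error message above; the 4 chars/token
# ratio is a rough approximation, not an exact tokenizer count.

MAX_CONTEXT_TOKENS = 4097
RESERVED_FOR_REPLY = 1000  # leave room for the model's answer

def truncate_to_budget(text: str, prompt_tokens: int) -> str:
    """Trim `text` so prompt + text approximately fits the context window."""
    budget_tokens = MAX_CONTEXT_TOKENS - RESERVED_FOR_REPLY - prompt_tokens
    budget_chars = budget_tokens * 4  # ~4 characters per token for English
    return text[:budget_chars]
```

Truncation loses the tail of the paper, so it trades completeness for never hitting the hard limit; skipping the paper entirely (discussed below in this thread) is the other option.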

kaixindelele commented 1 year ago

--max_results 100 is generally too much; if one document fails midway, the whole run is lost. For the token-length issue, there is actually a try/except approach, but I don't want to add it for now: it's fiddly and would introduce new bugs, so I've held off. When this happens, I'd suggest just switching to another paper. The current script can handle roughly 90% of documents. The hardest part is that PDF formatting is so messy that fixing problem A can create problem B.

EachSheep commented 1 year ago

> --max_results 100 is generally too much; if one document fails midway, the whole run is lost. For the token-length issue, there is actually a try/except approach, but I don't want to add it for now: it's fiddly and would introduce new bugs, so I've held off. When this happens, I'd suggest just switching to another paper. The current script can handle roughly 90% of documents. The hardest part is that PDF formatting is so messy that fixing problem A can create problem B.

Thanks! I actually saw answers to related questions in the issue tracker; it just feels odd that a batch arXiv search breaks partway through and cannot resume.

Since adding too many try blocks can introduce bugs, how about adding just one try and simply skipping that paper? When browsing arXiv papers, missing a few is usually not a big deal.

EachSheep commented 1 year ago

If you're willing, I can open a PR, but the change is probably only a few lines, so a PR may not be necessary. Would you consider it?

kaixindelele commented 1 year ago

That makes sense. With batch retrieval, it's painful when the run hits an error and stalls, and resuming is also problematic. I'll add a try after all, probably within a day or two.

EachSheep commented 1 year ago

Thanks!
