Future-House / paper-qa

High accuracy RAG for answering questions from scientific documents with citations
Apache License 2.0

LLM issue #456

Closed FahaxikiHoney closed 1 month ago

FahaxikiHoney commented 1 month ago

When I run the following code:

from paperqa import Settings, ask
import os

os.environ["OPENAI_API_KEY"] = "EMPTY"

answer = ask(
    "What manufacturing challenges are unique to bispecific antibodies?",
    settings=Settings(temperature=0.5, paper_directory="my_papers"),
)

    raise APIError(
litellm.exceptions.APIError: litellm.APIError: APIError: OpenAIException - Connection error.
Received Model Group=gpt-4o-2024-08-06
Available Model Group Fallbacks=None LiteLLM Retried: 2 times, LiteLLM Max Retries: 3

dosubot[bot] commented 1 month ago

I found a similar open issue that might be relevant to your problem: Incomplete documentation for local usage [1]. This issue was referenced by another issue: Missing question error when using liteLLM models [1].

To resolve the APIError connection error when using the ask function from the paperqa library with OpenAI's API, you should adjust the logging configuration to get more detailed information about the error. The configure_cli_logging function in the paperqa library allows you to set different verbosity levels for logging.

Here is how you can modify the verbosity level to get more detailed logs, which can help in diagnosing the connection error:

from paperqa.settings import Settings
from paperqa.agents import ask

# Create a Settings object with increased verbosity
settings = Settings(verbosity=3)  # Set verbosity to 3 for detailed debug logs

# Your query
query = "What manufacturing challenges are unique to bispecific antibodies?"

# Call the ask function with the query and settings
try:
    response = ask(query, settings)
    print(response)
except Exception as e:
    print(f"An error occurred: {e}")

By setting verbosity=3, you enable detailed debug logs, including every single LLM call, which can provide more insights into the connection error with OpenAI's API [2].

To continue talking to Dosu, mention @dosu.

CGH20171006 commented 1 month ago


PaperQA version: 5.0.7
[12:58:01] Beginning agent 'fake' run with question 'What manufacturing challenges are unique to bispecific antibodies?' and full query {'query': 'What manufacturing challenges are unique to bispecific
           antibodies?', 'id': UUID('24fbf16f-7efb-4ac5-90ba-b9fce5277dd2'), 'settings_template': None, 'settings': {'llm': 'gpt-4o-mini', 'llm_config': None, 'summary_llm': 'gpt-4o-mini', 'summary_llm_config':
           None, 'embedding': 'text-embedding-3-small', 'embedding_config': None, 'temperature': 0.0, 'batch_size': 1, 'texts_index_mmr_lambda': 1.0, 'index_absolute_directory': False, 'index_directory':
           WindowsPath('C:/Users/20171006/.pqa/indexes'), 'index_recursively': True, 'verbosity': 3, 'manifest_file': None, 'paper_directory': 'D:\\Programing\\paper-qa', 'answer': {'evidence_k': 10,
           'evidence_detailed_citations': True, 'evidence_retrieval': True, 'evidence_summary_length': 'about 100 words', 'evidence_skip_summary': False, 'answer_max_sources': 5, 'answer_length': 'about 200
           words, but can be longer', 'max_concurrent_requests': 4, 'answer_filter_extra_background': False}, 'parsing': {'chunk_size': 3000, 'use_doc_details': True, 'overlap': 100, 'citation_prompt': 'Provide
           the citation for the following text in MLA Format. Do not write an introductory sentence. If reporting date accessed, the current year is 2024\n\n{text}\n\nCitation:', 'structured_citation_prompt':
           "Extract the title, authors, and doi as a JSON from this MLA citation. If any field can not be found, return it as null. Use title, authors, and doi as keys, author's value should be a list of
           authors. {citation}\n\nCitation JSON:", 'disable_doc_valid_check': False, 'chunking_algorithm': <ChunkingOptions.SIMPLE_OVERLAP: 'simple_overlap'>}, 'prompts': {'summary': 'Summarize the excerpt
           below to help answer a question.\n\nExcerpt from {citation}\n\n----\n\n{text}\n\n----\n\nQuestion: {question}\n\nDo not directly answer the question, instead summarize to give evidence to help answer
           the question. Stay detailed; report specific numbers, equations, or direct quotes (marked with quotation marks). Reply "Not applicable" if the excerpt is irrelevant. At the end of your response,
           provide an integer score from 1-10 on a newline indicating relevance to question. Do not explain your score.\n\nRelevant Information Summary ({summary_length}):', 'qa': 'Answer the question below
           with the context.\n\nContext (with relevance scores):\n\n{context}\n\n----\n\nQuestion: {question}\n\nWrite an answer based on the context. If the context provides insufficient information reply "I
           cannot answer."For each part of your answer, indicate which sources most support it via citation keys at the end of sentences, like {example_citation}. Only cite from the context below and only use
           the valid keys. Write in the style of a Wikipedia article, with concise sentences and coherent paragraphs. The context comes from a variety of sources and is only a summary, so there may inaccuracies
           or ambiguities. If quotes are present and relevant, use them in the answer. This answer will go directly onto Wikipedia, so do not add any extraneous information.\n\nAnswer ({answer_length}):',
           'select': 'Select papers that may help answer the question below. Papers are listed as $KEY: $PAPER_INFO. Return a list of keys, separated by commas. Return "None", if no papers are applicable.
           Choose papers that are relevant, from reputable sources, and timely (if the question requires timely information).\n\nQuestion: {question}\n\nPapers: {papers}\n\nSelected keys:', 'pre': None, 'post':
           None, 'system': 'Answer in a direct and concise tone. Your audience is an expert, so be highly specific. If there are ambiguous terms or acronyms, first define them.', 'use_json': False,
           'summary_json': 'Excerpt from {citation}\n\n----\n\n{text}\n\n----\n\nQuestion: {question}\n\n', 'summary_json_system': 'Provide a summary of the relevant information that could help answer the
           question based on the excerpt. Respond with the following JSON format:\n\n{{\n  "summary": "...",\n  "relevance_score": "..."\n}}\n\nwhere `summary` is relevant information from text -
           {summary_length} words and `relevance_score` is the relevance of `summary` to answer question (out of 10).\n'}, 'agent': {'agent_llm': 'gpt-4o-2024-08-06', 'agent_llm_config': None, 'agent_type':
           'fake', 'agent_config': None, 'agent_system_prompt': 'You are a helpful AI assistant.', 'agent_prompt': 'Use the tools to answer the question: {question}\n\nThe {gen_answer_tool_name} tool output is
           visible to the user, so you do not need to restate the answer and can simply terminate if the answer looks sufficient. The current status of evidence/papers/cost is {status}',
           'return_paper_metadata': False, 'search_count': 8, 'wipe_context_on_answer_failure': True, 'timeout': 500.0, 'should_pre_search': False, 'tool_names': None, 'index_concurrency': 30}, 'md5':
           'd9b11506128c475035509ae3cfc1addb'}, 'docs_name': None}.

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.


An error occurred: litellm.APIError: APIError: OpenAIException - Connection error.
Received Model Group=gpt-4o-mini
Available Model Group Fallbacks=None LiteLLM Retried: 2 times, LiteLLM Max Retries: 3
FahaxikiHoney commented 1 month ago

I found a similar open issue that might be relevant to your problem: Incomplete documentation for local usage [1]. This issue was referenced by another issue: Missing question error when using liteLLM models [1]. To resolve the `APIError` connection error when using the `ask` function from the `paperqa` library with OpenAI's API, you should adjust the logging configuration to get more detailed information about the error. The `configure_cli_logging` function in `paperqa` allows you to set different verbosity levels for logging. Here is how you can modify the verbosity level to get more detailed logs, which can help in diagnosing the connection error:

from paperqa.settings import Settings
from paperqa.agents import ask

# Create a Settings object with increased verbosity
settings = Settings(verbosity=3)  # Set verbosity to 3 for detailed debug logs

# Your query
query = "What manufacturing challenges are unique to bispecific antibodies?"

# Call the ask function with the query and settings
try:
    response = ask(query, settings)
    print(response)
except Exception as e:
    print(f"An error occurred: {e}")

By setting verbosity=3, you enable detailed debug logs, including every single LLM call, which can provide more insight into the connection error with OpenAI's API [2]. To continue talking to Dosu, mention @dosu.


Hi, you must be Chinese, right? Could we discuss this in detail privately? Thank you very much!

jamesbraza commented 1 month ago

@FahaxikiHoney we are also impacted by this, it's traced to https://github.com/BerriAI/litellm/issues/5854.

Please `pip install "openai<1.47"` and you will get around this error.
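
Until the upstream litellm fix lands, a quick guard can confirm the installed openai package is below the problematic 1.47 series. A sketch using only the stdlib `importlib.metadata`; the helper name is hypothetical:

```python
from importlib.metadata import PackageNotFoundError, version


def openai_below_147() -> bool:
    """Hypothetical helper: True if the installed openai package predates 1.47."""
    try:
        major, minor = (int(p) for p in version("openai").split(".")[:2])
    except (PackageNotFoundError, ValueError):
        # Not installed, or an unparseable version string.
        return False
    return (major, minor) < (1, 47)
```

Remember to quote the specifier on the command line (`pip install "openai<1.47"`) so the shell does not treat `<` as output redirection.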

CGH20171006 commented 1 month ago


Hi, you must be Chinese, right? Could we discuss this in detail privately? Thank you very much!

Haha, yes, I'm Chinese. You can always reach me via email: cgh20171006@njtech.edu.cn

FahaxikiHoney commented 1 month ago

We are also impacted by this; it's traced to BerriAI/litellm#5854.

Please `pip install "openai<1.47"` and you will get around this error.

from paperqa import Settings, ask
import os

os.environ["OPENAI_API_KEY"] = "EMPTY"

local_llm_config = {
    "model_list": [
        {
            "model_name": "ollama/llama3",
            "litellm_params": {
                "model": "ollama/llama3",
                "api_base": "https://ap"
            }
        }
    ]
}

answer = ask(
    "What manufacturing challenges are unique to bispecific antibodies?",
    settings=Settings(
        llm="ollama/llama3",
        llm_config=local_llm_config,
        summary_llm="ollama/llama3",
        summary_llm_config=local_llm_config,
    ),
)

I want to change the GPT model to llama3; the code is as above, but an error occurred when running it. The error is as follows:

    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host localhost:11434 ssl:default [The remote computer refused the network connection.]

Received Model Group=ollama/llama3
Available Model Group Fallbacks=None LiteLLM Retried: 2 times, LiteLLM Max Retries: 3
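
For the error above: localhost:11434 is Ollama's default serving port, so a refused connection usually means no server is listening there (start one with `ollama serve`), or the `api_base` points at the wrong host. A sketch of a corrected config with a connectivity pre-check; the probe function name is hypothetical, and the config assumes Ollama is running locally on its default port:

```python
import socket


def ollama_reachable(host: str = "localhost", port: int = 11434) -> bool:
    """Hypothetical probe: is anything accepting TCP on Ollama's default port?"""
    try:
        with socket.create_connection((host, port), timeout=2):
            return True
    except OSError:
        return False


# api_base must point at the running Ollama server, not a truncated URL.
local_llm_config = {
    "model_list": [
        {
            "model_name": "ollama/llama3",
            "litellm_params": {
                "model": "ollama/llama3",
                "api_base": "http://localhost:11434",
            },
        }
    ]
}
```

Calling `ollama_reachable()` before `ask` turns the opaque `ClientConnectorError` into an actionable check: if it returns False, start the server or fix `api_base` first.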