BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: `openai==1.47` breaks CI #5854

Open jamesbraza opened 1 month ago

jamesbraza commented 1 month ago

What happened?

Something about openai==1.47.0 and 1.47.1 breaks paper-qa's CI. Our CI runs start blowing up with the below error.

Relevant log output

litellm.exceptions.APIError: litellm.APIError: APIError: OpenAIException - Connection error.

Twitter / LinkedIn details

No response

CGH20171006 commented 1 month ago

In my case, this problem was not solved by changing the openai version: Windows 10 + Python 3.11 + openai 1.45.0

settings = Settings(
    llm="gpt-4o-mini",
    summary_llm="gpt-4o-mini",
    paper_directory="D:\\Programing\\paper-qa",
    verbosity=3
)

query = "What manufacturing challenges are unique to bispecific antibodies?"

try:
    response = ask(query, settings)
    print(response)
except Exception as e:
    print(f"An error occurred: {e}")

error message

PaperQA version: 5.0.7
[10:29:07] Beginning agent 'fake' run with question 'What manufacturing challenges are unique to bispecific antibodies?' and full query
           {'query': 'What manufacturing challenges are unique to bispecific antibodies?', 'id':
           UUID('562dea78-4538-496d-a77b-71a10725892a'), 'settings_template': None, 'settings': {'llm': 'gpt-4o-mini', 'llm_config': None,
           'summary_llm': 'gpt-4o-mini', 'summary_llm_config': None, 'embedding': 'text-embedding-3-small', 'embedding_config': None,
           'temperature': 0.0, 'batch_size': 1, 'texts_index_mmr_lambda': 1.0, 'index_absolute_directory': False, 'index_directory':
           WindowsPath('C:/Users/20171006/.pqa/indexes'), 'index_recursively': True, 'verbosity': 3, 'manifest_file': None,
           'paper_directory': 'D:\\Programing\\paper-qa', 'answer': {'evidence_k': 10, 'evidence_detailed_citations': True,
           'evidence_retrieval': True, 'evidence_summary_length': 'about 100 words', 'evidence_skip_summary': False, 'answer_max_sources':
           5, 'answer_length': 'about 200 words, but can be longer', 'max_concurrent_requests': 4, 'answer_filter_extra_background': False},
           'parsing': {'chunk_size': 3000, 'use_doc_details': True, 'overlap': 100, 'citation_prompt': 'Provide the citation for the
           following text in MLA Format. Do not write an introductory sentence. If reporting date accessed, the current year is
           2024\n\n{text}\n\nCitation:', 'structured_citation_prompt': "Extract the title, authors, and doi as a JSON from this MLA
           citation. If any field can not be found, return it as null. Use title, authors, and doi as keys, author's value should be a list
           of authors. {citation}\n\nCitation JSON:", 'disable_doc_valid_check': False, 'chunking_algorithm':
           <ChunkingOptions.SIMPLE_OVERLAP: 'simple_overlap'>}, 'prompts': {'summary': 'Summarize the excerpt below to help answer a
           question.\n\nExcerpt from {citation}\n\n----\n\n{text}\n\n----\n\nQuestion: {question}\n\nDo not directly answer the question,
           instead summarize to give evidence to help answer the question. Stay detailed; report specific numbers, equations, or direct
           quotes (marked with quotation marks). Reply "Not applicable" if the excerpt is irrelevant. At the end of your response, provide
           an integer score from 1-10 on a newline indicating relevance to question. Do not explain your score.\n\nRelevant Information
           Summary ({summary_length}):', 'qa': 'Answer the question below with the context.\n\nContext (with relevance
           scores):\n\n{context}\n\n----\n\nQuestion: {question}\n\nWrite an answer based on the context. If the context provides
           insufficient information reply "I cannot answer."For each part of your answer, indicate which sources most support it via
           citation keys at the end of sentences, like {example_citation}. Only cite from the context below and only use the valid keys.
           Write in the style of a Wikipedia article, with concise sentences and coherent paragraphs. The context comes from a variety of
           sources and is only a summary, so there may inaccuracies or ambiguities. If quotes are present and relevant, use them in the
           answer. This answer will go directly onto Wikipedia, so do not add any extraneous information.\n\nAnswer ({answer_length}):',
           'select': 'Select papers that may help answer the question below. Papers are listed as $KEY: $PAPER_INFO. Return a list of keys,
           separated by commas. Return "None", if no papers are applicable. Choose papers that are relevant, from reputable sources, and
           timely (if the question requires timely information).\n\nQuestion: {question}\n\nPapers: {papers}\n\nSelected keys:', 'pre':
           None, 'post': None, 'system': 'Answer in a direct and concise tone. Your audience is an expert, so be highly specific. If there
           are ambiguous terms or acronyms, first define them.', 'use_json': False, 'summary_json': 'Excerpt from
           {citation}\n\n----\n\n{text}\n\n----\n\nQuestion: {question}\n\n', 'summary_json_system': 'Provide a summary of the relevant
           information that could help answer the question based on the excerpt. Respond with the following JSON format:\n\n{{\n  "summary":
           "...",\n  "relevance_score": "..."\n}}\n\nwhere `summary` is relevant information from text - {summary_length} words and
           `relevance_score` is the relevance of `summary` to answer question (out of 10).\n'}, 'agent': {'agent_llm': 'gpt-4o-2024-08-06',
           'agent_llm_config': None, 'agent_type': 'fake', 'agent_config': None, 'agent_system_prompt': 'You are a helpful AI assistant.',
           'agent_prompt': 'Use the tools to answer the question: {question}\n\nThe {gen_answer_tool_name} tool output is visible to the
           user, so you do not need to restate the answer and can simply terminate if the answer looks sufficient. The current status of
           evidence/papers/cost is {status}', 'return_paper_metadata': False, 'search_count': 8, 'wipe_context_on_answer_failure': True,
           'timeout': 500.0, 'should_pre_search': False, 'tool_names': None, 'index_concurrency': 30}, 'md5':
           'd9b11506128c475035509ae3cfc1addb'}, 'docs_name': None}.

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True'.

An error occurred: litellm.APIError: APIError: OpenAIException - Connection error.
Received Model Group=gpt-4o-mini
Available Model Group Fallbacks=None LiteLLM Retried: 2 times, LiteLLM Max Retries: 3
(paperqa) PS D:\Programing\paper-qa>

ishaan-jaff commented 1 month ago

@jamesbraza @CGH20171006 is this just on CI/CD ? Can you give us more steps to repro this

jamesbraza commented 1 month ago

I found pinning openai<1.47 fixed the issue for me reliably in https://github.com/Future-House/paper-qa/pull/466. You can try playing around with our unit tests there @ishaan-jaff, and thanks for investigating this 👍

Note you will want to delete our test cassettes (at tests/cassettes) because they cache responses and mask the issue.

CGH20171006 commented 1 month ago

@jamesbraza @CGH20171006 is this just on CI/CD ? Can you give us more steps to repro this

I run this with Python; I can give you all the code except the API key:

import os
from paperqa import Docs,ask
from paperqa.settings import Settings, AgentSettings, AnswerSettings
os.environ["OPENAI_API_KEY"] = "sk-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
os.environ["OPENAI_API_BASE"] = "https://api.fast-tunnel.one/v1"

settings = Settings(
    llm="gpt-4o-mini",
    summary_llm="gpt-4o-mini",
    paper_directory="D:\\Programing\\paper-qa",
    verbosity=3
)

query = "What manufacturing challenges are unique to bispecific antibodies?"

try:
    response = ask(query, settings)
    print(response)
except Exception as e:
    print(f"An error occurred: {e}")
JamesHutchison commented 1 month ago

Was seeing this issue and pinning to 1.45.0 fixed it.

Whatever the cause, it happens almost immediately after a request is made, ignoring retries

I don't think this is CI related, was seeing this in a Codespace. The issue is either random / intermittent or happens on a cadence such as every other request. When I was debugging on an integration test using the pytest hot reloading daemon, I'd see this error, then see a different error further along, then see this error, then see a different error...

There is no status code

ishaan-jaff commented 1 month ago

I don't think this is CI related, was seeing this in a Codespace.

@JamesHutchison can you help me repro

ishaan-jaff commented 1 month ago

I see nothing about this on OpenAI Python Releases https://github.com/openai/openai-python/releases or issues: https://github.com/openai/openai-python/issues

jamesbraza commented 1 month ago

In paper-qa, we don't depend on openai directly, just litellm, so it's some internal interaction between litellm and the openai package. This error started appearing like a switch when upgrading our indirect dependency from openai==1.46.1 to openai==1.47.0.

Also, I observe this error type only when multiprocessing via pytest-xdist. https://github.com/BerriAI/litellm/issues/4032 is another open issue about LiteLLM + pytest-xdist multiprocessing + Anthropic failures.

I think the core of this issue is that something about LiteLLM's design is vulnerable to multiprocessing failures, and it seems something about the latest openai package's minor version exposes it in a new way.

whitead commented 1 month ago

I was able to get a minimal repro for this issue:

openai==1.48.0
litellm==1.48.1

import pytest
import litellm

@pytest.mark.asyncio
async def test_llm_completion1():
    await litellm.acompletion(
        model="gpt-4o-mini",
        temperature=0.5,
        messages=[
            {
                "content": "Here is a question, the correct answer to the question, and a proposed answer to the question. Please tell me if the proposed answer is correct, given the correct answer. ONLY SAY 'YES' OR 'NO'. No other output is permitted.\n\nQuestion: What is 25 * 10? \n\nCorrect answer: 250 \n\nProposed answer: 250",
                "role": "user",
            }
        ],
    )

@pytest.mark.asyncio
async def test_llm_completion2():
    await litellm.acompletion(
        model="gpt-4o-mini",
        temperature=0.5,
        messages=[
            {
                "content": "Here is a question, the correct answer to the question, and a proposed answer to the question. Please tell me if the proposed answer is correct, given the correct answer. ONLY SAY 'YES' OR 'NO'. No other output is permitted.\n\nQuestion: What is 25 * 10? \n\nCorrect answer: 250 \n\nProposed answer: 250",
                "role": "user",
            }
        ],
    )

and run with pytest

output (truncated)

FAILED tests/test_foo.py::test_llm_completion2 - litellm.exceptions.APIError: litellm.APIError: APIError: OpenAIException - Connection error.

krrishdholakia commented 1 month ago

Hey everyone, thanks for the work so far. Will investigate on our end as well.

krrishdholakia commented 1 month ago

Hmm, testing with openai version 1.51.0 and the latest litellm, I don't see the error.

krrishdholakia commented 1 month ago

I believe this might've been some transient openai-sdk issue? I'm testing on litellm latest with openai v1.51.0 in a colab environment and not seeing this either

(screenshot: successful call in a Colab notebook, 2024-10-02)

jamesbraza commented 1 month ago

With litellm==1.48.9 and openai==1.51.0, this still definitely persists and kills all our tests:

screenshot of many errors

Perhaps the minimal repro needs updating

JamesHutchison commented 1 month ago

Do more than one completion

krrishdholakia commented 1 month ago

I ran 3 calls in a Colab VM and I don't see this:

(screenshot: three successful calls in a Colab VM, 2024-10-02)

can someone share a colab / some notebook which repros the error?

whitead commented 1 month ago

Hi @krrishdholakia

I converted my repro into a colab

https://colab.research.google.com/drive/1arR2JTZxQioVNCgLZ-dCDKcqep-Gl8xE?usp=sharing

jamesbraza commented 1 month ago

I can confirm Andrew's Google Colab does indeed repro the issue. @krrishdholakia I also got burned by this again today, do you think you can investigate this?

krrishdholakia commented 1 month ago
        log.debug("Raising timeout error")

raise APITimeoutError(request=request) from err except Exception as err: log.debug("Encountered Exception", exc_info=True)

        if retries_taken > 0:
            return await self._retry_request(
                input_options,
                cast_to,
                retries_taken=retries_taken,
                stream=stream,
                stream_cls=stream_cls,
                response_headers=None,
            )

        log.debug("Raising connection error")
      raise APIConnectionError(request=request) from err

E openai.APIConnectionError: Connection error.

/usr/local/lib/python3.10/dist-packages/openai/_base_client.py:1598: APIConnectionError

This seems to come from openai.

krrishdholakia commented 1 month ago

self = <_UnixSelectorEventLoop running=False closed=True debug=False>

    def _check_closed(self):
        if self._closed:
>           raise RuntimeError('Event loop is closed')
E           RuntimeError: Event loop is closed

/usr/lib/python3.10/asyncio/base_events.py:515: RuntimeError

Is this somehow unique to being run in a notebook environment?

It looks like something is closing the event loop prematurely

jamesbraza commented 1 month ago

E RuntimeError: Event loop is closed

/usr/lib/python3.10/asyncio/base_events.py:515: RuntimeError

... It looks like something is closing the event loop prematurely

I think I realized why this is happening. pytest-asyncio's default loop_scope behavior is to have one event loop per test case. What I believe is happening is:

  1. litellm has an asyncio.get_event_loop() or an asyncio.run() somewhere that attaches to the prior test's event loop
  2. When the second test runs, litellm hits a RuntimeError: Event loop is closed because the prior test has finished and closed its loop.

So in short, it's a race condition caused by litellm getting a stale (or soon-to-be-stale) event loop. Does that make sense?
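
To make that concrete, here is a tiny stand-alone sketch of the failure mode (purely illustrative; the names are made up and this is not litellm's actual code): library-level state remembers the loop from the first call, and the second call trips over it once that loop has been closed — exactly the `_check_closed` failure in the traceback above.

import asyncio

_remembered_loop = None  # stand-in for library state captured on first use

async def fake_acompletion():
    global _remembered_loop
    if _remembered_loop is None:
        # stand-in for an asyncio.get_event_loop() captured during "test 1"
        _remembered_loop = asyncio.get_event_loop()
    # any later scheduling against the remembered loop fails once it is closed
    _remembered_loop.call_soon(lambda: None)

asyncio.run(fake_acompletion())  # "test 1": captures loop A; asyncio.run closes A afterwards
asyncio.run(fake_acompletion())  # "test 2": raises RuntimeError: Event loop is closed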

It's most likely the same error here: https://github.com/BerriAI/litellm/issues/4032

jamesbraza commented 1 month ago

@krrishdholakia I am thinking the asyncio.get_event_loop happening in the first line of litellm.acompletion is the issue: https://github.com/BerriAI/litellm/blob/v1.47.2/litellm/main.py#L333

I believe that to fix this, you'll need to change litellm.acompletion so it doesn't just run litellm.completion on the current event loop
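
One illustrative direction for such a fix (a sketch only, with hypothetical names, not litellm's actual internals) is to make any cached async client loop-aware, so a test running on a fresh loop never inherits a client bound to a closed one:

import asyncio
import httpx

# hypothetical per-loop cache; litellm's real cache is structured differently
_clients_by_loop: dict[int, httpx.AsyncClient] = {}

def get_async_httpx_client() -> httpx.AsyncClient:
    loop = asyncio.get_running_loop()
    client = _clients_by_loop.get(id(loop))
    if client is None or client.is_closed:
        # build a fresh client for this loop instead of reusing one bound to an old loop
        client = httpx.AsyncClient()
        _clients_by_loop[id(loop)] = client
    return client

(A real implementation would also need to close clients whose loops are gone; this just shows the keying idea.)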

Kakadus commented 3 weeks ago

At some point, we see this in our logs (using anthropic):

DEBUG    LiteLLM Router:router.py:2594 TracebackTraceback (most recent call last):
  File "/venv/lib/python3.12/site-packages/litellm/llms/anthropic/chat/handler.py", line 384, in acompletion_function
    response = await async_handler.post(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/litellm/llms/custom_httpx/http_handler.py", line 151, in post
    raise e
  File "/venv/lib/python3.12/site-packages/litellm/llms/custom_httpx/http_handler.py", line 112, in post
    response = await self.client.send(req, stream=stream)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/httpx/_client.py", line 1674, in send
    response = await self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/httpx/_client.py", line 1702, in _send_handling_auth
    response = await self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/httpx/_client.py", line 1739, in _send_handling_redirects
    response = await self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/httpx/_client.py", line 1776, in _send_single_request
    response = await transport.handle_async_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/httpx/_transports/default.py", line 377, in handle_async_request
    resp = await self._pool.handle_async_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
    raise exc from None
  File "/venv/lib/python3.12/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
    response = await connection.handle_async_request(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/httpcore/_async/connection.py", line 101, in handle_async_request
    return await self._connection.handle_async_request(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/httpcore/_async/http11.py", line 142, in handle_async_request
    await self._response_closed()
  File "/venv/lib/python3.12/site-packages/httpcore/_async/http11.py", line 257, in _response_closed
    await self.aclose()
  File "/venv/lib/python3.12/site-packages/httpcore/_async/http11.py", line 265, in aclose
    await self._network_stream.aclose()
  File "/venv/lib/python3.12/site-packages/httpcore/_backends/anyio.py", line 55, in aclose
    await self._stream.aclose()
  File "/venv/lib/python3.12/site-packages/anyio/streams/tls.py", line 202, in aclose
    await self.transport_stream.aclose()
  File "/venv/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 1258, in aclose
    self._transport.close()
  File "/usr/lib/python3.12/asyncio/selector_events.py", line 1210, in close
    super().close()
  File "/usr/lib/python3.12/asyncio/selector_events.py", line 875, in close
    self._loop.call_soon(self._call_connection_lost, None)
  File "/usr/lib/python3.12/asyncio/base_events.py", line 795, in call_soon
    self._check_closed()
  File "/usr/lib/python3.12/asyncio/base_events.py", line 541, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/venv/lib/python3.12/site-packages/litellm/main.py", line 470, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/venv/lib/python3.12/site-packages/litellm/llms/anthropic/chat/handler.py", line 401, in acompletion_function
    raise AnthropicError(
litellm.llms.anthropic.common_utils.AnthropicError: Event loop is closed

The error is that litellm caches the httpx clients, which themselves store a reference to the "current" event loop:

https://github.com/BerriAI/litellm/blob/6b9b4696861f168e8d16c681c6d745ecea9bfcda/litellm/llms/custom_httpx/http_handler.py#L432-L433

If the event loop is closed by pytest-asyncio, those httpx clients still use the old event loop:

A simple

@pytest.fixture(autouse=True)
def _clear_httpx_clients() -> None:
    litellm.in_memory_llm_clients_cache.clear()

lets our tests pass by clearing all cached httpx clients.

Would it be possible for litellm to ship this fixture via a pytest plugin by default?
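
For what it's worth, such a plugin would be small. A sketch of what it might look like (the module name and packaging are assumptions, not an existing litellm feature; the cache attribute is the one referenced above; shipping it as an installable plugin would additionally need a `pytest11` entry point in litellm's packaging):

# hypothetical litellm pytest plugin module, or simply a project-level conftest.py
import litellm
import pytest

@pytest.fixture(autouse=True)
def _clear_litellm_httpx_clients() -> None:
    # Drop cached httpx clients before each test so they are recreated on
    # that test's event loop rather than a previously closed one.
    litellm.in_memory_llm_clients_cache.clear()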

whitead commented 1 week ago

I think you can get the effect of @Kakadus's solution via the pyproject.toml file as follows:

[tool.pytest.ini_options]
asyncio_default_fixture_loop_scope = "session"