microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

Python: Bug: Enabling Asynchronous Filter in Azure OpenAI results in “AttributeError: ‘NoneType’ object has no attribute ‘tool_calls’” #7250

Closed: yuichiromukaiyama closed this issue 1 week ago

yuichiromukaiyama commented 1 month ago

Describe the bug: Azure OpenAI has a content filter. The traditional Azure content filter buffers chunks for a certain period, which makes streaming feel sluggish compared to the OpenAI API. Recently, an Asynchronous Filter was released to enable smoother streaming. However, when this feature is enabled, the following error occurs.

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/app/__main__.py", line 58, in <module>
    asyncio.run(main())
  File "~/.pyenv/versions/3.11.7/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "~/.pyenv/versions/3.11.7/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.pyenv/versions/3.11.7/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/app/__main__.py", line 50, in main
    async for chunk in stream:
  File "/.venv/lib/python3.11/site-packages/semantic_kernel/connectors/ai/open_ai/services/open_ai_chat_completion_base.py", line 200, in get_streaming_chat_message_contents
    async for messages in self._send_chat_stream_request(settings):
  File "/.venv/lib/python3.11/site-packages/semantic_kernel/connectors/ai/open_ai/services/open_ai_chat_completion_base.py", line 287, in _send_chat_stream_request
    yield [
          ^
  File "/.venv/lib/python3.11/site-packages/semantic_kernel/connectors/ai/open_ai/services/open_ai_chat_completion_base.py", line 288, in <listcomp>
    self._create_streaming_chat_message_content(chunk, choice, chunk_metadata) for choice in chunk.choices
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/semantic_kernel/connectors/ai/open_ai/services/azure_chat_completion.py", line 157, in _create_streaming_chat_message_content
    content = super()._create_streaming_chat_message_content(chunk, choice, chunk_metadata)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/semantic_kernel/connectors/ai/open_ai/services/open_ai_chat_completion_base.py", line 325, in _create_streaming_chat_message_content
    items: list[Any] = self._get_tool_calls_from_chat_choice(choice)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.11/site-packages/semantic_kernel/connectors/ai/open_ai/services/open_ai_chat_completion_base.py", line 366, in _get_tool_calls_from_chat_choice
    if content.tool_calls is None:
       ^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'tool_calls'

To Reproduce: use the code below and enable the Asynchronous Filter on the Azure OpenAI side. Note that the error does not occur with api_version 2023-03-15-preview.

Environment:

- semantic-kernel version: 1.1.2
- model: Azure OpenAI japan-east, gpt-35-turbo-16k, version 0613
- content filter: Asynchronous Filter enabled
- api_version: 2024-05-01-preview, 2024-02-01, 2023-12-01-preview, 2023-09-01-preview, 2023-08-01-preview, 2023-07-01-preview

import asyncio
from semantic_kernel.contents.chat_history import ChatHistory
from typing import cast
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion, AzureChatPromptExecutionSettings
from semantic_kernel.connectors.ai.prompt_execution_settings import PromptExecutionSettings

async def main():

    service_id = "default"
    service = AzureChatCompletion(
        service_id=service_id,
        deployment_name=AZURE_OPENAI_DEPLOYMENT_NAME,
        endpoint=AZURE_OPENAI_ENDPOINT,
        api_key=AZURE_OPENAI_API_KEY,
        api_version="…",
    )

    # By the way, is there a better way to express this?
    settings = cast(
        AzureChatPromptExecutionSettings,
        cast(
            type[PromptExecutionSettings],
            service.get_prompt_execution_settings_class(),
        )(service_id=service_id),
    )

    history = ChatHistory()
    history.add_user_message("hello")
    stream = service.get_streaming_chat_message_contents(history, settings)

    async for chunk in stream:
        if chunk[0] is not None:
            print(chunk[0], end="")

asyncio.run(main())
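
As a temporary workaround (an untested sketch of my own), the loop can tolerate the failure at the end of the stream. Since the annotation-only chunk arrives after finish_reason="stop", all content has already been received when the error is raised:

try:
    async for chunk in stream:
        if chunk[0] is not None:
            print(chunk[0], end="")
except AttributeError:
    # Raised while semantic-kernel parses the trailing annotation-only
    # chunk produced by the Asynchronous Filter; safe to ignore here.
    pass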

Expected behavior: the stream runs to completion without this error occurring.

Screenshots: none

Platform

Additional context

Comparing the streams returned by Azure OpenAI with the Asynchronous Filter off and on, the final chunks differ as follows.

Asynchronous Filter = OFF

data: {"choices":[],"created":0,"id":"","model":"","object":"","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}]}

data: {"choices":[{"content_filter_results":{},"delta":{"content":"","role":"assistant"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":"Hello"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":"!"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":" How"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":" can"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":" I"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":" assist"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":" you"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":" today"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"delta":{"content":"?"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_results":{},"delta":{},"finish_reason":"stop","index":0,"logprobs":null}],"created":1720865367,"id":"chatcmpl-9kTyxSScdtj40efg4H99eb3C6jn7d","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: [DONE]

Asynchronous Filter = ON

data: {"choices":[],"created":0,"id":"","model":"","object":"","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}]}

data: {"choices":[{"delta":{"content":"","role":"assistant"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{"content":"!"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{"content":" How"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{"content":" can"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{"content":" I"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{"content":" assist"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{"content":" you"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{"content":" today"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{"content":"?"},"finish_reason":null,"index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"delta":{},"finish_reason":"stop","index":0,"logprobs":null}],"created":1720865526,"id":"chatcmpl-9kU1WV3PzywzxAmGJgSqybB68g1fs","model":"gpt-35-turbo-16k","object":"chat.completion.chunk","system_fingerprint":null}

data: {"choices":[{"content_filter_offsets":{"check_offset":35,"start_offset":35,"end_offset":69},"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"finish_reason":null,"index":0}],"created":0,"id":"","model":"","object":""}

data: [DONE]

As the dumps show, the Asynchronous Filter stream ends with a chunk that carries only content_filter_offsets and content_filter_results and has no delta, so choice.delta is None. Would it be possible to handle the case where None is passed as an argument here?

semantic_kernel/connectors/ai/open_ai/services/open_ai_chat_completion_base.py
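
For example, a guard along these lines (a sketch approximating the existing method; exact field handling may differ):

from openai.types.chat.chat_completion import Choice
from semantic_kernel.contents.function_call_content import FunctionCallContent

def _get_tool_calls_from_chat_choice(self, choice):
    # For streaming chunks, the content is choice.delta; with the
    # Asynchronous Filter enabled, the trailing annotation-only chunk
    # has no delta, so content is None here.
    content = choice.message if isinstance(choice, Choice) else choice.delta
    if content is None or content.tool_calls is None:
        return []
    return [
        FunctionCallContent(
            id=tool.id,
            index=getattr(tool, "index", None),
            name=tool.function.name,
            arguments=tool.function.arguments,
        )
        for tool in content.tool_calls
    ]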

yuichiromukaiyama commented 1 month ago

Apologies for mixing multiple topics into one issue. Additionally, is there documentation on which Azure OpenAI API versions are supported by Semantic Kernel?

moonbox3 commented 1 month ago

Hi @yuichiromukaiyama, I see a few things here.

Firstly, I noticed a comment in your code around getting the execution settings. Instead of having to perform two casts, you can do something like:

kernel.get_service(service_id).get_prompt_execution_settings_class()(service_id=service_id)

or

req_settings = kernel.get_prompt_execution_settings_from_service_id(service_id=service_id)

You'll get the proper type back for the specified service_id, and you can add a type hint like:

req_settings: AzureChatPromptExecutionSettings = kernel.get_prompt_execution_settings_from_service_id(...)

Or add an assert afterwards to verify that it is of the expected type; see the sketch below.
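
Putting that together, a minimal sketch (assuming the service is registered on a kernel and that the Azure credentials come from the usual environment variables):

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion, AzureChatPromptExecutionSettings

kernel = Kernel()
# Deployment name, endpoint, and API key are read from the AZURE_OPENAI_* env vars.
kernel.add_service(AzureChatCompletion(service_id="default"))

# Returns an instance of the settings class registered for this service.
req_settings = kernel.get_prompt_execution_settings_from_service_id(service_id="default")
assert isinstance(req_settings, AzureChatPromptExecutionSettings)  # narrows the type without casts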

Now onto your error: how is the Asynchronous Filter being enabled? I am not seeing it configured anywhere in the code you show above.

I see that a community member created this PR (#7115), which is related to the issue (the code change caused many tests to fail, and there hasn't been follow-up yet). I do want to bring this up with the Azure OpenAI team to make sure this behavior is intended and isn't a regression. Allow me to do so, and I will respond here when I hear from them.

yuichiromukaiyama commented 1 month ago

req_settings = kernel.get_prompt_execution_settings_from_service_id(service_id=service_id)

Thank you! It worked well.


Now onto your error: how is content filter=Enabled Asynchronous Filter getting configured? I am not seeing it as part of the code you show above.

This is not something specified in the code; it is a setting on the Azure Portal side: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/content-filter?tabs=warning%2Cuser-prompt%2Cpython-new#asynchronous-filter

The setting is configured through the web portal as shown below. Azure OpenAI normally buffers and returns several chunks together; enabling the Asynchronous Filter lets chunks be returned closer to real time.

(screenshot: Azure Portal content filter settings with the Asynchronous Filter enabled)

I see that a community member created this PR (https://github.com/microsoft/semantic-kernel/pull/7115), which is related to the issue (the code change caused many tests to fail, and there hasn't been follow-up yet). I do want to bring this up with the Azure OpenAI team to make sure this behavior is intended and isn't a regression. Allow me to do so, and I will respond here when I hear from them.

Got it. Thank you!

moonbox3 commented 1 month ago

Hi @yuichiromukaiyama, I appreciate your response. I have reached out to a contact on the Azure OpenAI side, and they will be back in the office early next week. Once they investigate, I will respond. Thanks for your patience.

ymuichiro commented 2 weeks ago

Opened a pull request: https://github.com/microsoft/semantic-kernel/pull/8075

yuichiromukaiyama commented 1 week ago

The fix has been merged: pull request https://github.com/microsoft/semantic-kernel/pull/8075