microsoft / autogen

Potentially flaky test `test_timeout_preserves_kernel_state[EmbeddedIPythonCodeExecutor]` #1904

Closed by jackgerrits 3 weeks ago

jackgerrits commented 8 months ago

============================= test session starts ==============================
platform darwin -- Python 3.11.8, pytest-8.0.2, pluggy-1.4.0
rootdir: /Users/runner/work/autogen/autogen
configfile: pyproject.toml
plugins: anyio-4.3.0
collected 346 items

test/agentchat/contrib/capabilities/test_context_handling.py .s..        [  1%]
test/agentchat/contrib/capabilities/test_teachable_agent.py ss           [  1%]
test/agentchat/contrib/test_agent_builder.py sssss                       [  3%]
test/agentchat/contrib/test_compressible_agent.py sssss                  [  4%]
test/agentchat/contrib/test_gpt_assistant.py sssssssss                   [  7%]
test/agentchat/contrib/test_img_utils.py ssssssssssssssssss              [ 12%]
test/agentchat/contrib/test_llava.py ssss                                [ 13%]
test/agentchat/contrib/test_lmm.py ssss                                  [ 14%]
test/agentchat/contrib/test_qdrant_retrievechat.py sss                   [ 15%]
test/agentchat/contrib/test_retrievechat.py ss                           [ 16%]
test/agentchat/contrib/test_society_of_mind_agent.py ..ss                [ 17%]
test/agentchat/contrib/test_web_surfer.py sss                            [ 18%]
test/agentchat/test_agent_logging.py ss                                  [ 18%]
test/agentchat/test_agent_setup_with_use_docker_settings.py sssssss      [ 20%]
test/agentchat/test_agent_usage.py ss                                    [ 21%]
test/agentchat/test_assistant_agent.py ssss                              [ 22%]
test/agentchat/test_async.py ss                                          [ 23%]
test/agentchat/test_async_chats.py s                                     [ 23%]
test/agentchat/test_async_get_human_input.py ss                          [ 23%]
test/agentchat/test_cache_agent.py sss                                   [ 24%]
test/agentchat/test_chats.py .sssss                                      [ 26%]
test/agentchat/test_conversable_agent.py .s.s......s.s..s........sss.    [ 34%]
test/agentchat/test_function_and_tool_calling.py ..ss..ss                [ 36%]
test/agentchat/test_function_call.py s..ss                               [ 38%]
test/agentchat/test_function_call_groupchat.py sss.                      [ 39%]
test/agentchat/test_groupchat.py .....................                   [ 45%]
test/agentchat/test_human_input.py s                                     [ 45%]
test/agentchat/test_math_user_proxy_agent.py s.s..                       [ 47%]
test/agentchat/test_nested.py s                                          [ 47%]
test/agentchat/test_tool_calls.py sss.s                                  [ 49%]
test/cache/test_cache.py ....                                            [ 50%]
test/cache/test_disk_cache.py .....                                      [ 51%]
test/cache/test_redis_cache.py sssss                                     [ 53%]
test/coding/test_commandline_code_executor.py .....s......               [ 56%]
test/coding/test_embedded_ipython_code_executor.py ...................F. [ 62%]
..ss..                                                                   [ 64%]
test/coding/test_factory.py .                                            [ 64%]
test/coding/test_markdown_code_extractor.py .                            [ 65%]
test/oai/test_client.py sssssssss                                        [ 67%]
test/oai/test_client_stream.py ss.sssss                                  [ 69%]
test/oai/test_custom_client.py ....                                      [ 71%]
test/oai/test_utils.py .........                                         [ 73%]
test/test_browser_utils.py ss                                            [ 74%]
test/test_code_utils.py ..sss...................                         [ 81%]
test/test_function_utils.py ................s.                           [ 86%]
test/test_graph_utils.py ............                                    [ 89%]
test/test_logging.py ........                                            [ 92%]
test/test_notebook.py ssssss                                             [ 93%]
test/test_pydantic.py ...                                                [ 94%]
test/test_retrieve_utils.py ssssssssssss                                 [ 98%]
test/test_token_count.py ......                                          [100%]

=================================== FAILURES ===================================
_______ test_timeout_preserves_kernel_state[EmbeddedIPythonCodeExecutor] _______

cls = <class 'autogen.coding.jupyter.embedded_ipython_code_executor.EmbeddedIPythonCodeExecutor'>

    @pytest.mark.skipif(skip, reason=skip_reason)
    @pytest.mark.parametrize("cls", classes_to_test)
    def test_timeout_preserves_kernel_state(cls: Type[CodeExecutor]) -> None:
        executor = cls(timeout=1)
        code_blocks = [CodeBlock(code="x = 123", language="python")]
        code_result = executor.execute_code_blocks(code_blocks)
        assert code_result.exit_code == 0 and code_result.output.strip() == ""

        code_blocks = [CodeBlock(code="import time; time.sleep(10)", language="python")]
        code_result = executor.execute_code_blocks(code_blocks)
        assert code_result.exit_code != 0 and "Timeout" in code_result.output

        code_blocks = [CodeBlock(code="print(x)", language="python")]
        code_result = executor.execute_code_blocks(code_blocks)
>       assert code_result.exit_code == 0 and "123" in code_result.output
E       AssertionError: assert (0 == 0 and '123' in '')
E        +  where 0 = IPythonCodeResult(exit_code=0, output='', output_files=[]).exit_code
E        +  and   '' = IPythonCodeResult(exit_code=0, output='', output_files=[]).output

test/coding/test_embedded_ipython_code_executor.py:192: AssertionError
=============================== warnings summary ===============================
test/agentchat/test_async.py:63
  /Users/runner/work/autogen/autogen/test/agentchat/test_async.py:63: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_async.py:105
  /Users/runner/work/autogen/autogen/test/agentchat/test_async.py:105: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_async_chats.py:16
  /Users/runner/work/autogen/autogen/test/agentchat/test_async_chats.py:16: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_async_get_human_input.py:24
  /Users/runner/work/autogen/autogen/test/agentchat/test_async_get_human_input.py:24: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_async_get_human_input.py:51
  /Users/runner/work/autogen/autogen/test/agentchat/test_async_get_human_input.py:51: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_conversable_agent.py:80
  /Users/runner/work/autogen/autogen/test/agentchat/test_conversable_agent.py:80: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_conversable_agent.py:174
  /Users/runner/work/autogen/autogen/test/agentchat/test_conversable_agent.py:174: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_conversable_agent.py:471
  /Users/runner/work/autogen/autogen/test/agentchat/test_conversable_agent.py:471: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_conversable_agent.py:488
  /Users/runner/work/autogen/autogen/test/agentchat/test_conversable_agent.py:488: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_conversable_agent.py:612
  /Users/runner/work/autogen/autogen/test/agentchat/test_conversable_agent.py:612: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_conversable_agent.py:993
  /Users/runner/work/autogen/autogen/test/agentchat/test_conversable_agent.py:993: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio()

test/agentchat/test_function_and_tool_calling.py:256
  /Users/runner/work/autogen/autogen/test/agentchat/test_function_and_tool_calling.py:256: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio()

test/agentchat/test_function_and_tool_calling.py:339
  /Users/runner/work/autogen/autogen/test/agentchat/test_function_and_tool_calling.py:339: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio()

test/agentchat/test_function_call.py:141
  /Users/runner/work/autogen/autogen/test/agentchat/test_function_call.py:141: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_function_call_groupchat.py:40
  /Users/runner/work/autogen/autogen/test/agentchat/test_function_call_groupchat.py:40: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

autogen/agentchat/contrib/math_user_proxy_agent.py:391
  /Users/runner/work/autogen/autogen/autogen/agentchat/contrib/math_user_proxy_agent.py:391: PydanticDeprecatedSince20: Pydantic V1 style `@root_validator` validators are deprecated. You should migrate to Pydantic V2 style `@model_validator` validators, see the migration guide for more details. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.6/migration/
    @root_validator(skip_on_failure=True)

../../../../../Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pydantic/_internal/_config.py:272
  /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pydantic/_internal/_config.py:272: PydanticDeprecatedSince20: Support for class-based `config` is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.6/migration/
    warnings.warn(DEPRECATION_MESSAGE, DeprecationWarning)

test/agentchat/test_tool_calls.py:290
  /Users/runner/work/autogen/autogen/test/agentchat/test_tool_calls.py:290: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

../../../../../Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/jupyter_client/connect.py:22
  /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/jupyter_client/connect.py:22: DeprecationWarning: Jupyter is migrating its paths to use standard platformdirs
  given by the platformdirs library.  To remove this warning and
  see the appropriate new directories, set the environment variable
  `JUPYTER_PLATFORM_DIRS=1` and then run `jupyter --paths`.
  The use of platformdirs will be the default in `jupyter_core` v6
    from jupyter_core.paths import jupyter_data_dir, jupyter_runtime_dir, secure_write

test/test_function_utils.py:376
  /Users/runner/work/autogen/autogen/test/test_function_utils.py:376: PytestUnknownMarkWarning: Unknown pytest.mark.asyncio - is this a typo?  You can register custom marks to avoid this warning - for details, see https://docs.pytest.org/en/stable/how-to/mark.html
    @pytest.mark.asyncio

test/agentchat/test_chats.py::test_chat_messages_for_summary
test/agentchat/test_conversable_agent.py::test_register_for_execution
test/agentchat/test_conversable_agent.py::test_register_functions
test/agentchat/test_function_call.py::test_execute_function
test/agentchat/test_function_call_groupchat.py::test_no_function_map
test/agentchat/test_math_user_proxy_agent.py::test_execute_one_wolfram_query
test/agentchat/test_math_user_proxy_agent.py::test_generate_prompt
test/agentchat/test_tool_calls.py::test_multi_tool_call
  /Users/runner/work/autogen/autogen/autogen/agentchat/user_proxy_agent.py:83: UserWarning: Using None to signal a default code_execution_config is deprecated. Use {} to use default or False to disable code execution.
    super().__init__(

test/agentchat/test_conversable_agent.py: 5 warnings
test/agentchat/test_function_and_tool_calling.py: 4 warnings
test/agentchat/test_function_call.py: 1 warning
test/agentchat/test_tool_calls.py: 1 warning
test/test_function_utils.py: 1 warning
  /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/_pytest/python.py:183: PytestUnhandledCoroutineWarning: async def functions are not natively supported and have been skipped.
  You need to install a suitable plugin for your async framework, for example:
    - anyio
    - pytest-asyncio
    - pytest-tornasync
    - pytest-trio
    - pytest-twisted
    warnings.warn(PytestUnhandledCoroutineWarning(msg.format(nodeid)))

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED test/coding/test_embedded_ipython_code_executor.py::test_timeout_preserves_kernel_state[EmbeddedIPythonCodeExecutor] - AssertionError: assert (0 == 0 and '123' in '')
 +  where 0 = IPythonCodeResult(exit_code=0, output='', output_files=[]).exit_code
 +  and   '' = IPythonCodeResult(exit_code=0, output='', output_files=[]).output
===== 1 failed, 185 passed, 160 skipped, 40 warnings in 103.94s (0:01:43) ======

rysweet commented 3 weeks ago

Do we intend to fix this, @jackgerrits?

jackgerrits commented 3 weeks ago

Closing as wontfix. If this becomes severe enough to revisit, we can reopen.
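
For anyone who does revisit this, the property the failing test checks is: a variable set before a timed-out execution should still be readable afterwards. In the log above, the third call returned `exit_code=0` with empty output instead of `123`. The sketch below only illustrates that three-step contract with a toy executor. `ToyExecutor` and `ToyResult` are hypothetical names, the thread-based timeout is an assumption made for illustration, and none of this is autogen's implementation or a reproduction of the flake.

```python
import threading
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ToyResult:
    exit_code: int
    output: str


class ToyExecutor:
    """Hypothetical stand-in for a stateful code executor (not autogen's API)."""

    def __init__(self, timeout: float) -> None:
        self._timeout = timeout
        # The shared namespace plays the role of the kernel state.
        self._namespace: Dict[str, Any] = {}

    def execute(self, code: str) -> ToyResult:
        lines: List[str] = []
        errors: List[BaseException] = []

        def capture_print(*args: Any) -> None:
            lines.append(" ".join(str(a) for a in args))

        def run() -> None:
            try:
                exec(code, self._namespace)
            except BaseException as exc:
                errors.append(exc)

        # Route print() calls made by the executed code into this call's buffer.
        self._namespace["print"] = capture_print
        worker = threading.Thread(target=run, daemon=True)
        worker.start()
        worker.join(self._timeout)
        if worker.is_alive():
            # The worker is abandoned rather than killed; the namespace is untouched.
            return ToyResult(1, "Timeout: code block exceeded the time limit")
        if errors:
            return ToyResult(1, repr(errors[0]))
        return ToyResult(0, "\n".join(lines))


# Same three steps as test_timeout_preserves_kernel_state, with shorter delays.
executor = ToyExecutor(timeout=0.5)
first = executor.execute("x = 123")
second = executor.execute("import time; time.sleep(2)")
third = executor.execute("print(x)")
```

Here `first` should succeed with no output, `second` should report a timeout, and `third` should still see `x`. A real kernel has to interrupt the running cell and resynchronize its message stream before the next execution, which is where a timing-dependent failure could creep in. That reading is a guess and was not confirmed in this thread.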