python / cpython

The Python programming language
https://www.python.org
Other
62.31k stars 29.93k forks source link

test.test_asyncio.test_server.TestServer2.test_abort_clients consistently fails on Linux 6.10.x #122136

Open hroncok opened 1 month ago

hroncok commented 1 month ago

Bug report

Bug description:

Hello, we run the testsuite of the optimized and debug builds of Python in Fedora CI. Since the addition in https://github.com/python/cpython/commit/415964417771946dcb7a163951913adf84644b6d the test has constantly failed like this on Fedora Rawhide / Fedora Linux 41 (the development version). It passes on Fedora 40 and 39.

======================================================================
FAIL: test_abort_clients (test.test_asyncio.test_server.TestServer2.test_abort_clients)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python3.13/unittest/async_case.py", line 93, in _callTestMethod
    if self._callMaybeAsync(method) is not None:
       ~~~~~~~~~~~~~~~~~~~~^^^^^^^^
  File "/usr/lib64/python3.13/unittest/async_case.py", line 115, in _callMaybeAsync
    return self._asyncioRunner.run(
           ~~~~~~~~~~~~~~~~~~~~~~~^
        func(*args, **kwargs),
        ^^^^^^^^^^^^^^^^^^^^^^
        context=self._asyncioTestContext,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib64/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/usr/lib64/python3.13/asyncio/base_events.py", line 721, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/usr/lib64/python3.13/test/test_asyncio/test_server.py", line 249, in test_abort_clients
    self.assertNotEqual(s_wr.transport.get_write_buffer_size(), 0)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 0 == 0

----------------------------------------------------------------------

I was unable to reproduce this outside of Fedora CI. Perhaps this has to do with how the network is configured, no idea.

This is the output of python3.13 -m test.pythoninfo:

``` Python debug information ======================== CC.version: gcc (GCC) 14.1.1 20240701 (Red Hat 14.1.1-7) _decimal.__libmpdec_version__: 2.5.1 _testcapi.LONG_MAX: 9223372036854775807 _testcapi.PY_SSIZE_T_MAX: 9223372036854775807 _testcapi.Py_C_RECURSION_LIMIT: 10000 _testcapi.SIZEOF_TIME_T: 8 _testcapi.SIZEOF_WCHAR_T: 4 _testinternalcapi.SIZEOF_PYGC_HEAD: 16 _testinternalcapi.SIZEOF_PYOBJECT: 16 build.NDEBUG: ignore assertions (macro defined) build.Py_DEBUG: No (sys.gettotalrefcount() missing) build.Py_TRACE_REFS: No (sys.getobjects() missing) build.WITH_DOC_STRINGS: Yes build.WITH_DTRACE: Yes build.WITH_FREELISTS: Yes build.WITH_MIMALLOC: Yes build.WITH_PYMALLOC: Yes build.WITH_VALGRIND: Yes builtins.float.double_format: IEEE, little-endian builtins.float.float_format: IEEE, little-endian config[_config_init]: 2 config[_init_main]: True config[_install_importlib]: True config[_is_python_build]: False config[argv]: ['-m'] config[base_exec_prefix]: '/usr' config[base_executable]: '/usr/bin/python3.13' config[base_prefix]: '/usr' config[buffered_stdio]: True config[bytes_warning]: 0 config[check_hash_pycs_mode]: 'default' config[code_debug_ranges]: True config[configure_c_stdio]: True config[cpu_count]: -1 config[dev_mode]: False config[dump_refs]: False config[dump_refs_file]: None config[exec_prefix]: '/usr' config[executable]: '/usr/bin/python3.13' config[faulthandler]: False config[filesystem_encoding]: 'utf-8' config[filesystem_errors]: 'surrogateescape' config[hash_seed]: 0 config[home]: None config[import_time]: False config[inspect]: False config[install_signal_handlers]: True config[int_max_str_digits]: 4300 config[interactive]: False config[isolated]: False config[malloc_stats]: False config[module_search_paths]: ['/usr/lib64/python313.zip', '/usr/lib64/python3.13', '/usr/lib64/python3.13/lib-dynload'] config[module_search_paths_set]: True config[optimization_level]: 0 config[orig_argv]: ['python3.13', '-m', 'test.pythoninfo'] config[parse_argv]: True config[parser_debug]: False config[pathconfig_warnings]: True config[perf_profiling]: 0 config[platlibdir]: 'lib64' config[prefix]: '/usr' config[program_name]: 'python3.13' config[pycache_prefix]: None config[pythonpath_env]: None config[quiet]: False config[run_command]: None config[run_filename]: None config[run_module]: 'test.pythoninfo' config[safe_path]: False config[show_ref_count]: False config[site_import]: True config[skip_source_first_line]: False config[stdio_encoding]: 'utf-8' config[stdio_errors]: 'strict' config[stdlib_dir]: '/usr/lib64/python3.13' config[sys_path_0]: '/var/str/python/selftest' config[tracemalloc]: 0 config[use_environment]: True config[use_frozen_modules]: True config[use_hash_seed]: False config[user_site_directory]: True config[verbose]: 0 config[warn_default_encoding]: False config[warnoptions]: [] config[write_bytecode]: True config[xoptions]: [] curses.ncurses_version: curses.ncurses_version(major=6, minor=4, patch=20240127) datetime.datetime.now: 2024-07-22 16:05:29.258162 expat.EXPAT_VERSION: expat_2.6.2 fips.linux_crypto_fips_enabled: 0 fips.openssl_fips_mode: 0 gdb_version: GNU gdb (Fedora Linux) 14.2-14.fc41 gdbm.GDBM_VERSION: 1.23.0 global_config[Py_BytesWarningFlag]: 0 global_config[Py_DebugFlag]: 0 global_config[Py_DontWriteBytecodeFlag]: 0 global_config[Py_FileSystemDefaultEncodeErrors]: 'surrogateescape' global_config[Py_FileSystemDefaultEncoding]: 'utf-8' global_config[Py_FrozenFlag]: 0 global_config[Py_HasFileSystemDefaultEncoding]: 0 global_config[Py_HashRandomizationFlag]: 1 global_config[Py_IgnoreEnvironmentFlag]: 0 global_config[Py_InspectFlag]: 0 global_config[Py_InteractiveFlag]: 0 global_config[Py_IsolatedFlag]: 0 global_config[Py_NoSiteFlag]: 0 global_config[Py_NoUserSiteDirectory]: 0 global_config[Py_OptimizeFlag]: 0 global_config[Py_QuietFlag]: 0 global_config[Py_UTF8Mode]: 0 global_config[Py_UnbufferedStdioFlag]: 0 global_config[Py_VerboseFlag]: 0 global_config[_Py_HasFileSystemDefaultEncodeErrors]: 0 libregrtests.build_info: release shared LTO+PGO valgrind dtrace locale.getencoding: UTF-8 os.cpu_count: 2 os.environ[HOME]: /root os.environ[LANG]: en_US.UTF-8 os.environ[PATH]: /root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/var/str/python/selftest os.environ[SHELL]: /bin/bash os.environ[TERM]: xterm os.getcwd: /var/str/python/selftest os.getegid: 0 os.geteuid: 0 os.getgid: 0 os.getgrouplist: 0 os.getgroups: 0 os.getloadavg: (1.19873046875, 0.60791015625, 0.2412109375) os.getrandom: ready (initialized) os.getresgid: (0, 0, 0) os.getresuid: (0, 0, 0) os.getuid: 0 os.login: root os.name: posix os.process_cpu_count: 2 os.supports_bytes_environ: True os.supports_effective_ids: ['access'] os.supports_fd: ['chdir', 'chmod', 'chown', 'execve', 'listdir', 'pathconf', 'scandir', 'stat', 'statvfs', 'truncate', 'utime'] os.supports_follow_symlinks: ['access', 'chown', 'link', 'stat', 'utime'] os.umask: 0o022 os.uname: posix.uname_result(sysname='Linux', nodename='ip-172-31-17-184.us-east-2.compute.internal', release='6.11.0-0.rc0.20240719git720261cfc732.7.fc41.x86_64', version='#1 SMP PREEMPT_DYNAMIC Fri Jul 19 16:59:06 UTC 2024', machine='x86_64') platform.architecture: 64bit ELF platform.freedesktop_os_release[ID]: fedora platform.freedesktop_os_release[NAME]: Fedora Linux platform.freedesktop_os_release[VARIANT_ID]: cloud platform.freedesktop_os_release[VERSION]: 41 (Cloud Edition Prerelease) platform.freedesktop_os_release[VERSION_ID]: 41 platform.libc_ver: glibc 2.39.9000 platform.platform: Linux-6.11.0-0.rc0.20240719git720261cfc732.7.fc41.x86_64-x86_64-with-glibc2.39.9000 platform.python_implementation: CPython pre_config[_config_init]: 2 pre_config[allocator]: 0 pre_config[coerce_c_locale]: 0 pre_config[coerce_c_locale_warn]: 0 pre_config[configure_locale]: 1 pre_config[dev_mode]: 0 pre_config[isolated]: 0 pre_config[parse_argv]: 1 pre_config[use_environment]: 1 pre_config[utf8_mode]: 0 pwd.getpwuid(0): pwd.struct_passwd(pw_name='root', pw_passwd='x', pw_uid=0, pw_gid=0, pw_gecos='Super User', pw_dir='/root', pw_shell='/bin/bash') pymem.allocator: pymalloc readline._READLINE_LIBRARY_VERSION: 8.2 readline._READLINE_RUNTIME_VERSION: 0x802 readline._READLINE_VERSION: 0x802 resource.RLIMIT_AS: (-1, -1) resource.RLIMIT_CORE: (-1, -1) resource.RLIMIT_CPU: (-1, -1) resource.RLIMIT_DATA: (-1, -1) resource.RLIMIT_FSIZE: (-1, -1) resource.RLIMIT_MEMLOCK: (8388608, 8388608) resource.RLIMIT_MSGQUEUE: (819200, 819200) resource.RLIMIT_NICE: (0, 0) resource.RLIMIT_NOFILE: (1024, 524288) resource.RLIMIT_NPROC: (15042, 15042) resource.RLIMIT_OFILE: (1024, 524288) resource.RLIMIT_RSS: (-1, -1) resource.RLIMIT_RTPRIO: (0, 0) resource.RLIMIT_RTTIME: (-1, -1) resource.RLIMIT_SIGPENDING: (15042, 15042) resource.RLIMIT_STACK: (8388608, -1) resource.pagesize: 4096 socket.hostname: ip-172-31-17-184.us-east-2.compute.internal sqlite3.sqlite_version: 3.46.0 ssl.HAS_SNI: True ssl.OPENSSL_VERSION: OpenSSL 3.2.2 4 Jun 2024 ssl.OPENSSL_VERSION_INFO: (3, 2, 0, 2, 0) ssl.OP_ALL: 0x80000050 ssl.OP_NO_TLSv1_1: 0x10000000 ssl.SSLContext.maximum_version: -1 ssl.SSLContext.minimum_version: -2 ssl.SSLContext.options: 2186412112 ssl.SSLContext.protocol: 16 ssl.SSLContext.verify_mode: 2 ssl.default_https_context.maximum_version: -1 ssl.default_https_context.minimum_version: -2 ssl.default_https_context.options: 2186412112 ssl.default_https_context.protocol: 16 ssl.default_https_context.verify_mode: 2 ssl.environ[OPENSSL_CONF]: /non-existing-file ssl.stdlib_context.maximum_version: -1 ssl.stdlib_context.minimum_version: -2 ssl.stdlib_context.options: 2186412112 ssl.stdlib_context.protocol: 16 ssl.stdlib_context.verify_mode: 0 subprocess._USE_POSIX_SPAWN: True support.MS_WINDOWS: False support._is_gui_available: False support.check_sanitizer(address=True): False support.check_sanitizer(memory=True): False support.check_sanitizer(ub=True): False support.has_fork_support: True support.has_socket_support: True support.has_strftime_extensions: True support.has_subprocess_support: True support.is_android: False support.is_emscripten: False support.is_jython: False support.is_wasi: False support.python_is_optimized: True support_os_helper.can_chmod: True support_os_helper.can_dac_override: True support_os_helper.can_symlink: True support_os_helper.can_xattr: True support_socket_helper.IPV6_ENABLED: True support_socket_helper.has_gethostname: True support_socket_helper.tcp_blackhole: False support_threading_helper.can_start_thread: True sys._is_gil_enabled: True sys.api_version: 1013 sys.builtin_module_names: ('_abc', '_ast', '_codecs', '_collections', '_functools', '_imp', '_io', '_locale', '_operator', '_signal', '_sre', '_stat', '_string', '_suggestions', '_symtable', '_sysconfig', '_thread', '_tokenize', '_tracemalloc', '_typing', '_warnings', '_weakref', 'atexit', 'builtins', 'errno', 'faulthandler', 'gc', 'itertools', 'marshal', 'posix', 'pwd', 'sys', 'time') sys.byteorder: little sys.dont_write_bytecode: False sys.executable: /usr/bin/python3.13 sys.filesystem_encoding: utf-8/surrogateescape sys.flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=0, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0, hash_randomization=1, isolated=0, dev_mode=False, utf8_mode=0, warn_default_encoding=0, safe_path=False, int_max_str_digits=4300) sys.float_info: sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1) sys.float_repr_style: short sys.getrecursionlimit: 1000 sys.hash_info: sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash13', hash_bits=64, seed_bits=128, cutoff=0) sys.hexversion: 51183796 sys.implementation: namespace(name='cpython', cache_tag='cpython-313', version=sys.version_info(major=3, minor=13, micro=0, releaselevel='beta', serial=4), hexversion=51183796, _multiarch='x86_64-linux-gnu') sys.int_info: sys.int_info(bits_per_digit=30, sizeof_digit=4, default_max_str_digits=4300, str_digits_check_threshold=640) sys.maxsize: 9223372036854775807 sys.maxunicode: 1114111 sys.path: ['/var/str/python/selftest', '/usr/lib64/python313.zip', '/usr/lib64/python3.13', '/usr/lib64/python3.13/lib-dynload', '/usr/lib64/python3.13/site-packages', '/usr/lib/python3.13/site-packages'] sys.platform: linux sys.platlibdir: lib64 sys.prefix: /usr sys.stderr.encoding: utf-8/backslashreplace sys.stdin.encoding: utf-8/strict sys.stdout.encoding: utf-8/strict sys.thread_info: sys.thread_info(name='pthread', lock='semaphore', version='NPTL 2.39.9000') sys.version: 3.13.0b4 (main, Jul 19 2024, 00:00:00) [GCC 14.1.1 20240701 (Red Hat 14.1.1-7)] sys.version_info: sys.version_info(major=3, minor=13, micro=0, releaselevel='beta', serial=4) sysconfig.is_python_build: False sysconfig[CCSHARED]: -fPIC sysconfig[CC]: gcc sysconfig[CFLAGSFORSHARED]: -fPIC sysconfig[CFLAGS]: -fno-strict-overflow -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -fcf-protection -fexceptions -fcf-protection -fexceptions -fcf-protection -fexceptions -O3 sysconfig[CONFIG_ARGS]: '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--program-prefix=' '--disable-dependency-tracking' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--localstatedir=/var' '--runstatedir=/run' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-platlibdir=lib64' '--enable-ipv6' '--enable-shared' '--with-computed-gotos=yes' '--with-dbmliborder=gdbm:ndbm:bdb' '--with-system-expat' '--with-system-ffi' '--with-system-libmpdec' '--enable-loadable-sqlite-extensions' '--with-dtrace' '--with-lto' '--with-ssl-default-suites=openssl' '--without-static-libpython' '--with-wheel-pkg-dir=/usr/share/python-wheels' '--with-valgrind' '--without-ensurepip' '--enable-experimental-jit=yes-off' '--enable-optimizations' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'PKG_CONFIG_PATH=:/usr/lib64/pkgconfig:/usr/share/pkgconfig' 'CC=gcc' 'CFLAGS=-fcf-protection -fexceptions ' 'LDFLAGS= ' 'CPPFLAGS=' sysconfig[HOST_GNU_TYPE]: x86_64-redhat-linux-gnu sysconfig[MACHDEP]: linux sysconfig[MULTIARCH]: x86_64-linux-gnu sysconfig[OPT]: -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -fcf-protection -fexceptions sysconfig[PGO_PROF_USE_FLAG]: -fprofile-use -fprofile-correction sysconfig[PY_CFLAGS]: -fno-strict-overflow -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -fcf-protection -fexceptions -fcf-protection -fexceptions -fcf-protection -fexceptions -O3 sysconfig[PY_CFLAGS_NODIST]: -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wno-complain-wrong-lang -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -D_GNU_SOURCE -fPIC -fwrapv -D_Py_TIER2=3 -D_Py_JIT -fno-semantic-interposition -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g -std=c11 -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wstrict-prototypes -Werror=implicit-function-declaration -fvisibility=hidden -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wno-complain-wrong-lang -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -D_GNU_SOURCE -fPIC -fwrapv -O3 -fprofile-use -fprofile-correction -I/builddir/build/BUILD/python3.13-3.13.0_b4-build/Python-3.13.0b4/Include/internal -I/builddir/build/BUILD/python3.13-3.13.0_b4-build/Python-3.13.0b4/Include/internal/mimalloc sysconfig[PY_CORE_LDFLAGS]: -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -g -fno-semantic-interposition -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -g sysconfig[PY_LDFLAGS_NODIST]: -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -g -fno-semantic-interposition -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g -Wl,-z,relro -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes -g sysconfig[PY_STDMODULE_CFLAGS]: -fno-strict-overflow -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -fcf-protection -fexceptions -fcf-protection -fexceptions -fcf-protection -fexceptions -O3 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wno-complain-wrong-lang -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -D_GNU_SOURCE -fPIC -fwrapv -D_Py_TIER2=3 -D_Py_JIT -fno-semantic-interposition -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g -std=c11 -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wstrict-prototypes -Werror=implicit-function-declaration -fvisibility=hidden -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wno-complain-wrong-lang -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -mtls-dialect=gnu2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -D_GNU_SOURCE -fPIC -fwrapv -O3 -fprofile-use -fprofile-correction -I/builddir/build/BUILD/python3.13-3.13.0_b4-build/Python-3.13.0b4/Include/internal -I/builddir/build/BUILD/python3.13-3.13.0_b4-build/Python-3.13.0b4/Include/internal/mimalloc -IObjects -IInclude -IPython -I. -I/builddir/build/BUILD/python3.13-3.13.0_b4-build/Python-3.13.0b4/Include -fPIC sysconfig[Py_DEBUG]: 0 sysconfig[Py_ENABLE_SHARED]: 1 sysconfig[Py_GIL_DISABLED]: 0 sysconfig[SHELL]: /bin/sh -e sysconfig[SOABI]: cpython-313-x86_64-linux-gnu sysconfig[TEST_MODULES]: yes sysconfig[abs_builddir]: /builddir/build/BUILD/python3.13-3.13.0_b4-build/Python-3.13.0b4/build/optimized sysconfig[abs_srcdir]: /builddir/build/BUILD/python3.13-3.13.0_b4-build/Python-3.13.0b4 sysconfig[prefix]: /usr sysconfig[srcdir]: /usr/lib64/python3.13/config-3.13-x86_64-linux-gnu tempfile.gettempdir: /tmp test_socket.HAVE_SOCKET_ALG: True test_socket.HAVE_SOCKET_BLUETOOTH: False test_socket.HAVE_SOCKET_CAN: False test_socket.HAVE_SOCKET_CAN_ISOTP: False test_socket.HAVE_SOCKET_CAN_J1939: False test_socket.HAVE_SOCKET_HYPERV: False test_socket.HAVE_SOCKET_QIPCRTR: True test_socket.HAVE_SOCKET_RDS: False test_socket.HAVE_SOCKET_UDPLITE: True test_socket.HAVE_SOCKET_VSOCK: True time.altzone: 0 time.daylight: 0 time.get_clock_info(monotonic): namespace(implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, adjustable=False, resolution=1e-09) time.get_clock_info(perf_counter): namespace(implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, adjustable=False, resolution=1e-09) time.get_clock_info(process_time): namespace(implementation='clock_gettime(CLOCK_PROCESS_CPUTIME_ID)', monotonic=True, adjustable=False, resolution=1e-09) time.get_clock_info(thread_time): namespace(implementation='clock_gettime(CLOCK_THREAD_CPUTIME_ID)', monotonic=True, adjustable=False, resolution=1e-09) time.get_clock_info(time): namespace(implementation='clock_gettime(CLOCK_REALTIME)', monotonic=False, adjustable=True, resolution=1e-09) time.time: 1721664329.3427575 time.timezone: 0 time.tzname: ('UTC', 'UTC') tkinter.TCL_VERSION: 8.6 tkinter.TK_VERSION: 8.6 tkinter.info_patchlevel: 8.6.14 zlib.ZLIB_RUNTIME_VERSION: 1.3.1.zlib-ng zlib.ZLIB_VERSION: 1.3.1.zlib-ng ```

We invoke the installed tests like this:

python3.13 -m test -wW -j0 -i test_check_probes

I'd like to debug this and see if something is wrong with the test or perhaps in Fedora 41. But I don't know where to start.

CPython versions tested on:

3.13

Operating systems tested on:

Linux

Linked PRs

hroncok commented 1 month ago

I wanted to cc @CendioOssman in the report but forgot.

CendioOssman commented 1 month ago

What the test tries to do is to fill the kernel socket buffers so that select() stops signalling that a send() is possible.

What we do to achieve this is set an explicit send and receive buffer size, to force the kernel to stop dynamically resizing them. We then ask the kernel what the real size is (since it will do funky stuff like double the requested size), and then write that much data in the buffer. The assertNotEqual() then checks that we've been forced to do some local buffering, as the kernel is full. If this hasn't happened, the test would otherwise give a false positive.

No idea why this isn't working in Fedora's CI. And it's difficult to debug if it only happens there. :/

Perhaps you do some creative fiddling with the network settings that forces dynamic buffer scaling to stay on?

Or is there some virtualization/containerization that results in the system lying to us?

CendioOssman commented 1 month ago

To debug, could you add some output of getsockopt(SO_SNDBUF), getsockopt(SO_RCVBUF), ioctl(SIOCINQ), and ioctl(SIOCOUTQ)?

hroncok commented 1 month ago

Thank you! The system is 100% virtualized/containerized. It runs on the Tetsing Farm and I don't know much about it, but perhaps @thrix might know what is happening or how to easily reproduce this locally.

hroncok commented 1 month ago

To debug, could you add some output of getsockopt(SO_SNDBUF), getsockopt(SO_RCVBUF), ioctl(SIOCINQ), and ioctl(SIOCOUTQ)?

Is this C? My Python has no socket.getsockopt, even thou it is documented at https://docs.python.org/3/library/socket.html#socket.socket.getsockopt

hroncok commented 1 month ago

Oh, I have getsockopt on socket objects, but it takes 2 arguments (level and option).

hroncok commented 1 month ago

So I can do something like his and run it on the CI (outputs from my machine):

>>> import socket
>>> s = socket.socket()
>>> s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
16384
>>> s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
131072

But for ioctl, I am out of my depth. I've tried something like fcntl.ioctl(s, termios.TIOCOUTQ) (as TIOCOUTQ is SIOCOUTQ in my /usr/include/linux/sockios.h), but I get OSError: [Errno 14] Bad address.

mcepl commented 1 month ago

We seem to have exactly the same here when packaging 3.13.0rc1 on openSUSE/Tumbleweed, build log. Is it possible that the chrooted environment (which is what both us and Fedora have) misses something expected by this test?

mgorny commented 1 month ago

I'm also seeing this sometimes when testing on Gentoo amd64, in systemd-nspawn container. However, the test is only flaky here — generally it fails when the system is busy, but passes when Python's test suite has all the CPU to itself.

hroncok commented 2 weeks ago

This now started failing in the Fedora build system as well.

hroncok commented 2 weeks ago

This now started failing in the Fedora build system as well.

Presumably when our builders were kernel-updated to 6.10.4 and/or 6.10.5.

hroncok commented 2 weeks ago

Indeed. I just rebooted from kernel 6.8.9 to 6.10.6 and I can reproduce the failure locally on Fedora Linux 39:

$ uname -a
Linux carbon 6.10.6-100.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Aug 19 14:35:32 UTC 2024 x86_64 GNU/Linux

$ python3.13 -m test test_asyncio -wW -j8
Using random seed: 1988926982
0:00:00 load avg: 0.41 Run 31 tests in parallel using 8 worker processes
0:00:00 load avg: 0.41 [ 1/31] test_asyncio.test_buffered_proto passed
0:00:00 load avg: 0.41 [ 2/31] test_asyncio.test_futures2 passed
0:00:00 load avg: 0.41 [ 3/31] test_asyncio.test_context passed
0:00:00 load avg: 0.41 [ 4/31] test_asyncio.test_pep492 passed
0:00:00 load avg: 0.41 [ 5/31] test_asyncio.test_protocols passed
0:00:00 load avg: 0.41 [ 6/31] test_asyncio.test_proactor_events passed
0:00:00 load avg: 1.02 [ 7/31] test_asyncio.test_selector_events passed
0:00:00 load avg: 1.02 [ 8/31] test_asyncio.test_queues passed
0:00:00 load avg: 1.02 [ 9/31] test_asyncio.test_eager_task_factory passed
0:00:00 load avg: 1.02 [10/31] test_asyncio.test_runners passed
0:00:01 load avg: 1.02 [11/31] test_asyncio.test_locks passed
0:00:01 load avg: 1.02 [12/31] test_asyncio.test_base_events passed
0:00:01 load avg: 1.02 [13/31/1] test_asyncio.test_server failed (1 failure)
test_start_server_1 (test.test_asyncio.test_server.ProactorStartServerTests.test_start_server_1) ... skipped 'Windows only'
test_start_server_1 (test.test_asyncio.test_server.SelectorStartServerTests.test_start_server_1) ... ok
test_start_unix_server_1 (test.test_asyncio.test_server.SelectorStartServerTests.test_start_unix_server_1) ... ok
test_abort_clients (test.test_asyncio.test_server.TestServer2.test_abort_clients) ... FAIL
test_close_clients (test.test_asyncio.test_server.TestServer2.test_close_clients) ... ok
test_wait_closed_basic (test.test_asyncio.test_server.TestServer2.test_wait_closed_basic) ... ok
test_wait_closed_race (test.test_asyncio.test_server.TestServer2.test_wait_closed_race) ... ok
test_unix_server_addr_cleanup (test.test_asyncio.test_server.UnixServerCleanupTests.test_unix_server_addr_cleanup) ... ok
test_unix_server_cleanup_gone (test.test_asyncio.test_server.UnixServerCleanupTests.test_unix_server_cleanup_gone) ... ok
test_unix_server_cleanup_prevented (test.test_asyncio.test_server.UnixServerCleanupTests.test_unix_server_cleanup_prevented) ... ok
test_unix_server_cleanup_replaced (test.test_asyncio.test_server.UnixServerCleanupTests.test_unix_server_cleanup_replaced) ... ok
test_unix_server_sock_cleanup (test.test_asyncio.test_server.UnixServerCleanupTests.test_unix_server_sock_cleanup) ... ok

======================================================================
FAIL: test_abort_clients (test.test_asyncio.test_server.TestServer2.test_abort_clients)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python3.13/unittest/async_case.py", line 93, in _callTestMethod
    if self._callMaybeAsync(method) is not None:
       ~~~~~~~~~~~~~~~~~~~~^^^^^^^^
  File "/usr/lib64/python3.13/unittest/async_case.py", line 115, in _callMaybeAsync
    return self._asyncioRunner.run(
           ~~~~~~~~~~~~~~~~~~~~~~~^
        func(*args, **kwargs),
        ^^^^^^^^^^^^^^^^^^^^^^
        context=self._asyncioTestContext,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib64/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/usr/lib64/python3.13/asyncio/base_events.py", line 721, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/usr/lib64/python3.13/test/test_asyncio/test_server.py", line 249, in test_abort_clients
    self.assertNotEqual(s_wr.transport.get_write_buffer_size(), 0)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 0 == 0

----------------------------------------------------------------------
Ran 12 tests in 0.244s

FAILED (failures=1, skipped=1)
test test_asyncio.test_server failed
0:00:01 load avg: 1.02 [14/31/1] test_asyncio.test_futures passed
0:00:01 load avg: 1.02 [15/31/1] test_asyncio.test_staggered passed
0:00:01 load avg: 1.02 [16/31/1] test_asyncio.test_sendfile passed
0:00:02 load avg: 1.02 [17/31/1] test_asyncio.test_streams passed
0:00:02 load avg: 1.02 [18/31/1] test_asyncio.test_threads passed
0:00:02 load avg: 1.02 [19/31/1] test_asyncio.test_timeouts passed
0:00:02 load avg: 1.02 [20/31/1] test_asyncio.test_transports passed
0:00:02 load avg: 1.02 [21/31/1] test_asyncio.test_sock_lowlevel passed
0:00:03 load avg: 1.02 [22/31/1] test_asyncio.test_unix_events passed
0:00:03 load avg: 1.02 [23/31/1] test_asyncio.test_windows_events skipped
test_asyncio.test_windows_events skipped -- Windows only
0:00:03 load avg: 1.02 [24/31/1] test_asyncio.test_windows_utils skipped
test_asyncio.test_windows_utils skipped -- Windows only
0:00:03 load avg: 1.02 [25/31/1] test_asyncio.test_waitfor passed
0:00:05 load avg: 1.02 [26/31/1] test_asyncio.test_sslproto passed
0:00:06 load avg: 1.10 [27/31/1] test_asyncio.test_events passed
0:00:07 load avg: 1.10 [28/31/1] test_asyncio.test_taskgroups passed
0:00:08 load avg: 1.10 [29/31/1] test_asyncio.test_ssl passed
0:00:16 load avg: 2.75 [30/31/1] test_asyncio.test_subprocess passed
0:00:18 load avg: 2.75 [31/31/1] test_asyncio.test_tasks passed

== Tests result: FAILURE ==

2 tests skipped:
    test_asyncio.test_windows_events test_asyncio.test_windows_utils

1 test failed:
    test_asyncio.test_server

28 tests OK.

0:00:18 load avg: 2.75 Re-running 1 failed tests in verbose mode in subprocesses
0:00:18 load avg: 2.75 Run 1 test in parallel using 1 worker process
0:00:18 load avg: 2.75 [1/1/1] test_asyncio.test_server failed (1 failure)
Re-running test_asyncio.test_server in verbose mode (matching: test_abort_clients)
test_abort_clients (test.test_asyncio.test_server.TestServer2.test_abort_clients) ... FAIL

======================================================================
FAIL: test_abort_clients (test.test_asyncio.test_server.TestServer2.test_abort_clients)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python3.13/unittest/async_case.py", line 93, in _callTestMethod
    if self._callMaybeAsync(method) is not None:
       ~~~~~~~~~~~~~~~~~~~~^^^^^^^^
  File "/usr/lib64/python3.13/unittest/async_case.py", line 115, in _callMaybeAsync
    return self._asyncioRunner.run(
           ~~~~~~~~~~~~~~~~~~~~~~~^
        func(*args, **kwargs),
        ^^^^^^^^^^^^^^^^^^^^^^
        context=self._asyncioTestContext,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/usr/lib64/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/usr/lib64/python3.13/asyncio/base_events.py", line 721, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/usr/lib64/python3.13/test/test_asyncio/test_server.py", line 249, in test_abort_clients
    self.assertNotEqual(s_wr.transport.get_write_buffer_size(), 0)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 0 == 0

----------------------------------------------------------------------
Ran 1 test in 0.017s

FAILED (failures=1)
test test_asyncio.test_server failed
1 test failed again:
    test_asyncio.test_server

== Tests result: FAILURE then FAILURE ==

2 tests skipped:
    test_asyncio.test_windows_events test_asyncio.test_windows_utils

1 re-run test:
    test_asyncio.test_server

1 test failed:
    test_asyncio.test_server

28 tests OK.

Total duration: 18.5 sec
Total tests: run=2,561 failures=2 skipped=18
Total test files: run=32/31 failed=1 skipped=2 rerun=1
Result: FAILURE then FAILURE

(I could not reproduce the failure with Linux 6.8.9.)

encukou commented 2 weeks ago

From man socket:

SO_RCVBUF Sets or gets the maximum socket receive buffer in bytes. The kernel doubles this value (to allow space for bookkeeping overhead) when it is set using setsockopt(2), and this doubled value is returned by getsockopt(2). The default value is set by the /proc/sys/net/core/rmem_default file, and the maximum allowed value is set by the /proc/sys/net/core/rmem_max file. The minimum (doubled) value for this option is 256.

So, the buffer size reported by getsockopt is twice the size requested by setsockopt, but the actual buffer size is less (for the “bookkeeping overhead”).

Testing with the Python socket module:

import os
import socket

print(os.uname().sysname, os.uname().release)

HOST = socket.gethostname()
PORT = 12345

l_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
l_sock.bind((HOST, PORT))
l_sock.listen(5)

c_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
c_sock.connect((HOST, PORT))

s_sock, c_addr = l_sock.accept()

BUFSIZE_REQUEST = 65536

s_sock.setblocking(False)
c_sock.setblocking(False)

s_sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUFSIZE_REQUEST)
c_sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUFSIZE_REQUEST)

bufsize_s = s_sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
bufsize_c = c_sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)

send_size = min(bufsize_s, bufsize_c)
print(f'{BUFSIZE_REQUEST=} {bufsize_s=} {bufsize_c=}')

for i in range(50):
    try:
        sent = c_sock.send(b'a' * send_size)
    except BlockingIOError:
        sent = None
    print(f'iteration {i}: {sent=}/{send_size} bytes')
    if sent is None:
        break

c_sock.close()
s_sock.close()
l_sock.close()

This has different behaviour between the kernel versions. With kernel 6.5.x, the buffer fills up with a little more than the buffer size reported by getsockopt, as one would expect from what the manpage says:

Linux 6.5.6-300.fc39.x86_64
BUFSIZE_REQUEST=65536 bufsize_s=131072 bufsize_c=131072
iteration 0: sent=130964/131072 bytes
iteration 1: sent=32741/131072 bytes
iteration 2: sent=None/131072 bytes

But with 6.10.6, the buffers can hold almost five times the reported buffer size:

Linux 6.10.6-200.fc40.x86_64
BUFSIZE_REQUEST=65536 bufsize_s=131072 bufsize_c=131072
iteration 0: sent=131072/131072 bytes
iteration 1: sent=131072/131072 bytes
iteration 2: sent=131072/131072 bytes
iteration 3: sent=131072/131072 bytes
iteration 4: sent=130964/131072 bytes
iteration 5: sent=None/131072 bytes

The new kernel is being quite generous! I didn't find documentation/commit for this change.

I guess the way to reliably fill the buffers is to keep sending until the kernel doesn't accept any more data, rather than rely on the numbers...

encukou commented 2 weeks ago

Now the buildbots are fixed, but I'd like to try to find some more info about what's happening so I'll keep the issue open.

mcepl commented 1 week ago

Actually, I see suddenly all openSUSE builds on PPC64LE failing because of test_sendfile_close_peer_in_the_middle_of_receiving in three classes of test.test_asyncio.test_sendfile.py. Given that it has a long comment about failing on Solaris because of problems with socket buffers, I wonder, whether it is not related.

Complete build log of Python 3.13.0rc1

Or perhaps it is another revelation of https://github.com/python/cpython/issues/85848 (aka https://github.com/python/cpython/issues/110325)?

hroncok commented 1 week ago

Actually, I see suddenly all openSUSE builds on PPC64LE failing because of test_sendfile_close_peer_in_the_middle_of_receiving in three classes of test.test_asyncio.test_sendfile.py...

https://github.com/python/cpython/issues/120226

vstinner commented 1 week ago

Python 3.12 is not affected: the test was added to Python 3.13.