
Python 3.13: test_xml_etree fails on expat 2.2.5 (RHEL8 / Rocky linux 8) #125067

Open JacobHenner opened 1 month ago

JacobHenner commented 1 month ago

Bug report

Bug description:

I'm attempting to build Python 3.13.0 for Rocky Linux 8, and several tests are failing:

LD_LIBRARY_PATH=/usr/src/python ./python -m test --pgo --timeout=
Using random seed: 1341012311
0:00:00 load avg: 15.81 Run 44 tests sequentially in a single process
0:00:00 load avg: 15.81 [ 1/44] test_array
0:00:02 load avg: 14.70 [ 2/44] test_base64
0:00:02 load avg: 14.70 [ 3/44] test_binascii
0:00:03 load avg: 14.70 [ 4/44] test_binop
0:00:03 load avg: 14.70 [ 5/44] test_bisect
0:00:03 load avg: 14.70 [ 6/44] test_bytes
0:00:13 load avg: 12.83 [ 7/44] test_bz2
0:00:14 load avg: 12.83 [ 8/44] test_cmath
0:00:14 load avg: 12.83 [ 9/44] test_codecs
0:00:17 load avg: 12.04 [10/44] test_collections
0:00:21 load avg: 12.04 [11/44] test_complex
0:00:22 load avg: 11.32 [12/44] test_dataclasses
0:00:23 load avg: 11.32 [13/44] test_datetime
0:00:31 load avg: 10.57 [14/44] test_decimal
0:00:41 load avg: 9.25 [15/44] test_difflib
0:00:44 load avg: 8.67 [16/44] test_embed
0:01:01 load avg: 7.35 [17/44] test_float
0:01:01 load avg: 7.35 [18/44] test_fstring
0:01:06 load avg: 7.00 [19/44] test_functools
0:01:08 load avg: 6.68 [20/44] test_generators
test test_generators failed
0:01:08 load avg: 6.68 [21/44] test_hashlib -- test_generators failed (1 failure)
0:01:10 load avg: 6.68 [22/44] test_heapq
0:01:11 load avg: 6.68 [23/44] test_int
0:01:14 load avg: 6.47 [24/44] test_itertools
0:01:25 load avg: 6.16 [25/44] test_json
0:01:33 load avg: 5.52 [26/44] test_long
0:01:40 load avg: 5.31 [27/44] test_lzma
0:01:41 load avg: 5.31 [28/44] test_math
0:01:49 load avg: 5.04 [29/44] test_memoryview
0:01:49 load avg: 5.04 [30/44] test_operator
0:01:50 load avg: 5.04 [31/44] test_ordered_dict
0:01:52 load avg: 5.04 [32/44] test_patma
0:01:52 load avg: 4.79 [33/44] test_pickle
0:02:07 load avg: 4.10 [34/44] test_pprint
0:02:08 load avg: 4.10 [35/44] test_re
0:02:10 load avg: 4.10 [36/44] test_set
0:02:24 load avg: 4.68 [37/44] test_sqlite3
0:02:26 load avg: 4.68 [38/44] test_statistics
0:02:45 load avg: 6.26 [39/44] test_str
0:02:50 load avg: 6.32 [40/44] test_struct
0:02:52 load avg: 6.32 [41/44] test_tabnanny
0:02:53 load avg: 6.45 [42/44] test_time
0:02:55 load avg: 6.45 [43/44] test_xml_etree
test test_xml_etree failed
0:02:57 load avg: 6.45 [44/44] test_xml_etree_c -- test_xml_etree failed (3 failures)
test test_xml_etree_c failed
test_xml_etree_c failed (3 failures)
Total duration: 2 min 59 sec
Total tests: run=9,361 failures=7 skipped=204
Total test files: run=44/44 failed=3

These tests are required to succeed in order for --enable-optimizations to work as expected.

I am using the following build options:

./configure \
  --enable-loadable-sqlite-extensions \
  --enable-optimizations \
  --enable-shared \
  --with-lto \
  --with-system-expat \
  --with-ensurepip \
  LDFLAGS=-Wl,-rpath=/usr/local/lib

3.9-3.12.6 do not exhibit this behavior when built with the same commands (aside from the Python version).

I'm curious whether there might be a regression for certain versions of expat, similar to https://github.com/python/cpython/issues/117187.

expat-devel version is expat-2.2.5-15.el8.
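
(For reference, the expat version a freshly built interpreter is actually linked against can be confirmed from Python itself; the values in the comments below are what one would expect on this system, not captured output.)

import pyexpat

print(pyexpat.EXPAT_VERSION)  # e.g. "expat_2.2.5" here
print(pyexpat.version_info)   # e.g. (2, 2, 5)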

The specific test failures are:

======================================================================
FAIL: test_flush_reparse_deferral_disabled (test.test_xml_etree.XMLPullParserTest.test_flush_reparse_deferral_disabled)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/src/python/Lib/test/test_xml_etree.py", line 1752, in test_flush_reparse_deferral_disabled
    self.assert_event_tags(parser, [('start', 'doc')])
    ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/python/Lib/test/test_xml_etree.py", line 1483, in assert_event_tags
    self.assertEqual([(action, elem.tag) for action, elem in events],
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                     expected)
                     ^^^^^^^^^
AssertionError: Lists differ: [] != [('start', 'doc')]

Second list contains 1 additional elements.
First extra element 0:
('start', 'doc')

- []
+ [('start', 'doc')]

======================================================================
FAIL: test_simple_xml_chunk_1 (test.test_xml_etree.XMLPullParserTest.test_simple_xml_chunk_1)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/src/python/Lib/test/test_xml_etree.py", line 1508, in test_simple_xml_chunk_1
    self.test_simple_xml(chunk_size=1, flush=True)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/python/Lib/test/test_xml_etree.py", line 1496, in test_simple_xml
    self.assert_event_tags(parser, [('end', 'element')])
    ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/python/Lib/test/test_xml_etree.py", line 1483, in assert_event_tags
    self.assertEqual([(action, elem.tag) for action, elem in events],
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                     expected)
                     ^^^^^^^^^
AssertionError: Lists differ: [] != [('end', 'element')]

Second list contains 1 additional elements.
First extra element 0:
('end', 'element')

- []
+ [('end', 'element')]

======================================================================
FAIL: test_simple_xml_chunk_5 (test.test_xml_etree.XMLPullParserTest.test_simple_xml_chunk_5)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/src/python/Lib/test/test_xml_etree.py", line 1511, in test_simple_xml_chunk_5
    self.test_simple_xml(chunk_size=5, flush=True)
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/python/Lib/test/test_xml_etree.py", line 1496, in test_simple_xml
    self.assert_event_tags(parser, [('end', 'element')])
    ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/python/Lib/test/test_xml_etree.py", line 1483, in assert_event_tags
    self.assertEqual([(action, elem.tag) for action, elem in events],
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                     expected)
                     ^^^^^^^^^
AssertionError: Lists differ: [] != [('end', 'element')]

Second list contains 1 additional elements.
First extra element 0:
('end', 'element')

- []
+ [('end', 'element')]

----------------------------------------------------------------------
Ran 198 tests in 0.423s

FAILED (failures=3, skipped=6)
test test_xml_etree failed
test_xml_etree failed (3 failures)

== Tests result: FAILURE ==

1 test failed:
    test_xml_etree

Total duration: 640 ms
Total tests: run=198 failures=3 skipped=6
Total test files: run=1/1 failed=1
Result: FAILURE

CPython versions tested on:

3.13

Operating systems tested on:

Linux

AA-Turner commented 1 month ago

Are you able to test with pre releases, or attempt git bisect to help us narrow down the issue?

A

JacobHenner commented 1 month ago

Are you able to test with pre releases, or attempt git bisect to help us narrow down the issue?

Yes - for the pre-releases, are there specific versions I should attempt?

AA-Turner commented 1 month ago

I would attempt a manual bisection -- start with e.g. beta 1, then iterate through depending on whether the failures are present.

JacobHenner commented 1 month ago

I would attempt a manual bisection -- start with e.g. beta 1, then iterate through depending on whether the failures are present.

I've reproduced the test_xml_etree error on all rc releases, and also a1. I'm testing some beta releases now.

Oddly, the test_hashlib error only occurs within our CI build environment, but not when building locally. This is odd because the same Dockerfile is used in both cases, with no known difference in build context that would explain why it fails in one case and succeeds in the other. I'll follow up on this next.

JacobHenner commented 1 month ago

The failing tests (test_xml_etree*) were all touched in https://github.com/python/cpython/pull/115623/files#diff-cd2315ff25ab49e653d363de2d213e390ecb1b01bbc86e2be3b700db4b0bf454. Perhaps the changes in https://github.com/python/cpython/commit/9f74e86c78853c101a23e938f8e32ea838d8f62e, intended to address test failures for expat < 2.6, were incomplete?

AA-Turner commented 1 month ago

Have you tried with a more recent version of expat (2.6+)? Do the tests succeed?

JacobHenner commented 1 month ago

Have you tried with a more recent version of expat (2.6+)? Do the tests succeed?

Just did - test_xml_etree* succeeds with expat 2.6.3 built from source on Rocky 8, with all else remaining equal.

vkurilin commented 1 month ago

Oddly, the test_hashlib error only occurs within our CI build environment, but not when building locally.

+1 to @JacobHenner – I have the exact same issue with test_hashlib. For 3.13.0 in CI, tests are failing both on RHEL and Debian. However, for all versions from 3.8 to 3.12, the tests pass without errors. Locally, the Dockerfile builds without errors.

JacobHenner commented 1 month ago

Oddly, the test_hashlib error only occurs within our CI build environment, but not when building locally.

+1 to @JacobHenner – I have the exact same issue with test_hashlib. For 3.13.0 in CI, tests are failing both on RHEL and Debian. However, for all versions from 3.8 to 3.12, the tests pass without errors. Locally, the Dockerfile builds without errors.

What are you using to build your containers? Docker, podman, Kaniko?

vkurilin commented 1 month ago

I'm building with Kaniko. All versions from 3.13.0a1 to 3.13.0rc3 are failing.

However, there is an important difference from your case: only test_hashlib fails. test_xml_etree_c works fine.

JacobHenner commented 1 month ago

I'm building with Kaniko.

I suspect this is related. We're also building with Kaniko in the cases where it's failing. I'm analyzing this further now.

However, there is an important difference from your case: only test_hashlib fails. test_xml_etree_c works fine.

Which distro are you using, and with which version of expat?

isuftin commented 1 month ago

We're seeing test_set fail in Docker building on Alpine 3.19 w/ expat 2.6.3

test_memoryview is also failing

JacobHenner commented 1 month ago

I think my previous comments about test_hashlib were misguided because of how the error lines are displayed. The actual error appears to be in test_generators (FAIL: test_raise_and_yield_from (test.test_generators.SignalAndYieldFromTest.test_raise_and_yield_from)). I wonder if this is Kaniko-specific; however, Kaniko builds of 3.9-3.12 do not exhibit this issue.

For comparison, the test_xml_etree and test_xml_etree_c failures are not Kaniko-specific, and are instead associated with the version of expat installed.

We're seeing test_set fail in Docker building on Alpine 3.19 w/ expat 2.6.3

test_memoryview is also failing

Can you share the specific cases that are failing?

hroncok commented 1 month ago

RHEL 8 (and so, I assume, Rocky Linux 8) has not updated expat; it has patched it instead. The failing tests are failing only because they check the expat version:

https://github.com/python/cpython/blob/eafd14fbe0fd464b9d700f6d00137415193aa143/Lib/test/test_pyexpat.py#L801
https://github.com/python/cpython/blob/eafd14fbe0fd464b9d700f6d00137415193aa143/Lib/test/test_sax.py#L1253
https://github.com/python/cpython/blob/eafd14fbe0fd464b9d700f6d00137415193aa143/Lib/test/test_xml_etree.py#L1740
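
The pattern behind those checks can be illustrated with a simplified sketch (not the exact test code): the branch is chosen from the version pyexpat reports, so an old expat with reparse deferral backported takes the "old expat" path even though it behaves like 2.6+.

import pyexpat

events = []
parser = pyexpat.ParserCreate()
parser.StartElementHandler = lambda name, attrs: events.append(('start', name))

if pyexpat.version_info >= (2, 6, 0):
    # Vanilla expat >= 2.6: turn reparse deferral off so small chunks are
    # parsed immediately.  On RHEL 8's patched 2.2.5 this branch is never
    # taken, yet the backported deferral is still active.
    parser.SetReparseDeferralEnabled(False)

for chunk in (b'<doc', b'>'):
    parser.Parse(chunk, False)

# On vanilla expat (any version) the start event has been delivered by now;
# on the patched 2.2.5 it is still deferred, so this check fails.
assert events == [('start', 'doc')]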

vkurilin commented 1 month ago

Which distro are you using, and with which version of expat?

I'm gradually trimming the Dockerfile down to a minimal example. I've stopped at the latest version of debian:bookworm, but the problem also reproduces on ubi8 and rockylinux, so it seems that it's not dependent on the distro.

The presence or absence of expat also does not affect the reproducibility of the error with test_hashlib (and, primarily, the errors in test_generators, as you correctly pointed out).

isuftin commented 1 month ago

Here's my Dockerfile in case anyone wants to verify the failing tests:

FROM alpine:3.19

ARG INSTALL_VERSION=3.13.0
ENV PATH="/usr/local/bin:${PATH}"
ENV LANG=C.UTF-8

# if this is called "PIP_VERSION", pip explodes with "ValueError: invalid truth value '<VERSION>'"
# https://pypi.org/project/pip/
ARG PYTHON_PIP_VERSION="24.2"
ARG SETUPTOOLS_VERSION="75.1.0"
ARG PYTHONDONTWRITEBYTECODE=1

SHELL ["/bin/ash", "-eo", "pipefail", "-c"]

# hadolint ignore=DL3003
RUN --mount=type=cache,target=/var/cache/apk \
    --mount=type=cache,target=/tmp \
    --mount=type=cache,target=/usr/src/python \
    # Install fetch dependencies
    apk add --no-cache --virtual .fetch-deps \
        tar~=1.35 \
        xz~=5.4.5 \
    \
    && mkdir -p /usr/src/python \
    # Fetch installation
    && wget -q -O /tmp/python.tar.xz "https://www.python.org/ftp/python/${INSTALL_VERSION%%[a-z]*}/Python-$INSTALL_VERSION.tar.xz" \
    && tar -xJC /usr/src/python --strip-components=1 -f /tmp/python.tar.xz \
    \
    # Delete fetch dependencies
    && apk del --no-network .fetch-deps && \
    \
    # Install build dependencies
    apk add --no-cache --virtual .build-deps \
        bluez-dev~=5.70 \
        bzip2-dev~=1.0.8 \
        coreutils~=9.4 \
        dpkg-dev~=1.22.1 \
        dpkg~=1.22 \
        expat-dev~=2.6.3 \
        findutils~=4.9.0 \
        gcc~=13.2.1 \
        gdbm-dev~=1.23 \
        gnupg~=2.4.4 \
        libc-dev~=0.7.2 \
        libffi-dev~=3.4.4 \
        libnsl-dev~=2.0.1 \
        make~=4.4 \
        ncurses-dev~=6.4 \
        openssl-dev~=3.1.7 \
        pax-utils~=1.3.7 \
        readline-dev~=8.2 \
        sqlite-dev~=3.44 \
        tcl-dev~=8.6.13 \
        tk-dev~=8.6.13 \
        tk~=8.6.13 \
        xz-dev~=5.4.5 && \
    \
    # Install dependencies
    apk add --no-cache \
        expat~=2.6.3 \
        # CVE-2022-1304
        libcom_err~=1.47 \
        libuuid~=2.39 \
        openssl~=3.1.7 \
        ssl_client~=1.36 \
    \
    # Build Python
    && cd /usr/src/python \
    && gnuArch="$(dpkg-architecture --query DEB_BUILD_GNU_TYPE)" \
    && ./configure \
        --build="$gnuArch" \
        --enable-loadable-sqlite-extensions \
        --enable-optimizations \
        --with-lto \
        --enable-option-checking=fatal \
        --enable-shared \
        --with-system-expat \
        --without-ensurepip \
    && make -j "$(nproc)" \
        # set thread stack size to 1MB so we don't segfault before we hit sys.getrecursionlimit()
        # https://github.com/alpinelinux/aports/commit/2026e1259422d4e0cf92391ca2d3844356c649d0
        EXTRA_CFLAGS="-DTHREAD_STACK_SIZE=0x100000" \
        LDFLAGS="-Wl,--strip-all" \
    && make install \
    \
    # Install run dependencies
    && find /usr/local -type f -executable -not \( -name '*tkinter*' \) -exec scanelf --needed --nobanner --format '%n#p' '{}' ';' \
        | tr ',' '\n' \
        | sort -u \
        | awk 'system("[ -e /usr/local/lib/" $1 " ]") == 0 { next } { print "so:" $1 }' \
        | xargs -rt apk add --no-cache --virtual .python-rundeps \
    \
    # Delete build dependencies
    && apk del --no-network .build-deps \
    \
    # Clean up installation
    && find /usr/local -depth \
        \( \
        \( -type d -a \( -name test -o -name tests -o -name idle_test \) \) \
        -o \
        \( -type f -a \( -name '*.pyc' -o -name '*.pyo' \) \) \
        \) -exec rm -rf '{}' + \
    \
    # Test for proper python installation
    && python3 --version | grep "${INSTALL_VERSION}" \
    \
    # Perform linking
    && ln -s /usr/local/bin/idle3 /usr/local/bin/idle \
    && ln -s /usr/local/bin/pydoc3 /usr/local/bin/pydoc \
    && ln -s /usr/local/bin/python3 /usr/local/bin/python \
    && ln -s /usr/local/bin/python3-config /usr/local/bin/python-config \
    \
    # Install PIP
    && wget -q -O /tmp/get-pip.py "https://bootstrap.pypa.io/get-pip.py" \
    && python /tmp/get-pip.py --disable-pip-version-check --no-cache-dir --no-compile \
        "setuptools==${SETUPTOOLS_VERSION}" \
        "pip==${PYTHON_PIP_VERSION}"

CMD ["python3"]

HEALTHCHECK NONE

JacobHenner commented 1 month ago

It turns out that the test_xml_etree* and test_generators failures are not new; they're just no longer ignored in PROFILE_TASK when they fail (as of Python 3.13). I looked at a build of 3.12.7, and test_xml_etree, test_xml_etree_c, and test_generators are all failing on Rocky Linux 8, but this was not noticed earlier because the results were ignored during the build.

edmorley commented 1 month ago

I'm seeing test_xml_etree + test_xml_etree_c failures during the PGO profile task on Ubuntu 22.04 with Python 3.13 too. (Plus, like the above, I see those failures also occurred on older Python versions; the difference now is that they aren't being silently ignored during PGO.)

These tests pass for us on Ubuntu 20.04 and 24.04, just not 22.04. We build with --with-system-expat, so presuming (based on earlier comments) that expat is the issue, that means:

Example failing build log: https://github.com/heroku/heroku-buildpack-python/actions/runs/11259777638/job/31310276989#step:4:1374

The Dockerfile/scripts used to build: https://github.com/heroku/heroku-buildpack-python/blob/ebd222f304ed315e84b7c8c6b8a89898c05e88ca/builds/Dockerfile https://github.com/heroku/heroku-buildpack-python/blob/ebd222f304ed315e84b7c8c6b8a89898c05e88ca/builds/build_python_runtime.sh

xref: https://github.com/heroku/heroku-buildpack-python/pull/1661#issuecomment-2402921271

JacobHenner commented 1 month ago

I now believe the title of this issue is misleading. I've just confirmed that test_xml_etree, test_xml_etree_c, and test_generators all fail in the latest versions of Python 3.9-3.13. This was noticed in 3.13 because PROFILE_TASK failures are no longer ignored during the build as of 3.13.

The specific test case failures should probably be tracked separately. It's possible that users will see new build failures in other environments where the tests had always been failing but had not previously broken the build.

vstinner commented 1 month ago

@vkurilin:

For 3.13.0 in CI, tests are failing both on rhel and debian.

Strange, we have Debian and RHEL buildbots and the whole test suite passes there.

edmorley commented 1 month ago

RHEL 8 (and so I assume Rocky Linux 8) has not updated expat, but it has patched it instead. The failing tests are only failing because they check for expat version:

The version checks not matching Ubuntu's backported versions appear to be the cause in one case (the failure in test_flush_reparse_deferral_disabled); however, two other failing tests (test_simple_xml_chunk_1 and test_simple_xml_chunk_5) don't have any expat version checks?

For failure output and links to the full logs, see: https://github.com/heroku/heroku-buildpack-python/pull/1661#issuecomment-2405259352

Also, cross-linking some prior history in this area:

edmorley commented 1 month ago

In the meantime we've had to resort to disabling the three affected tests on Ubuntu 22.04 with Python 3.13.0: https://github.com/heroku/heroku-buildpack-python/pull/1661/commits/47c9f3a7566efe5946e286837ab0fc6952ba6478

hroncok commented 1 month ago

test_simple_xml_chunk_1 and test_simple_xml_chunk_5 used to have an expat version check; I am not entirely sure why it was removed in https://github.com/python/cpython/pull/115623/commits/e5e403306c65199f53a02e33a2ac8e28eb45d7a1 (https://github.com/python/cpython/pull/115623)

edmorley commented 2 weeks ago

What are the next steps here? It would be great to have a fix for this for Python 3.13.1 so we don't need custom patches to build from source with PGO enabled :-)

vstinner commented 2 weeks ago

Can someone check whether reverting https://github.com/python/cpython/commit/e5e403306c65199f53a02e33a2ac8e28eb45d7a1 fixes the issue?

vstinner commented 2 weeks ago

Strange, we have Debian and RHEL buildbots and the whole test suite passes there.

Ah, the RHEL8 buildbots don't use --with-system-expat but the embedded copy of libexpat: expat 2.6.3. In this case, test_xml_etree passes.

When using --with-system-expat on RHEL 8.10 (expat version 2.2.5), I get 3 test failures in test_xml_etree:

Can someone check whether reverting https://github.com/python/cpython/commit/e5e403306c65199f53a02e33a2ac8e28eb45d7a1 fixes the issue?

Reverting the change doesn't change anything for expat 2.2.5.

Before this change, test_simple_xml_chunk_1() and test_simple_xml_chunk_5() were skipped on expat 2.6 and newer.

vstinner commented 2 weeks ago

Please open a separate issue for test_generators and test_hashlib.

vstinner commented 2 weeks ago

@serhiy-storchaka: Would you mind having a look at this expat issue? Tests fail on old expat 2.2.5.

serhiy-storchaka commented 2 weeks ago

There is no way to know whether reparse deferral was enabled in Expat. pyexpat maintains its own flag, which is set by default to True if and only if the Expat version is >= 2.6.0, and is then updated after calling XML_SetReparseDeferralEnabled(). If reparse deferral was backported to Expat < 2.6.0, pyexpat still thinks that it is disabled. xmlparser.SetReparseDeferralEnabled is a no-op because XML_SetReparseDeferralEnabled does not exist in Expat < 2.6.0. Therefore, we cannot disable it and test for disabled reparse deferral.

I do not see how we can fix this on our side without skipping some tests on vanilla Expat < 2.6.0. I think that if you use a patched Expat, you should also patch Python (at least the tests) and all user code that depends on the old Expat behavior.
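
To make the mismatch concrete, here is a behavioural probe (an illustration only, not something the test suite does): what the parser actually does with tiny chunks can disagree with what pyexpat.version_info suggests on a patched expat.

import pyexpat

def reparse_deferral_appears_active():
    """Probe observed behaviour instead of trusting pyexpat.version_info."""
    seen = []
    parser = pyexpat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(name)
    for chunk in (b'<doc', b'>'):
        parser.Parse(chunk, False)
    # With reparse deferral active (vanilla >= 2.6 by default, or a patched
    # older expat), the start event has not been delivered yet.
    return not seen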

@hartwork, I'm afraid this issue will haunt us for years.

vstinner commented 2 weeks ago

expat 2.2.5 was released on November 1, 2017. expat 2.6.0 was released on February 6, 2024.

What do you think of skipping the 3 failing tests on expat < 2.6.0?
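
For concreteness, such a skip would look roughly like this (an illustrative sketch only, with a hypothetical decorator name; as noted below, it would also skip the tests on vanilla old expat, where they currently pass):

import unittest
import pyexpat

skip_on_old_expat = unittest.skipIf(
    pyexpat.version_info < (2, 6, 0),
    "reparse deferral behaviour is not reliably detectable with expat < 2.6.0",
)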

serhiy-storchaka commented 2 weeks ago

As far as I understand, this is not a vanilla Expat 2.2.5, but Expat 2.2.5 patched with some changes from 2.6. We do not want to skip tests on platforms that use an Expat < 2.6 that does not include such changes (which may be the majority of computers for now).

I suggest skipping them only on platforms where the patched old Expat is used. We cannot know what changes were backported in the particular distribution, so we should leave this for maintainers of these distributions.

hartwork commented 2 weeks ago

We cannot know what changes were backported in the particular distribution, so we should leave this for maintainers of these distributions.

Full ack, with both my upstream and my downstream hat on.

vstinner commented 4 days ago

I suggest closing this issue as "WONT FIX".