aio-libs / yarl

Yet another URL library
https://yarl.aio-libs.org
Apache License 2.0

Tests failing in a new environment with an exception (`ValueError: chr() arg not in range(0x110000)`) #1411

Open markgreene74 opened 3 days ago

markgreene74 commented 3 days ago


Describe the bug

While setting up a dev environment for the Sprint @ Man Group London Hackathon 2024 (Nov 9) with @ajsanchezsanz, we ran pytest and got the following error:

>   return <Py_UCS4>-1
E   ValueError: chr() arg not in range(0x110000)

yarl/_quoting_c.pyx:55: ValueError
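One way to read the error message, offered as a sketch rather than a confirmed diagnosis: the line flagged in the traceback returns `-1` cast to `Py_UCS4`. Assuming `Py_UCS4` is a 32-bit unsigned integer (as it is in CPython's C API), that cast wraps around to `0xFFFFFFFF`, which is far above the largest Unicode code point, `0x10FFFF`. If something converts that return value to a Python string, a `chr()`-style range error is what you would expect. The names below are illustrative and not part of yarl.

```python
from typing import Optional

# Assumption: Py_UCS4 is a 32-bit unsigned type, so <Py_UCS4>-1 wraps around.
UCS4_SENTINEL = -1 & 0xFFFFFFFF  # 4294967295 == 0xFFFFFFFF
MAX_CODE_POINT = 0x10FFFF  # largest valid Unicode code point


def is_valid_code_point(value: int) -> bool:
    """Return True if value can be represented as a single Python character."""
    return 0 <= value <= MAX_CODE_POINT


def chr_error_message(value: int) -> Optional[str]:
    """Return the ValueError message chr() raises for value, or None if it succeeds."""
    try:
        chr(value)
    except ValueError as exc:
        return str(exc)
    return None


# The sentinel itself is outside the valid range.
print(is_valid_code_point(UCS4_SENTINEL))

# The first out-of-range value is used here instead of the sentinel, because
# very large integers may raise a different exception from chr() depending on
# the Python version.
print(chr_error_message(MAX_CODE_POINT + 1))
```

This only shows that the sentinel cannot be represented as a Python character; it does not explain what triggers the conversion.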

To Reproduce

Please note: pyenv is used in this example, but the same problem can also be observed in a brand-new GitHub codespace.

Expected behavior

Tests pass and no exception is raised.

Logs/tracebacks

Excerpt of the traceback:

zsh ❯ pytest ./tests ./yarl -v
=================================== test session starts ===================================
platform darwin -- Python 3.12.4, pytest-8.3.3, pluggy-1.5.0 -- /Users/gc/.pyenv/versions/3.12.4/envs/yarl-env/bin/python
codspeed: 3.0.0 (disabled, mode: walltime, timer_resolution: 41.7ns)
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase(PosixPath('/Users/gc/github/yarl/.hypothesis/examples'))
rootdir: /Users/gc/github/yarl
configfile: pytest.ini
plugins: cov-6.0.0, codspeed-3.0.0, xdist-3.6.1, hypothesis-6.118.0

(...)

_________________ test_quote_percent_non_ascii3_percent_encoded[c_quoter] _________________
[gw1] darwin -- Python 3.12.4 /Users/gc/.pyenv/versions/3.12.4/envs/yarl-env/bin/python

quoter = <class 'yarl._quoting_c._Quoter'>

    def test_quote_percent_non_ascii3_percent_encoded(quoter):
>       assert quoter()("%🐍%3f") == "%25%F0%9F%90%8D%3F"

quoter     = <class 'yarl._quoting_c._Quoter'>

tests/test_quoting.py:393:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
yarl/_quoting_c.pyx:220: in yarl._quoting_c._Quoter.__call__
    return self._do_quote_or_skip(<str>val)
yarl/_quoting_c.pyx:245: in yarl._quoting_c._Quoter._do_quote_or_skip
    return self._do_quote(<str>val, length, kind, data, &writer)
yarl/_quoting_c.pyx:265: in yarl._quoting_c._Quoter._do_quote
    ch = _restore_ch(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   return <Py_UCS4>-1
E   ValueError: chr() arg not in range(0x110000)

yarl/_quoting_c.pyx:55: ValueError
__________________________________ test_space[c_quoter] ___________________________________
[gw3] darwin -- Python 3.12.4 /Users/gc/.pyenv/versions/3.12.4/envs/yarl-env/bin/python

quoter = <class 'yarl._quoting_c._Quoter'>

    def test_space(quoter):
        s = "% A"
>       assert quoter()(s) == "%25%20A"

quoter     = <class 'yarl._quoting_c._Quoter'>
s          = '% A'

tests/test_quoting.py:470:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
yarl/_quoting_c.pyx:220: in yarl._quoting_c._Quoter.__call__
    return self._do_quote_or_skip(<str>val)
yarl/_quoting_c.pyx:245: in yarl._quoting_c._Quoter._do_quote_or_skip
    return self._do_quote(<str>val, length, kind, data, &writer)
yarl/_quoting_c.pyx:265: in yarl._quoting_c._Quoter._do_quote
    ch = _restore_ch(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   return <Py_UCS4>-1
E   ValueError: chr() arg not in range(0x110000)

yarl/_quoting_c.pyx:55: ValueError
____________________________ test_quote_all_percent[c_quoter] _____________________________
[gw1] darwin -- Python 3.12.4 /Users/gc/.pyenv/versions/3.12.4/envs/yarl-env/bin/python

quoter = <class 'yarl._quoting_c._Quoter'>

    def test_quote_all_percent(quoter):
>       assert quoter()("%%%%") == "%25%25%25%25"

quoter     = <class 'yarl._quoting_c._Quoter'>

tests/test_quoting.py:405:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
yarl/_quoting_c.pyx:220: in yarl._quoting_c._Quoter.__call__
    return self._do_quote_or_skip(<str>val)
yarl/_quoting_c.pyx:245: in yarl._quoting_c._Quoter._do_quote_or_skip
    return self._do_quote(<str>val, length, kind, data, &writer)
yarl/_quoting_c.pyx:265: in yarl._quoting_c._Quoter._do_quote
    ch = _restore_ch(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   return <Py_UCS4>-1
E   ValueError: chr() arg not in range(0x110000)

yarl/_quoting_c.pyx:55: ValueError
_____________________________ test_with_port_percent_encoded ______________________________
[gw1] darwin -- Python 3.12.4 /Users/gc/.pyenv/versions/3.12.4/envs/yarl-env/bin/python

    def test_with_port_percent_encoded():
>       url = URL("http://user%name:pass%word@example.com/")

tests/test_url_update_netloc.py:273:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
yarl/_url.py:307: in __new__
    return pre_encoded_url(val) if encoded else encode_url(val)
        cls        = <class 'yarl.URL'>
        encoded    = False
        strict     = None
        val        = 'http://user%name:pass%word@example.com/'
yarl/_url.py:177: in encode_url
    raw_user = REQUOTER(username) if username else username
        cache      = {'explicit_port': None, 'raw_host': 'example.com'}
        fragment   = ''
        host       = 'example.com'
        netloc     = 'user%name:pass%word@example.com'
        password   = 'pass%word'
        path       = '/'
        port       = None
        query      = ''
        scheme     = 'http'
        url_str    = 'http://user%name:pass%word@example.com/'
        username   = 'user%name'
yarl/_quoting_c.pyx:220: in yarl._quoting_c._Quoter.__call__
    return self._do_quote_or_skip(<str>val)
yarl/_quoting_c.pyx:245: in yarl._quoting_c._Quoter._do_quote_or_skip
    return self._do_quote(<str>val, length, kind, data, &writer)
yarl/_quoting_c.pyx:265: in yarl._quoting_c._Quoter._do_quote
    ch = _restore_ch(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   return <Py_UCS4>-1
E   ValueError: chr() arg not in range(0x110000)

yarl/_quoting_c.pyx:55: ValueError

(...)

================================= short test summary info =================================
XFAIL tests/test_quoting.py::test_unquoting_invalid_utf8_sequence[py_unquoter-%AB] -
    FIXME: After conversion to bytes, should not cause UTF-8 decode fail.
    See https://url.spec.whatwg.org/#percent-encoded-bytes

    Refs:
    * https://github.com/aio-libs/yarl/pull/216
    * https://github.com/aio-libs/yarl/pull/214
    * https://github.com/aio-libs/yarl/pull/7

XFAIL tests/test_quoting.py::test_unquoting_invalid_utf8_sequence[py_unquoter-%AB%AB] -
    FIXME: After conversion to bytes, should not cause UTF-8 decode fail.
    See https://url.spec.whatwg.org/#percent-encoded-bytes

    Refs:
    * https://github.com/aio-libs/yarl/pull/216
    * https://github.com/aio-libs/yarl/pull/214
    * https://github.com/aio-libs/yarl/pull/7

XFAIL tests/test_quoting.py::test_unquoting_invalid_utf8_sequence[c_unquoter-%AB] -
    FIXME: After conversion to bytes, should not cause UTF-8 decode fail.
    See https://url.spec.whatwg.org/#percent-encoded-bytes

    Refs:
    * https://github.com/aio-libs/yarl/pull/216
    * https://github.com/aio-libs/yarl/pull/214
    * https://github.com/aio-libs/yarl/pull/7

XFAIL tests/test_quoting.py::test_unquoting_invalid_utf8_sequence[c_unquoter-%AB%AB] -
    FIXME: After conversion to bytes, should not cause UTF-8 decode fail.
    See https://url.spec.whatwg.org/#percent-encoded-bytes

    Refs:
    * https://github.com/aio-libs/yarl/pull/216
    * https://github.com/aio-libs/yarl/pull/214
    * https://github.com/aio-libs/yarl/pull/7

FAILED tests/test_quoting.py::test_quote_not_allowed_non_strict[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_unquoting_bad_percent_escapes[c_unquoter-%2x-%2x] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_unquoting_bad_percent_escapes[c_unquoter-%2 -%2 ] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_quote_percent_percent_encoded[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_unquoting_bad_percent_escapes[c_unquoter-% 2-% 2] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_unquoting_bad_percent_escapes[c_unquoter-%xa-%xa] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_quote_percent_digit_percent_encoded[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_quote_percent_safe_percent_encoded[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_unquoting_bad_percent_escapes[c_unquoter-%%3f-%?] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_quote_percent_unsafe_percent_encoded[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_unquoting_bad_percent_escapes[c_unquoter-%2%-%2%] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_unquoting_bad_percent_escapes[c_unquoter-%2%3f-%2?] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_quote_percent_non_ascii_percent_encoded[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_unquoting_bad_percent_escapes[c_unquoter-%x%3f-%x?] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_quote_percent_non_ascii2_percent_encoded[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_quote_percent_non_ascii3_percent_encoded[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_unquoting_bad_percent_escapes[c_unquoter-%\u20ac%3f-%\u20ac?] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_quote_all_percent[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_quoting.py::test_space[c_quoter] - ValueError: chr() arg not in range(0x110000)
FAILED tests/test_url_update_netloc.py::test_with_port_percent_encoded - ValueError: chr() arg not in range(0x110000)
======================= 20 failed, 1416 passed, 4 xfailed in 5.70s ========================
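The failing inputs that are visible in the tracebacks and in the parametrized test IDs above seem to share one trait: each contains a `%` that is not followed by two hexadecimal digits. The sketch below checks that observation with a regular expression. The helper name and the input list are assembled here for illustration (the unquoter inputs are inferred from the test IDs), and the regex is not yarl's quoting algorithm.

```python
import re

# Matches a "%" that is NOT followed by exactly two hex digits.
_INVALID_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def has_invalid_escape(text: str) -> bool:
    """Return True if text contains a '%' that does not start a valid escape."""
    return _INVALID_ESCAPE.search(text) is not None


# Inputs taken from the tracebacks and (by inference) from the test IDs.
FAILING_INPUTS = [
    "%\N{SNAKE}%3f",
    "% A",
    "%%%%",
    "user%name",
    "pass%word",
    "%2x",
    "%2 ",
    "% 2",
    "%xa",
    "%%3f",
    "%2%",
    "%2%3f",
    "%x%3f",
    "%\u20ac%3f",
]

for sample in FAILING_INPUTS:
    print(repr(sample), has_invalid_escape(sample))
```

If the observation holds, it is consistent with the shown failures all ending at the `return <Py_UCS4>-1` line in the tracebacks.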

Python Version

zsh ❯ python --version
Python 3.12.4

multidict Version

zsh ❯ python -m pip show multidict
Name: multidict
Version: 6.1.0
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache 2
Location: /Users/gc/.pyenv/versions/yarl-env/lib/python3.12/site-packages
Requires:
Required-by: yarl

propcache Version

zsh ❯ python -m pip show propcache
Name: propcache
Version: 0.2.0
Summary: Accelerated property cache
Home-page: https://github.com/aio-libs/propcache
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: /Users/gc/.pyenv/versions/yarl-env/lib/python3.12/site-packages
Requires:
Required-by: yarl

yarl Version

zsh ❯ python -m pip show yarl
Name: yarl
Version: 1.17.2.dev0
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl
Author: Andrew Svetlov
Author-email: andrew.svetlov@gmail.com
License: Apache-2.0
Location: /Users/gc/.pyenv/versions/yarl-env/lib/python3.12/site-packages
Editable project location: /Users/gc/github/yarl
Requires: idna, multidict, propcache
Required-by:

OS

macOS 15.1

Additional context

zsh ❯ pip show cython
Name: Cython
Version: 3.0.11
Summary: The Cython compiler for writing C extensions in the Python language.
Home-page: https://cython.org/
Author: Robert Bradshaw, Stefan Behnel, Dag Seljebotn, Greg Ewing, et al.
Author-email: cython-devel@python.org
License: Apache-2.0
Location: /Users/gc/.pyenv/versions/3.12.4/envs/yarl-env/lib/python3.12/site-packages
Requires:
Required-by:

markgreene74 commented 3 days ago

@webknjaz found this awesome workaround, based on @bdraco's idea that the problem was related to the pre-release of Cython:

PIP_CONSTRAINT=requirements/cython.txt pip install -e . 
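For anyone scripting the setup, the same workaround can be driven from Python. This is a minimal sketch, assuming pip picks up its `--constraint` option from the `PIP_CONSTRAINT` environment variable (pip's general `PIP_<OPTION>` convention) and that the variable is also honored when build dependencies are installed. The function names are hypothetical; only `requirements/cython.txt` and the `pip install -e .` command come from the thread.

```python
import os
import subprocess
import sys
from typing import Dict, List, Mapping, Optional, Tuple


def build_install_invocation(
    constraint_file: str,
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Return the pip command and environment for a constrained editable install."""
    env = dict(os.environ if base_env is None else base_env)
    # Equivalent to prefixing the shell command with PIP_CONSTRAINT=...
    env["PIP_CONSTRAINT"] = constraint_file
    cmd = [sys.executable, "-m", "pip", "install", "-e", "."]
    return cmd, env


def install_with_constraint(constraint_file: str = "requirements/cython.txt") -> int:
    """Run the install from the current directory and return pip's exit code."""
    cmd, env = build_install_invocation(constraint_file)
    return subprocess.run(cmd, env=env, check=False).returncode
```

Calling `install_with_constraint()` from the repository root would run the install; `build_install_invocation` is split out so the command and environment can be inspected without running pip.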

markgreene74 commented 3 days ago

@ajsanchezsanz and I have been doing some tests to find exactly which cython version(s) cause the problem.

markgreene74 commented 3 days ago

After further testing, the problem seems to be related to pytest-cov.

Running `pytest tests/test_quoting.py::test_space -vvv -p no:pytest_cov`, which disables the plugin, passes.

I created a simple proof-of-concept file (`poc.py`) with the following content, which reproduces the error:

from yarl._quoting_c import _Quoter as _CQuoter
print(_CQuoter()("%%%%") == "%25%25%25%25")

To reproduce the error, run:

$ coverage run poc.py 
Traceback (most recent call last):
  File "/workspaces/yarl/poc.py", line 2, in <module>
    print(_CQuoter()("%%%%") == "%25%25%25%25")
          ^^^^^^^^^^^^^^^^^^
  File "yarl/_quoting_c.pyx", line 220, in yarl._quoting_c._Quoter.__call__
    return self._do_quote_or_skip(<str>val)
  File "yarl/_quoting_c.pyx", line 245, in yarl._quoting_c._Quoter._do_quote_or_skip
    return self._do_quote(<str>val, length, kind, data, &writer)
  File "yarl/_quoting_c.pyx", line 265, in yarl._quoting_c._Quoter._do_quote
    ch = _restore_ch(
  File "yarl/_quoting_c.pyx", line 55, in yarl._quoting_c._restore_ch
    return <Py_UCS4>-1
ValueError: chr() arg not in range(0x110000)