kevin1024 / vcrpy

Automatically mock your HTTP interactions to simplify and speed up testing
MIT License

[5.1.0] tests/unit/test_serialize.py::test_serialize_binary_request fails if simplejson is installed in the environment #751

Closed mgorny closed 1 year ago

mgorny commented 1 year ago

When simplejson is installed in the system, it is used over the built-in json module. This causes the following test to fail:

$ python -m pytest
========================================================= test session starts =========================================================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.2.0
rootdir: /tmp/vcrpy
configfile: pyproject.toml
plugins: httpbin-2.0.0, cov-4.1.0
collected 255 items / 10 skipped                                                                                                      

tests/integration/test_basic.py .....                                                                                           [  1%]
tests/integration/test_config.py ...........                                                                                    [  6%]
tests/integration/test_disksaver.py ....                                                                                        [  7%]
tests/integration/test_filter.py ..........                                                                                     [ 11%]
tests/integration/test_ignore.py ....                                                                                           [ 13%]
tests/integration/test_matchers.py ..............                                                                               [ 18%]
tests/integration/test_multiple.py .                                                                                            [ 19%]
tests/integration/test_record_mode.py ........                                                                                  [ 22%]
tests/integration/test_register_matcher.py ....                                                                                 [ 23%]
tests/integration/test_register_persister.py ...                                                                                [ 25%]
tests/integration/test_register_serializer.py .                                                                                 [ 25%]
tests/integration/test_request.py ..                                                                                            [ 26%]
tests/integration/test_stubs.py ....                                                                                            [ 27%]
tests/integration/test_urllib2.py ..................                                                                            [ 34%]
tests/unit/test_cassettes.py ...............................                                                                    [ 47%]
tests/unit/test_errors.py ....                                                                                                  [ 48%]
tests/unit/test_filters.py ........................                                                                             [ 58%]
tests/unit/test_json_serializer.py .                                                                                            [ 58%]
tests/unit/test_matchers.py ............................                                                                        [ 69%]
tests/unit/test_migration.py ...                                                                                                [ 70%]
tests/unit/test_persist.py ....                                                                                                 [ 72%]
tests/unit/test_request.py .................                                                                                    [ 78%]
tests/unit/test_response.py ....                                                                                                [ 80%]
tests/unit/test_serialize.py .............F.                                                                                    [ 86%]
tests/unit/test_stubs.py ..                                                                                                     [ 87%]
tests/unit/test_unittest.py .........                                                                                           [ 90%]
tests/unit/test_vcr.py .......................                                                                                  [ 99%]
tests/unit/test_vcr_import.py .                                                                                                 [100%]

============================================================== FAILURES ===============================================================
____________________________________________________ test_serialize_binary_request ____________________________________________________

    def test_serialize_binary_request():
        msg = "Does this HTTP interaction contain binary data?"

        request = Request(method="POST", uri="http://localhost/", body=b"\x8c", headers={})

        try:
>           serialize({"requests": [request], "responses": [{}]}, jsonserializer)

tests/unit/test_serialize.py:111: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
vcr/serialize.py:59: in serialize
    return serializer.serialize(data)
vcr/serializers/jsonserializer.py:19: in serialize
    return json.dumps(cassette_dict, indent=4) + "\n"
.tox/py310/lib/python3.10/site-packages/simplejson/__init__.py:395: in dumps
    **kw).encode(obj)
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:300: in encode
    chunks = list(chunks)
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:714: in _iterencode
    for chunk in _iterencode_dict(o, _current_indent_level):
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:668: in _iterencode_dict
    for chunk in chunks:
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:544: in _iterencode_list
    for chunk in chunks:
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:668: in _iterencode_dict
    for chunk in chunks:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

dct = {'body': b'\x8c', 'headers': {}, 'method': 'POST', 'uri': 'http://localhost/'}, _current_indent_level = 4

    def _iterencode_dict(dct, _current_indent_level):
        if not dct:
            yield '{}'
            return
        if markers is not None:
            markerid = id(dct)
            if markerid in markers:
                raise ValueError("Circular reference detected")
            markers[markerid] = dct
        yield '{'
        if _indent is not None:
            _current_indent_level += 1
            newline_indent = '\n' + (_indent * _current_indent_level)
            item_separator = _item_separator + newline_indent
            yield newline_indent
        else:
            newline_indent = None
            item_separator = _item_separator
        first = True
        if _PY3:
            iteritems = dct.items()
        else:
            iteritems = dct.iteritems()
        if _item_sort_key:
            items = []
            for k, v in dct.items():
                if not isinstance(k, string_types):
                    k = _stringify_key(k)
                    if k is None:
                        continue
                items.append((k, v))
            items.sort(key=_item_sort_key)
        else:
            items = iteritems
        for key, value in items:
            if not (_item_sort_key or isinstance(key, string_types)):
                key = _stringify_key(key)
                if key is None:
                    # _skipkeys must be True
                    continue
            if first:
                first = False
            else:
                yield item_separator
            yield _encoder(key)
            yield _key_separator
            if isinstance(value, string_types):
                yield _encoder(value)
            elif _PY3 and isinstance(value, bytes) and _encoding is not None:
>               yield _encoder(value)
E               UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte

.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:633: UnicodeDecodeError

During handling of the above exception, another exception occurred:

    def test_serialize_binary_request():
        msg = "Does this HTTP interaction contain binary data?"

        request = Request(method="POST", uri="http://localhost/", body=b"\x8c", headers={})

        try:
            serialize({"requests": [request], "responses": [{}]}, jsonserializer)
        except (UnicodeDecodeError, TypeError) as exc:
>           assert msg in str(exc)
E           assert 'Does this HTTP interaction contain binary data?' in "'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte"
E            +  where "'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte" = str(UnicodeDecodeError('utf-8', b'\x8c', 0, 1, 'invalid start byte'))

tests/unit/test_serialize.py:113: AssertionError
======================================================= short test summary info =======================================================
FAILED tests/unit/test_serialize.py::test_serialize_binary_request - assert 'Does this HTTP interaction contain binary data?' in "'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte"
============================================= 1 failed, 254 passed, 10 skipped in 11.74s ==============================================
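
For context, the "used over the built-in json module" behaviour described at the top of this report usually comes from an import fallback. The snippet below is a minimal sketch of that pattern, assuming the serializer module prefers simplejson when it is importable. It is not copied from the vcrpy source.

```python
# Assumed import pattern: prefer simplejson if it is installed,
# otherwise fall back to the standard-library json module.
try:
    import simplejson as json
except ImportError:
    import json

# Shows which backend was actually picked up in this environment.
print(json.__name__)
```

With this pattern, merely having simplejson installed in the environment changes which encoder handles the cassette, and therefore which exception a binary body triggers.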

hartwork commented 1 year ago

@mgorny before having a closer look:

mgorny commented 1 year ago

@mgorny before having a closer look:

* Which versions of VCR.py are known affected or not affected, what did you use and try?

5.1.0 failed, 5.0.0 passed.

* Is there a related ticket in Gentoo that I just failed to find?

No, I noticed while bumping, so I deselected it.

Bisect says it's 4f70152e7ce510cde41cf071585cbdb481e4e8f2 (CC @jairhenrique):

commit 4f70152e7ce510cde41cf071585cbdb481e4e8f2 (HEAD)
Author:     Jair Henrique <jair.henrique@gmail.com>
AuthorDate: 2023-06-27 14:12:40 +0200
Commit:     Jair Henrique <jair.henrique@gmail.com>
CommitDate: 2023-06-27 22:36:26 +0200

    Enable rule B (flake8-bugbear) on ruff

Prior to this change, the exception is:

ValueError: 'utf-8' codec can't decode byte 0x45 in position 0: invalid start byteDoes this HTTP interaction contain binary data? If so, use a different serializer (like the yaml serializer) for this request?

After it, it is:

ValueError: 'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte

Without simplejson installed, it is:

ValueError: Does this HTTP interaction contain binary data? If so, use a different serializer (like the yaml serializer) for this request?

Perhaps the simplest solution would be to remove simplejson support entirely; I suspect it's only there for py2 support.
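
To illustrate the "without simplejson" case: the standard-library encoder rejects bytes values with a TypeError instead of trying to decode them, which is why the rewritten message still appears there. A minimal sketch using only the stdlib (the helper name `stdlib_error_type` is made up for this example):

```python
import json


def stdlib_error_type(payload):
    """Return the name of the exception the stdlib json encoder raises, if any."""
    try:
        json.dumps(payload, indent=4)
    except (TypeError, UnicodeDecodeError) as exc:
        return type(exc).__name__
    return None


# Same shape as the request dict in the traceback above.
print(stdlib_error_type({"body": b"\x8c", "headers": {}, "method": "POST"}))
# prints "TypeError"
```

simplejson, by contrast, attempts to decode the bytes as UTF-8 (see the `_iterencode_dict` frame in the traceback), so the failure surfaces as a UnicodeDecodeError.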

mgorny commented 1 year ago

Oh, and my educated guess is that this is the problematic part of the change:

diff --git a/vcr/serializers/jsonserializer.py b/vcr/serializers/jsonserializer.py
index 5ffef3e..55cf780 100644
--- a/vcr/serializers/jsonserializer.py
+++ b/vcr/serializers/jsonserializer.py
@@ -17,13 +17,5 @@ def serialize(cassette_dict):

     try:
         return json.dumps(cassette_dict, indent=4) + "\n"
-    except UnicodeDecodeError as original:  # py2
-        raise UnicodeDecodeError(
-            original.encoding,
-            b"Error serializing cassette to JSON",
-            original.start,
-            original.end,
-            original.args[-1] + error_message,
-        )
-    except TypeError:  # py3
-        raise TypeError(error_message)
+    except TypeError:
+        raise TypeError(error_message) from None

Note that it removes the exception rewriting for the "py2" case, and I guess simplejson falls into that case.
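
One possible way to restore the rewriting for both encoders is to catch both exception types. This is only a sketch, not necessarily the fix adopted upstream. The `dumps` parameter and `simplejson_like_dumps` are hypothetical stand-ins so the example runs without simplejson installed; the stand-in only mimics the UTF-8 decoding of bytes values seen in the traceback.

```python
import json

ERROR_MESSAGE = (
    "Does this HTTP interaction contain binary data? "
    "If so, use a different serializer (like the yaml serializer) "
    "for this request?"
)


def serialize(cassette_dict, dumps=json.dumps):
    """Serialize to JSON, rewriting both failure modes to one message."""
    try:
        return dumps(cassette_dict, indent=4) + "\n"
    except (TypeError, UnicodeDecodeError):
        # TypeError: stdlib json on bytes values.
        # UnicodeDecodeError: an encoder that tries to decode bytes as UTF-8.
        raise TypeError(ERROR_MESSAGE) from None


def simplejson_like_dumps(obj, **kwargs):
    """Hypothetical stand-in: decode bytes values as UTF-8 before encoding."""
    decoded = {
        key: value.decode("utf-8") if isinstance(value, bytes) else value
        for key, value in obj.items()
    }
    return json.dumps(decoded, **kwargs)


request = {"body": b"\x8c", "headers": {}, "method": "POST"}
for backend in (json.dumps, simplejson_like_dumps):
    try:
        serialize(request, dumps=backend)
    except TypeError as exc:
        print(backend.__name__, "->", exc)
```

With both exception types handled, the `msg in str(exc)` assertion in the test would hold regardless of which encoder is in use.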

hartwork commented 1 year ago

@mgorny thanks for the additional details! :pray:

mgorny commented 1 year ago

Thanks!