flathub-infra / flatpak-external-data-checker

A tool for checking if the external data used in Flatpak manifests is still up to date
GNU General Public License v2.0

test_checker: Don't try to fetch 10 random bytes as JSON #402

Open wjt opened 7 months ago

wjt commented 7 months ago

This test was added in a5a8ee61018d01e75b052a53e5cf2e262a012b88 and keeps failing nondeterministically in CI:

Traceback (most recent call last):
  File "/usr/lib/python3.11/unittest/async_case.py", line 90, in _callTestMethod
    if self._callMaybeAsync(method) is not None:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/unittest/async_case.py", line 112, in _callMaybeAsync
    return self._asyncioRunner.run(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/home/runner/work/flatpak-external-data-checker/flatpak-external-data-checker/tests/test_checker.py", line 547, in test_get_json
    await checker._get_json("https://httpbingo.org/bytes/10")
  File "/home/runner/work/flatpak-external-data-checker/flatpak-external-data-checker/src/checkers/__init__.py", line 137, in _get_json
    return await response.json(content_type=None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/aiohttp/client_reqrep.py", line 1118, in json
    encoding = self.get_encoding()
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/aiohttp/client_reqrep.py", line 1072, in get_encoding
    encoding = chardet.detect(self._body)["encoding"]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/charset_normalizer/legacy.py", line 26, in detect
    r = from_bytes(byte_str).best()
        ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/charset_normalizer/api.py", line 212, in from_bytes
    is_multi_byte_decoder: bool = is_multi_byte_encoding(encoding_iana)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/charset_normalizer/utils.py", line 256, in is_multi_byte_encoding
    importlib.import_module("encodings.{}".format(name)).IncrementalDecoder,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1175, in _find_and_load
  File "<frozen importlib._bootstrap>", line 171, in __enter__
  File "<frozen importlib._bootstrap>", line 123, in acquire
KeyError: 140113390399552

There is probably an actual bug here, but not today.

gasinvein commented 2 months ago

The error appears to happen while guessing the charset. I suppose we could just disable charset detection in Checker._get_json() and either assume UTF-8 or use HTMLChecker._get_encoding() (which trusts the headers to specify the correct encoding)?
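
A minimal sketch of that idea, using only the standard library so it runs without a network. It is not the project's code: the helper names charset_from_header() and decode_json_body() are hypothetical, and how HTMLChecker._get_encoding() actually works is not shown here. The body is decoded with the charset named in the Content-Type header if there is one, otherwise as UTF-8, so no charset guessing is involved and garbage input fails with an ordinary ValueError.

```python
import json
from email.message import Message
from typing import Any, Optional


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any.

    Hypothetical helper for illustration; it trusts the header as-is.
    """
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset or None.
    return msg.get_content_charset()


def decode_json_body(body: bytes, content_type: Optional[str] = None) -> Any:
    """Parse a response body as JSON without any charset detection.

    Assumption: fall back to UTF-8 when the header names no charset.
    Undecodable or non-JSON bytes raise ValueError (UnicodeDecodeError
    and json.JSONDecodeError are both subclasses of it).
    """
    encoding = charset_from_header(content_type) or "utf-8"
    return json.loads(body.decode(encoding))
```

With aiohttp itself, passing an explicit encoding (for example response.json(encoding="utf-8", content_type=None)) should have the same effect, since get_encoding() is only called when no encoding is given; I have not tested that against the failing CI job.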