frostming / unearth

A utility to fetch and download python packages
https://unearth.readthedocs.io
MIT License

json.decoder.JSONDecodeError during find_all_packages #89

Open paugier opened 10 months ago

paugier commented 10 months ago

On a GitLab CI runner, I get a traceback when using PDM (https://github.com/pdm-project/pdm/issues/2532). I think the problem is related to unearth, because the exception can be reproduced with unearth alone:

$ python3.9 -c "from unearth import PackageFinder as F; f = F(index_urls=['https://pypi.org/simple/']); print(list(f.find_all_packages('flit-core')))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/finder.py", line 295, in find_all_packages
    self._find_packages(package_name, allow_yanked), hashes=hashes or {}
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/finder.py", line 275, in _find_packages
    return sorted(all_packages, key=self._sort_key, reverse=True)
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/collector.py", line 135, in collect_links_from_location
    yield from _collect_links_from_index(session, location)
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/collector.py", line 85, in parse_json_response
    data = json.loads(page.content)
  File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 10236 (char 10235)

Interestingly, (i) this code runs fine locally (and even locally in the same Docker image used for the CI) and (ii) I can install packages with pip in the Gitlab CI.

Additional context

Cause https://github.com/pdm-project/pdm/issues/2532

frostming commented 10 months ago

Since I cannot reproduce it, can you inspect the response content at exactly this line:

  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/collector.py", line 85, in parse_json_response
    data = json.loads(page.content)

Print page.content and you can probably figure out what the problem is.
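A minimal sketch of that kind of inspection, assuming the body is available as bytes (as page.content is here). The helper name describe_json_failure and the context parameter are hypothetical and not part of unearth; only the standard-library json module is used.

```python
import json


def describe_json_failure(raw: bytes, context: int = 80) -> str:
    """Summarize where and why `raw` stops being valid JSON (hypothetical helper)."""
    # Decode leniently so a body cut in the middle of a multi-byte character
    # still reaches the JSON parser instead of raising UnicodeDecodeError.
    text = raw.decode("utf-8", errors="replace")
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # exc.msg and exc.pos are the message and character index reported
        # by the parser; the tail of the body shows where the data stops.
        return (
            f"{exc.msg} (char {exc.pos} of {len(text)}); "
            f"body ends with: {text[-context:]!r}"
        )
    return "valid JSON"


if __name__ == "__main__":
    # A deliberately truncated document, similar in shape to the one above.
    print(describe_json_failure(b'{"files":[{"url":"http'))
```

Reporting the tail of the body next to the parser position makes it easy to tell a truncated response from one that is malformed somewhere in the middle.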

paugier commented 10 months ago
  File "/builds/fluiddyn/unearth/src/unearth/collector.py", line 88, in parse_json_response
    raise RuntimeError(page.content)
RuntimeError: b'{"files":[{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc1-py2.py3-none-any.whl","hashes":{"sha256":"1d717e7336997feed076c4f5dbdbe9ce45062e680f2b1de319b4c759f809a561"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84110,"upload-time":"2019-11-17T20:54:08.119802Z","url":"https://files.pythonhosted.org/packages/12/4f/8a0a7b2033b8a80451d214a289aecf486afdfb8e155b25986b0cbd3eb6e8/flit_core-2.0rc1-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc1.tar.gz","hashes":{"sha256":"d78f4b5b8fb2b484a98974b6da8d0edc8e7af55f60da7f40e0a9ddd2c36a5932"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22702,"upload-time":"2019-11-17T20:54:10.798088Z","url":"https://files.pythonhosted.org/packages/7f/8c/583b4412da71153ec70ed78341983c242a234d47abcfc8485284c6bb7b48/flit_core-2.0rc1.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc2-py2.py3-none-any.whl","hashes":{"sha256":"35a83504f509fcfd19bc53859d938cf2ad3385a2a19bfeb1745d1c957d39115c"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84124,"upload-time":"2019-11-17T21:01:19.976000Z","url":"https://files.pythonhosted.org/packages/d0/72/0fe258ce61fa1b59adb6c76a701b19a96e0033fbe054b297a2012e33ad44/flit_core-2.0rc2-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc2.tar.gz","hashes":{"sha256":"b34eef2a6da426c659b5bbfc7a18cbfba2a72bbf7dc20d75a15fd5fc90c1d937"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":22697,"upload-time":"2019-11-17T21:01:22.111432Z","url":"https://files.pythonhosted.org/packages/2f/1b/41ac0da91712d9c3e7d06a6e1eb7dfe616c96a14e4036e9b9c37ea9ee6f8/flit_core-2.0rc2.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc3-py2.py3-none-any.whl","hashes":{"sha256":"9c5e882e51ddb4206626f576f0a8217ebdf011ab34aeb9d4bb91f101cad03981"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84135,"upload-time":"2019-11-19T09:38:40.419494Z","url":"https://files.pythonhosted.org/packages/6a/ff/be83d749ff1ad481b09e1e6069178c2d5d6c56a10b493353f0cc405e8475/flit_core-2.0rc3-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc3.tar.gz","hashes":{"sha256":"207a70987a60e67c475955996813ed95d485f97eee288d03fc04bff01b2c56b8"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22527,"upload-time":"2019-11-19T09:38:42.261378Z","url":"https://files.pythonhosted.org/packages/2d/36/bcd4bfb529261a27f113eb2e6fb9f5e5aed4d0b79be59a76ce65689a1892/flit_core-2.0rc3.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0-py2.py3-none-any.whl","hashes":{"sha256":"6315800ae208f0f1de1ee89997e16f69dacc5e18d3fd2a65e4e518e3d78dbdda"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84102,"upload-time":"2019-11-23T09:24:17.224079Z","url":"https://files.pythonhosted.org/packages/dc/81/1f336b50c81e5345aafe7469e4f4c1104faa82b76e6e9885456b47d898fe/flit_core-2.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.tar.gz","hashes":{"sha256":"8e91d877c663b16e70d88a2f652bc9e0ae71501cbb81c5ab8d48c838e731ba80"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":22527,"upload-time":"2019-11-23T09:24:19.013574Z","url":"https://files.pythonhosted.org/packages/ec/cc/60e05480a5bf4b44ee1dbd179ca715ca4d192597d054e8c97bc0403060e8/flit_core-2.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.1-py2.py3-none-any.whl","hashes":{"sha256":"1eb2bf3fd805560ed3ad6abca365a03681d1bf1f7d80707dc3bc3ce6833d52f4"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84353,"upload-time":"2019-11-23T13:23:44.884523Z","url":"https://files.pythonhosted.org/packages/36/6a/b0e5ba2ad9d801887c8df7095535635292ce9b97f63cbb86f2b4d96dfebf/flit_core-2.0.1-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.1.tar.gz","hashes":{"sha256":"96e7708bc88c03b58e0d35f1171197737e701e29a901a8b49c13d3fd21866560"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22616,"upload-time":"2019-11-23T13:23:46.654078Z","url":"https://files.pythonhosted.org/packages/0e/3d/e9b28cd1d220ca635234e37567099bf4d50ea0a98a77b360b8d8042352e6/flit_core-2.0.1.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.2-py2.py3-none-any.whl","hashes":{"sha256":"c49546abb6afe371a13b78a2595d5afe1c0cd0aaa9dd753d800cd21259e51222"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84764,"upload-time":"2019-11-23T13:40:37.588869Z","url":"https://files.pythonhosted.org/packages/66/d2/c520657053052af580573e32aeafe50a9f68fc77c5d87ff551ca856d2aa3/flit_core-2.0.2-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.2.tar.gz","hashes":{"sha256":"9efcdca4ae84fd4d831e18d3cdb85a0b4f211a52d4b832408ff9a65bcc309928"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":22764,"upload-time":"2019-11-23T13:40:39.483209Z","url":"https://files.pythonhosted.org/packages/89/cf/a76f37dfded167e97936b8d53308abe5a8d00b97d417a6a405e69167e685/flit_core-2.0.2.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.1.0-py2.py3-none-any.whl","hashes":{"sha256":"c6dff661e9e290d51084cefc38b0971d692290e8a352d0b6cec6006be764b4d1"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":39162,"upload-time":"2019-11-26T09:48:48.530001Z","url":"https://files.pythonhosted.org/packages/b6/b0/50719ef7d12cd39ccfa4e48abb593764c8e4a6d0d9bdf7815be1949142ff/flit_core-2.1.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.1.0.tar.gz","hashes":{"sha256":"d2ebad9351c34083c16388d1df64a6e19579affcec02bfc05746714eef9f82fb"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22978,"upload-time":"2019-11-26T09:48:49.922785Z","url":"https://files.pythonhosted.org/packages/6c/6a/f945cf72957752ba0655260a8cb9c1139ea134c5f4b104bc48027349a6f4/flit_core-2.1.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.2.0-py2.py3-none-any.whl","hashes":{"sha256":"4df2b9b43f00764a81e7ea742829749183a7f5a9e360fa5c3a9e8643dadd716a"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":40023,"upload-time":"2020-01-14T10:57:57.314090Z","url":"https://files.pythonhosted.org/packages/25/4c/0b1ed660937d96ed192c376d3983dd7b052b887c8041ae020c950c0d06f0/flit_core-2.2.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.2.0.tar.gz","hashes":{"sha256":"4efb8bffc1a04d8e550e877f0c9acf53109a021cc27c2a89b1b467715dc1d657"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":23131,"upload-time":"2020-01-14T10:57:59.011481Z","url":"https://files.pythonhosted.org/packages/77/72/5dda5dc417a4e702e0d7e4a77e9802792a0e4a2daec2aeed915ead7db477/flit_core-2.2.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.3.0-py2.py3-none-any.whl","hashes":{"sha256":"a8f8904b534966712390e0a2e434cd33f76037730a0aaed299a286f9e18cac2b"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":40020,"upload-time":"2020-04-08T08:04:01.308900Z","url":"https://files.pythonhosted.org/packages/4b/3c/82798771fc1fd978c9225c5ae25eef45cb23b0df4728f208024a5b57901f/flit_core-2.3.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.3.0.tar.gz","hashes":{"sha256":"a50bcd8bf5785e3a7d95434244f30ba693e794c5204ac1ee908fc07c4acdbf80"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22995,"upload-time":"2020-04-08T08:04:02.852440Z","url":"https://files.pythonhosted.org/packages/bb/92/e51c58d463ebbabb7b226662655cef6d17d3b4b83f570b08f6be0fe2b1b8/flit_core-2.3.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.0.0-py3-none-any.whl","hashes":{"sha256":"a787754978cfe3c192a5fc6baf2179ae85b05395804de7d7fe2864d9431e8d03"},"requires-python":">=3.4","size":36921,"upload-time":"2020-09-06T10:57:29.444835Z","url":"https://files.pythonhosted.org/packages/a8/66/67758f788959c2557c4d0f80e4895c3c0802873be95b82a5213ea39542d7/flit_core-3.0.0-py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.0.0.tar.gz","hashes":{"sha256":"a465052057e2d6d957e6850e9915245adedfc4fd0dd5737d0791bf3132417c2d"},"requires-python":">=3.4","size":22037,"upload-time":"2020-09-06T10:57:30.734781Z","url":"https://files.pythonhosted.org/packages/0e/b9/040baf94b40c80081bbecbd90365a5d7765a1c07e31b6c949838cc4c93d1/flit_core-3.0.0.tar.gz","yanked":false},{"core-metadata":f
alse,"data-dist-info-metadata":false,"filename":"flit_core-3.1.0-py3-none-any.whl","hashes":{"sha256":"1d06e64a6af7e1fd1496563b160df29dd32714e00b473f3b763f6e6810476517"},"requires-python":">=3.4","size":38715,"upload-time":"2021-03-01T15:36:57.289033Z","url":"https://files.pythonhosted.org/packages/ed/0c/50352b127c0936cd59dd762db41d0e17986401c42ba613fa502e926d33ec/flit_core-3.1.0-py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.1.0.tar.gz","hashes":{"sha256":"22ff73be39a2b3c9e0692dfbbea3ad4a9d127e5733736a87dbb8ddcbf7309b1e"},"requires-python":">=3.4","size":22706,"upload-time":"2021-03-01T15:36:58.522778Z","url":"https://files.pythonhosted.org/packages/4c/8f/bed80c03f71cb3a2935882f391b53d2510c359191e5e0361650fa02d1365/flit_core-3.1.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.2.0-py3-none-any.whl","hashes":{"sha256":"6f25843e908dfc3e907b6b9ee71e3d185bcb5aebab8c3431e4e34c261e5ff1b5"},"requires-python":">=3.4","size":45693,"upload-time":"2021-03-21T21:20:19.175500Z","url":"http'

page.content ends with

[...],{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.2.0-py3-none-any.whl",
"hashes":{"sha256":"6f25843e908dfc3e907b6b9ee71e3d185bcb5aebab8c3431e4e34c261e5ff1b5"},
"requires-python":">=3.4","size":45693,"upload-time":"2021-03-21T21:20:19.175500Z","url":"http'

So indeed, the JSON text is unterminated: the response is cut off in the middle of a "url" value.

Questions:

paugier commented 10 months ago

Note that I can reproduce the exception with this simple code:

from datetime import datetime
from requests import Session

session = Session()

print("before get:", datetime.now())
resp = session.get(
    "https://pypi.org/simple/flit-core/",
    headers={
        "Accept": "application/vnd.pypi.simple.v1+json",
        "Cache-Control": "no-cache",
    },
    timeout=120,
)
print("after get:", datetime.now())

print(resp)
print(resp.content[-400:])
print(resp.json()["versions"])

which gives

before get: 2024-01-03 10:22:01.244619
after get: 2024-01-03 10:22:01.277003
<Response [200]>
b'g/packages/4c/8f/bed80c03f71cb3a2935882f391b53d2510c359191e5e0361650fa02d1365/flit_core-3.1.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.2.0-py3-none-any.whl","hashes":{"sha256":"6f25843e908dfc3e907b6b9ee71e3d185bcb5aebab8c3431e4e34c261e5ff1b5"},"requires-python":">=3.4","size":45693,"upload-time":"2021-03-21T21:20:19.175500Z","url":"http'
Traceback (most recent call last):
  File "/home/appuser/.local/lib/python3.9/site-packages/requests/models.py", line 960, in json
    return complexjson.loads(self.content.decode(encoding), **kwargs)
  File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 10236 (char 10235)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/builds/fluiddyn/fluidsim/tmp_bug_unearth.py", line 20, in <module>
    print(resp.json()["versions"])
  File "/home/appuser/.local/lib/python3.9/site-packages/requests/models.py", line 968, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Unterminated string starting at: line 1 column 10236 (char 10235)

Note that everything works with pip in the same environment (GitLab CI). In particular, "pip index versions flit-core" prints the correct data.
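One way to check whether the body was cut short in transit is to compare the declared Content-Length with the number of bytes actually received on the wire. The sketch below is an assumption about how one might do that, not something established in this thread; the helper name missing_bytes is hypothetical. The comparison must use wire bytes rather than the length of the decoded content, because a compressed response (Content-Encoding: gzip) declares its compressed size.

```python
from typing import Mapping, Optional


def missing_bytes(headers: Mapping[str, str], wire_bytes: int) -> Optional[int]:
    """Return how many declared bytes never arrived, or None if unknown.

    Hypothetical helper: `headers` are the response headers and `wire_bytes`
    is the number of (possibly compressed) bytes read from the connection.
    """
    declared = headers.get("Content-Length")
    if declared is None:
        # For example chunked transfer encoding: nothing to compare against.
        return None
    return int(declared) - wire_bytes


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(missing_bytes({"Content-Length": "20000"}, 12345))
```

With requests, the two inputs would presumably be resp.headers and resp.raw.tell() (urllib3 documents tell() as the number of bytes pulled over the wire), but that pairing is an assumption worth verifying in the affected environment. A positive result would point at the connection or an intermediate proxy rather than at the JSON parsing in unearth.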

frostming commented 10 months ago

> print(resp.content[-400:])

Did this line play an important role in reproducing the issue?

paugier commented 10 months ago

> Did this line play an important role in reproducing the issue?

No. This line was only to visualize what happens, i.e. the response is truncated.

frostming commented 10 months ago

> No. This line was only to visualize what happens, i.e. the response is truncated.

If the issue can be reproduced with requests alone, why not ask there? I don't think there is any requests behavior that can be tweaked via arguments to work around this.