open-contracting / kingfisher-collect

Downloads OCDS data and stores it on disk
https://kingfisher-collect.readthedocs.io
BSD 3-Clause "New" or "Revised" License

zambia: EuropeanDynamicsBase should check for valid JSON #1053

Closed sentry-io[bot] closed 5 months ago

sentry-io[bot] commented 7 months ago

https://www.zppa.org.zm/ocds/services/recordpackage/getrecordpackagelist returns the following HTML maintenance page with HTTP status 200. Since we can't rely on the HTTP status, we should check whether the response is HTML and, if so, treat it as an error.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
        <head>
                <title>PROD</title>
                <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
                <meta http-equiv="Content-Language"     content="en-uk" />
                <style>
                body    {margin:0; padding:0; background:#dfe5eb;}
                p               {width:600px; margin:100px auto 0 auto; padding:130px 0 20px 0; box-shadow:0 5px 15px -10px #476172; border-radius:0.5em;
                                font:normal 1em calibri,tahoma,arial,sans-serif; color:#36485a; text-align:center;
                                background:url(zppa-logo.png) #fff 50% 20px no-repeat;}
                </style>
        </head>

<body>
<p>The e-Procurement System (OCDS) is temporary unavailable due to maintenance.<br>Please click <a href="https://www.zppa.org.zm">here</a> to access the ZPPA portal.</p>
</body>
</html>
HTTP/1.1 200 OK
Date: Tue, 06 Feb 2024 22:23:23 GMT
Server: Apache/2.2.15 (Red Hat)
Last-Modified: Thu, 09 Jun 2016 11:55:37 GMT
ETag: "3c018c-445-534d719ae0040"
Accept-Ranges: bytes
Content-Length: 1093
X-Cnection: close
Content-Type: text/html; charset=UTF-8
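
The check described above could look roughly like the sketch below. It is only an illustration using the standard library: `looks_like_html` is a hypothetical helper, not an existing function in kingfisher-collect, and it assumes the caller can pass in the `Content-Type` header value and the raw body. It looks at the header first and falls back to sniffing the start of the body in case the header is missing or misleading.

```python
def looks_like_html(content_type: str, body: bytes) -> bool:
    """Return True if a response appears to be HTML rather than JSON.

    Hypothetical helper for illustration; not part of kingfisher-collect.
    """
    # The media type is the part of Content-Type before any parameters,
    # e.g. "text/html; charset=UTF-8" -> "text/html".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in ("text/html", "application/xhtml+xml"):
        return True
    # Fall back to sniffing the body, ignoring leading whitespace and case.
    start = body.lstrip()[:64].lower()
    return start.startswith((b"<!doctype html", b"<html"))
```

With the response quoted in this issue, `looks_like_html("text/html; charset=UTF-8", body)` would be true because of the header alone, so a spider could treat the response as an error before trying to parse it.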

Sentry Issue: REGISTRY-KINGFISHER-COLLECT-23

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
(9 additional frame(s) were not displayed)
...
  File "kingfisher_scrapy/spidermiddlewares.py", line 287, in process_spider_exception
    raise exception
  File "kingfisher_scrapy/util.py", line 104, in wrapper
    yield from decorated(self, response, **kwargs)
  File "kingfisher_scrapy/spiders/european_dynamics_base.py", line 47, in parse_list
    for number, url in enumerate(reversed(response.json()['packagesPerMonth'])):

Spider error processing %(request)s (referer: %(referer)s)
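
For reference, the `JSONDecodeError` in the traceback above is what Python's `json` module raises when the input starts with `<`, as an HTML page does. A minimal sketch of guarding the parse, assuming the body is available as text, might look like this. `load_package_list` is a hypothetical name used for illustration and is not the fix that was applied in the project.

```python
import json


def load_package_list(text):
    """Return the 'packagesPerMonth' value, or None if it can't be read.

    Hypothetical helper for illustration; not part of kingfisher-collect.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Raised for non-JSON input, such as an HTML maintenance page.
        return None
    if not isinstance(data, dict):
        # Valid JSON, but not the object the spider expects.
        return None
    # Returns None if the key is absent.
    return data.get("packagesPerMonth")
```

A caller would then check for `None` and report the response as an error instead of letting the exception propagate out of the spider callback.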