mysociety / citizenconnect

Citizen Connect project for the NHS: reporting problems, leaving reviews
https://www.nhs.uk/careconnect/choices

NHS Choices API (preview) next page error #1207

Open evdb opened 11 years ago

evdb commented 11 years ago

In our cron scripts for vhosts using the http://v1.syndication.nhschoicespreview.nhs.uk/ endpoint (that is, testing and uat) we are getting the following error. This behaviour is not seen on the production Choices API.

Traceback (most recent call last):
  .... snip ....
  File "/data/vhost/citizenconnect-uat.staging.mysociety.org/citizenconnect/organisations/choices_api.py", line 46, in send_api_request
    response = opener.open(url)
  ... snip ...
  File "/usr/lib/python2.6/urllib2.py", line 518, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 500: Internal Server Error
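
The traceback shows the `HTTPError` propagating out of `send_api_request`. Below is a minimal sketch of catching it so that the status code and the error body can be inspected or logged. It is written for Python 3's `urllib`, whereas the vhost above runs Python 2.6, where the same exception is `urllib2.HTTPError`. The helper name `fetch_page` and its return shape are hypothetical and not taken from the project's code.

```python
import urllib.error
import urllib.request


def fetch_page(url, opener=None):
    """Fetch url and return (status, body). Hypothetical helper.

    An HTTPError is itself a file-like response object, so the body the
    server sent along with a 4xx/5xx status can still be read from it.
    """
    opener = opener or urllib.request.build_opener()
    try:
        response = opener.open(url)
    except urllib.error.HTTPError as error:
        # Return the failure instead of letting it abort the whole run.
        return error.code, error.read()
    try:
        return response.getcode(), response.read()
    finally:
        response.close()
```

Returning the status rather than raising leaves it to the caller to decide whether a failed page should stop the import.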

Manually calling the urls that this code is following reveals that the last page of results is 500ing:

Penultimate page returns 200 with this "next page" link tag:

GET http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23.atom?apikey=SECRET&page=5

....
<link rel="next" type="application/atom+xml" title="next" length="1000" href="http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23?apikey=SECRET&amp;page=6" />
....
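
For reference, a small sketch of how a "next page" link like the one above can be pulled out of an Atom feed with the standard library. The function name `find_next_link` is hypothetical, and the sample feed is cut down to just the link element quoted above.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def find_next_link(feed_xml):
    """Return the href of the feed-level rel="next" link, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.findall(ATOM_NS + "link"):
        if link.get("rel") == "next":
            return link.get("href")
    return None


# Cut-down sample containing only the link element shown above.
sample = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<link rel="next" type="application/atom+xml" title="next" length="1000" '
    'href="http://v1.syndication.nhschoicespreview.nhs.uk/organisations/'
    'hospitals/commentssince/2013/9/23?apikey=SECRET&amp;page=6" />'
    "</feed>"
)
next_url = find_next_link(sample)
```

The parser decodes `&amp;` in the attribute, so the returned URL contains a plain `&` before `page=6`.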

The last page returns 500. This is the complete response body (it appears to be valid XML with results, cut off part-way through and followed by an appended HTML error message):

GET http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23.atom?apikey=SECRET&page=6

<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><title type="text">NHS Choices - Comments Since 23 September 2013 - Page 6 of 6</title><id>uuid:5361ce3e-d12b-4c18-8c9f-4e99d626da3e;id=21</id><rights type="text">© Crown Copyright 2009</rights><updated>2013-09-23T11:44:17+01:00</updated><category term="CommentsAndRatings" /><logo>http://www.nhs.uk/nhscwebservices/documents/logo1.jpg</logo><author><name>NHS Choices</name><uri>http://www.nhs.uk</uri><email>webservices@nhschoices.nhs.uk</email></author><link rel="self" type="application/atom+xml" title="NHS Choices - Comments Since 23 September 2013 - Page 6 of 6" href="http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23?apikey=SECRET" /><link rel="first" type="application/atom+xml" title="first" length="1000" href="http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23?apikey=SECRET&amp;page=1" /><link rel="prev" type="application/atom+xml" title="prev" length="1000" href="http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23?apikey=SECRET&amp;page=5" /><link rel="last" type="application/atom+xml" title="last" length="1000" href="http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23?apikey=SECRET&amp;page=6" /><tracking xmlns="http://syndication.nhschoices.nhs.uk/services">&lt;img style="border: 0; width: 1px; height: 1px;" alt="" src="http://statse.webtrendslive.com/dcss9yzisf9xjyg74mgbihg8p_8d2u/njs.gif?dcsuri=/organisations%2fhospitals%2fcommentssince%2f2013%2f9%2f23&amp;amp;wt.js=no&amp;amp;wt.cg_n=syndication"/&gt;</tracking><entry><id>231345</id><title type="text">Item 231345 deleted</title><published>2013-09-23T11:44:17+01:00</published><updated>2013-09-23T11:44:17+01:00</updated><author><name>NHS Choices</name><uri>http://www.nhs.uk</uri><email>webservices@nhschoices.nhs.uk</email></author><link rel="related" 
type="application/xhtml+xml" title="Organisation commented on" length="1000" href="http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/40342?apikey=SECRET" /><link rel="related" type="application/xhtml+xml" title="General Surgery" length="1000" href="http://v1.syndication.nhschoicespreview.nhs.uk/services/types/srv0045/co92035?apikey=SECRET" /><category term="deletion" label="Whether this is a comment, reply or deletion" scheme="commentType" /><content type="text">Item  has been deleted and should be removed from your cache</content><postingid xmlns="http://syndication.nhschoices.nhs.uk/schemas/comments">231345</postingid><postingorganisationid xmlns="http://syndication.nhschoices.nhs.uk/schemas/comments">0</postingorganisationid></entry><entry><id>231337</id><title type="text">Ward 17</title><published>2013-09-23T09:22:46+01:00</published><updated>2013-09-23T09:32:36+01:00</updated><author><name>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" >
<head>
    <title>NHS Choices Syndication</title>
    <meta http-equiv="Content-Style-Type" content="text/css" />

</head>
<body>
    <div id="MainContent">
        <img style="border: 0; width: 1px; height: 1px;" alt="" src="http://statse.webtrendslive.com/dcss9yzisf9xjyg74mgbihg8p_8d2u/njs.gif?dcsuri=/&amp;wt.js=no&amp;wt.cg_n=syndication"/>

    <h1>NHS Choices Syndication - Status Code 500</h1>
    <h2>500 InternalServerError</h2>
    <p>OperationResult: type=InternalServerError, statusCode=500</p>

    </div>
</body>
</html>
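
A body like the one above, where the feed stops mid-entry and an HTML page follows, will not parse as XML. A rough sketch of checking for that before trusting a response, with `is_well_formed_xml` as a hypothetical helper name and a heavily shortened stand-in for the real body:

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(body):
    """Return True if body parses as a single well-formed XML document."""
    try:
        ET.fromstring(body)
    except ET.ParseError:
        return False
    return True


# Shortened stand-in for the response above: feed cut off inside <name>,
# then the HTML error page appended.
truncated = (
    '<feed xmlns="http://www.w3.org/2005/Atom"><entry><author><name>\n'
    "<!DOCTYPE html><html><body><h1>Status Code 500</h1></body></html>"
)
complete = '<feed xmlns="http://www.w3.org/2005/Atom"><entry /></feed>'
```

Here `is_well_formed_xml(truncated)` is false and `is_well_formed_xml(complete)` is true, so the check distinguishes the two cases even if the status code were to be rewritten by something in between.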

Will contact Choices team to see if we are requesting this page incorrectly.

evdb commented 11 years ago

Have tried submitting this report to the NHS Choices team through their preferred route (web form at https://www.nhs.uk/aboutnhschoices/Pages/Feedback.aspx?iType=tech). It has timed out twice and returned no data once. If I don't get an email autoresponse in the next hour or so I'll try to contact them via email.

evdb commented 11 years ago

Have emailed directly to Anil as well.

evdb commented 11 years ago

Now have NHS Choices Service Desk issue numbers for these reports:

evdb commented 11 years ago

I think my assertion that this was only an issue on the last page of results was wrong. It appears that there might be some bad data in the results that causes the 500 error. The following URLs show that it is now the penultimate page that has the error:

GET http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23.atom?apikey=SECRET&page=17
200 OK

GET http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23.atom?apikey=SECRET&page=18
500 Internal Server Error

GET http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23.atom?apikey=SECRET&page=19
200 OK

GET http://v1.syndication.nhschoicespreview.nhs.uk/organisations/hospitals/commentssince/2013/9/23.atom?apikey=SECRET&page=20
404 Not Found

stevenday commented 11 years ago

I think this is the same issue that #1125 describes: pages can randomly 404 at any point, and they're OK with that, so we need to make the command more robust to these failures, which is unfortunately going to break the nice iteration model we have at the moment.

evdb commented 11 years ago

@stevenday perhaps - but I'm seeing 500 errors instead of 404. But perhaps that is the proxy sitting in front of the production API altering the responses, which does not happen on the preview API?

There certainly seem to be a fair few problems, and the proposed solution in #1125 would make a lot of sense to minimise the effects of this here too.
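
As a rough illustration only (this is not the solution proposed in #1125, which is not reproduced here), one way to tolerate individual bad pages is to address pages by number instead of following each page's "next" link, since a failing page yields no link to follow. In this sketch `collect_pages` and the `fetch` callable are hypothetical, and the last page number is assumed to be known already, for example from the `rel="last"` link on a page that did load.

```python
def collect_pages(fetch, last_page):
    """Fetch pages 1..last_page, skipping any that fail.

    fetch is a hypothetical callable that takes a page number and
    returns (status, body). Returns (bodies_by_page, failed_pages).
    """
    bodies = {}
    failed = []
    for page in range(1, last_page + 1):
        status, body = fetch(page)
        if status == 200:
            bodies[page] = body
        else:
            # Record the failure and carry on instead of aborting the run.
            failed.append((page, status))
    return bodies, failed


def fake_fetch(page):
    """Stand-in for a real request, mimicking the statuses seen above."""
    if page == 18:
        return 500, b""
    return 200, b"<feed />"


bodies, failed = collect_pages(fake_fetch, last_page=19)
```

With the stand-in above, page 18 ends up in `failed` while page 19 is still collected. The list of failed pages could then be retried later or reported, depending on how much missing data is acceptable.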