mysociety / citizenconnect

Citizen Connect project for the NHS: reporting problems, leaving reviews
https://www.nhs.uk/careconnect/choices

All communication with the Choices API to retrieve reviews or ratings seems to be failing #1203

Closed: stevenday closed this issue 11 years ago

stevenday commented 11 years ago

Both fetching organisation ratings (run daily) and fetching new reviews of organisations (run hourly) have been failing for at least the last 48 hours, with an error that looks suspiciously like the Choices API is returning a 503 for every URL.

stevenday commented 11 years ago

Example errors:

Traceback (most recent call last):
  File "./manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/__init__.py", line 443, in execute_from_command_line
    utility.execute()
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/__init__.py", line 382, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/base.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/base.py", line 371, in handle
    return self.handle_noargs(**options)
  File "/data/vhost/citizenconnect.mysociety.org/citizenconnect/reviews_display/management/commands/get_reviews_from_choices_api.py", line 39, in handle_noargs
    for review in reviews:
  File "/data/vhost/citizenconnect.mysociety.org/citizenconnect/reviews_display/reviews_api.py", line 59, in next
    self.load_next_page()
  File "/data/vhost/citizenconnect.mysociety.org/citizenconnect/reviews_display/reviews_api.py", line 211, in load_next_page
    reviews_from_xml = self.extract_reviews_from_xml(xml)
  File "/data/vhost/citizenconnect.mysociety.org/citizenconnect/reviews_display/reviews_api.py", line 166, in extract_reviews_from_xml
    root = ET.fromstring(xml)
  File "lxml.etree.pyx", line 2993, in lxml.etree.fromstring (src/lxml/lxml.etree.c:63285)
  File "parser.pxi", line 1617, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:93571)
  File "parser.pxi", line 1495, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:92370)
  File "parser.pxi", line 1011, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:89010)
  File "parser.pxi", line 577, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:84711)
  File "parser.pxi", line 676, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:85816)
  File "parser.pxi", line 616, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:85138)
lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: P line 6 and BODY, line 8, column 8

### output captured before 'get_reviews_from_choices_api' exited ###

And:

Error updating rating for Queen Mary's
mismatched tag: line 8, column 2Error updating rating for Kingston Town Children's Centre
mismatched tag: line 8, column 2Error updating rating for Colville Health Centre
mismatched tag: line 8, column 2Error updating rating for Health at the Stowe
mismatched tag: line 8, column 2Error updating rating for Raymede Clinic - St Charles Hospital
... [Snipped, it goes on for every organisation]

Note that I think the actual parse errors are related to #1199, because the Choices error page contains a `<P>` tag that is never closed.

BenJam commented 11 years ago

I've had the go ahead from LY, she'll confirm below.

Lynne2 commented 11 years ago

please go ahead with this

evdb commented 11 years ago

The remote end (Akamai, Server: AkamaiGHost) appears to return a 403 for any request that does not have a User-Agent header:

$ export CHOICES_API_URL="http://v1.syndication.nhschoices.nhs.uk/organisations/gppractices/commentssince/2013/9/19.atom?apikey=SECRET"

$ curl -D - "$CHOICES_API_URL"
HTTP/1.1 403 Forbidden
Server: AkamaiGHost
Content-Type: text/html
.....

$ curl -H "User-Agent: Foo" -D - "$CHOICES_API_URL"
HTTP/1.1 200 OK
Content-Type: application/atom+xml
......

The 403 response body is the following invalid HTML (the `<P>` tag is never closed, which, as noted above, is what triggers the unexpected parse error):

<HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD><BODY>
<H1>Access Denied</H1>

You don't have permission to access "http&#58;&#47;&#47;v1&#46;syndication&#46;nhschoices&#46;nhs&#46;uk&#47;organisations&#47;gppractices&#47;commentssince&#47;2013&#47;9&#47;19&#46;atom&#63;" on this server.<P>
Reference&#32;&#35;18&#46;8cdef50&#46;1380199413&#46;59de099
</BODY>
</HTML>
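
One way to make this failure mode easier to diagnose would be to check the status code and content type before handing the body to the XML parser. The following is only a minimal sketch: `ChoicesAPIError` and `parse_choices_response` are hypothetical names that are not from the citizenconnect codebase, and it uses the standard library parser rather than lxml.

```python
import xml.etree.ElementTree as ET


class ChoicesAPIError(Exception):
    """Hypothetical error for a response that is not a usable XML feed."""


def parse_choices_response(status, content_type, body):
    """Return the parsed XML root, or raise ChoicesAPIError with context."""
    # Reject error responses before they ever reach the XML parser.
    if status != 200:
        raise ChoicesAPIError("Unexpected HTTP status %d" % status)
    if "xml" not in content_type:
        raise ChoicesAPIError("Unexpected content type %r" % content_type)
    try:
        return ET.fromstring(body)
    except ET.ParseError as exc:
        raise ChoicesAPIError("Could not parse response: %s" % exc)


# Abbreviated stand-in for the Akamai error page: <P> is never closed.
ERROR_PAGE = "<HTML><BODY>\n<H1>Access Denied</H1>\n<P>\nReference\n</BODY>\n</HTML>"
```

With a guard like this, calling `parse_choices_response(403, "text/html", ERROR_PAGE)` would report the HTTP status rather than a tag mismatch deep inside the parser.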

The last successful fetch was around 19 September 2013, and the last change to the citizenconnect codebase was on 16 September, so it seems likely that this breakage was caused by a change to the proxy in front of the API.

I'll add a User-Agent header to the requests that we send to the Choices API now; that should fix this issue.
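
For illustration, setting an explicit User-Agent could look roughly like this. It is a sketch using Python 3's `urllib.request`; the HTTP client citizenconnect actually uses is not shown in this thread, and `USER_AGENT`, `build_choices_request` and `fetch` are hypothetical names.

```python
import urllib.request

# Hypothetical identifier; the value actually deployed is not shown in this thread.
USER_AGENT = "citizenconnect-example/0.1"


def build_choices_request(url, user_agent=USER_AGENT):
    """Build a request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Fetch the URL and return the raw response body as bytes."""
    request = build_choices_request(url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

Keeping the request construction separate from the network call means the header can be checked in a unit test without contacting the API.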

evdb commented 11 years ago

Now deployed to the production server; ratings and reviews are being fetched as expected.