mysociety / citizenconnect

Citizen Connect project for the NHS: reporting problems, leaving reviews
https://www.nhs.uk/careconnect/choices

Check response code before processing XML from the Choices API #1199

Closed stevenday closed 11 years ago

stevenday commented 11 years ago

Currently we're getting errors like:

Traceback (most recent call last):
  File "./manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/__init__.py", line 443, in execute_from_command_line
    utility.execute()
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/__init__.py", line 382, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/base.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/data/vhost/citizenconnect.mysociety.org/virtualenv-citizenconnect/lib/python2.6/site-packages/django/core/management/base.py", line 371, in handle
    return self.handle_noargs(**options)
  File "/data/vhost/citizenconnect.mysociety.org/citizenconnect/reviews_display/management/commands/fetch_reviews_from_choices_api.py", line 34, in handle_noargs
    for review in reviews:
  File "/data/vhost/citizenconnect.mysociety.org/citizenconnect/reviews_display/reviews_api.py", line 59, in next
    self.load_next_page()
  File "/data/vhost/citizenconnect.mysociety.org/citizenconnect/reviews_display/reviews_api.py", line 211, in load_next_page
    reviews_from_xml = self.extract_reviews_from_xml(xml)
  File "/data/vhost/citizenconnect.mysociety.org/citizenconnect/reviews_display/reviews_api.py", line 166, in extract_reviews_from_xml
    root = ET.fromstring(xml)
  File "lxml.etree.pyx", line 2993, in lxml.etree.fromstring (src/lxml/lxml.etree.c:63285)
  File "parser.pxi", line 1617, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:93571)
  File "parser.pxi", line 1495, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:92370)
  File "parser.pxi", line 1011, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:89010)
  File "parser.pxi", line 577, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:84711)
  File "parser.pxi", line 676, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:85816)
  File "parser.pxi", line 616, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:85138)
lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: P line 6 and BODY, line 8, column 8

### output captured before 'fetch_reviews_from_choices_api' exited ###

I believe this happens because the code is trying to parse the NHS Choices error page (for whatever reason it has ended up there). That page is returned with a 503 status code, so we shouldn't be trying to process it at all; we should just raise an error saying that the API returned a 503.
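A minimal sketch of the check being asked for, not the project's actual code. It is written for Python 3 and uses the standard library's `xml.etree.ElementTree` as a stand-in for `lxml.etree`; `ChoicesAPIError` and `parse_api_response` are hypothetical names invented for illustration.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree


class ChoicesAPIError(Exception):
    """Hypothetical exception for a non-200 response from the API."""

    def __init__(self, status_code):
        super().__init__("Choices API returned HTTP %d" % status_code)
        self.status_code = status_code


def parse_api_response(status_code, body):
    """Parse `body` as XML only if the response status was 200.

    Any other status (for example the 503 error page) raises
    ChoicesAPIError before the XML parser ever sees the body.
    """
    if status_code != 200:
        raise ChoicesAPIError(status_code)
    return ET.fromstring(body)
```

With a check like this in place, the failure would be reported as an HTTP status problem rather than as an `XMLSyntaxError` from deep inside the parser.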

evdb commented 11 years ago

The change to using urllib2 for all requests means that an HTTPError will be raised in the above situations.

However, this means that the handling for empty responses (which the API poorly implements using 404 responses) will need to change: it must now catch the exception, test the code and act appropriately.
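The catch-and-test pattern described above could look roughly like the sketch below. It assumes Python 3, where `HTTPError` lives in `urllib.error` (on the Python 2.6 install in the tracebacks it is `urllib2.HTTPError`). `fetch_or_empty` and its `send_request` parameter are hypothetical names, not functions from the codebase.

```python
from urllib.error import HTTPError  # urllib2.HTTPError on Python 2


def fetch_or_empty(send_request, url):
    """Return the response from `send_request(url)`, or None on a 404.

    `send_request` is any callable that raises HTTPError for non-2xx
    responses, as urllib does by default. A 404 is treated as "no
    results"; every other HTTP error is re-raised to the caller.
    """
    try:
        return send_request(url)
    except HTTPError as e:
        if e.code == 404:
            return None
        raise
```

Re-raising everything except the 404 keeps genuine failures, such as a 500 or 503, visible in the error report instead of silently treating them as empty results.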

evdb commented 11 years ago

Sample error report now produced:

Traceback (most recent call last):
  .... snip ....
  File "/data/vhost/citizenconnect.test.mysociety.org/citizenconnect/reviews_display/reviews_api.py", line 74, in fetch_from_api
    response = self.api.send_api_request(url)
  File "/data/vhost/citizenconnect.test.mysociety.org/citizenconnect/organisations/choices_api.py", line 46, in send_api_request
  .... snip ....
  File "/usr/lib/python2.6/urllib2.py", line 518, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 500: Internal Server Error