plone / diazo

Diazo applies a static HTML theme to a dynamic website
http://diazo.org
Other
41 stars 26 forks source link

Handle empty responses by returning an unthemed response #9

Closed davidjb closed 12 years ago

davidjb commented 12 years ago

There's an issue present whereby if an empty text/html response is returned for transformation, a lxml.etree.XMLSyntaxError will be raised by lxml thusly:

  File "/opt/diazo/eggs/Paste-1.7.5.1-py2.6.egg/paste/urlmap.py", line 203, in __call__
    return app(environ, start_response)
  File "/opt/diazo/src/diazo/lib/diazo/wsgi.py", line 614, in __call__
    return transform_middleware(environ, start_response)
  File "/opt/diazo/src/diazo/lib/diazo/wsgi.py", line 328, in __call__
    serializer = getHTMLSerializer(response.app_iter, encoding=input_encoding)
  File "/opt/diazo/eggs/repoze.xmliter-0.5-py2.6.egg/repoze/xmliter/utils.py", line 32, in getHTMLSerializer
    doctype=doctype,
  File "/opt/diazo/eggs/repoze.xmliter-0.5-py2.6.egg/repoze/xmliter/utils.py", line 17, in getXMLSerializer
    root = p.close()
  File "parser.pxi", line 1171, in lxml.etree._FeedParser.close (src/lxml/lxml.etree.c:79886)
  File "parser.pxi", line 561, in lxml.etree._ParserContext._handleParseResult (src/lxml/lxml.etree.c:74504)
  File "parser.pxi", line 650, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:75458)
  File "parser.pxi", line 601, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:74958)
lxml.etree.XMLSyntaxError: None

Such responses can be produced from some applications that only consider data in a header and not the body - one such example is Jenkins CI.

This pull request catches this syntax error and returns an unthemed response, and provides tests for the same. The changes move the preparation of the serializer to be earlier in the WSGI processing to ensure the response is returned unchanged (eg line 260 of lib/diazo/uwsgi.py onwards).

lrowe commented 12 years ago

Do your responses have a Content-Length header? Maybe a Content-Length: 0 should be checked for in should_transform. Of course we need to handle the case where its missing too and not fall over here.

I think you may have inadvertently broken the intended behaviour for HEAD requests (it needs explicit testing) - a HEAD request needs to have its headers fixed up as the Content-Length etc for the equivalent GET will now be wrong but it will now cause a parse error and bail out at your exception handler.

I think you may have inadvertently pushed this to master already ;)

lrowe commented 12 years ago
I think you may have inadvertently pushed this to master already ;)

Sorry, I was looking at the wrong commit here, you've not pushed yet. I'll try and fix up the HEAD issue on your branch but would still be interested to hear whether your empty responses have a Content-Length.

davidjb commented 12 years ago

Just checked the response and there's no Content-Length header present. Complete headers are:

[('Server', 'Winstone Servlet Engine v0.9.10'), 
('X-ConsoleAnnotator', '3xhBI4vgk9h6DM0yYigQgJjG8GtHYh5tc5WyxcYVqyyGO+0xkKAu+bhe8DPCmQ46RGM8KfaYMrVkfgYY4Ow6XQNt+vi9d1oubkzHRN342iCsDy3/KdTaxcEzN6UHAaqW'),
('X-Text-Size', '917'),
('X-More-Data', 'true'),
('Date', 'Tue, 07 Aug 2012 22:47:51 GMT'),
('X-Powered-By', 'Servlet/2.5 (Winstone/0.9.10)'),
('Content-Type', 'text/html; charset=UTF-8')]