Zeep version 4.1.0 installed via pip install zeep. I expect that if the response is entirely unparsable (perhaps because the server had an error and didn't even return XML), the module should raise a TransportError with the message about invalid XML, even in non-strict mode. What actually happens in non-strict mode is that we get an AttributeError raised by the internals of defusedxml.lxml.fromstring.
Here is a script that you can use to see the two behaviors in action. First run this:
import pretend  # pip install pretend
from zeep import Client
from zeep.transports import Transport
from zeep import Settings

def run(strict):
    client = Client('http://www.dneonline.com/calculator.asmx?wsdl', settings=Settings(strict=strict))
    response = pretend.stub(
        status_code=200,
        headers={},
        content="""
            Everything exploded! I am not XML at all.
        """)
    operation = client.service._binding._operations['Add']
    result = client.service._binding.process_reply(
        client, operation, response)
You can see the expected behavior with strict=True by calling run(True):
>>> run(True)
Traceback (most recent call last):
  File "/home/noahwork/.local/lib/python3.6/site-packages/zeep/loader.py", line 50, in parse_xml
    elementtree = fromstring(content, parser=parser, base_url=base_url)
  File "src/lxml/etree.pyx", line 3254, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1913, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1793, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 2
lxml.etree.XMLSyntaxError: Start tag expected, '<' not found, line 2, column 13

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/noahwork/.local/lib/python3.6/site-packages/zeep/wsdl/bindings/soap.py", line 204, in process_reply
    doc = parse_xml(content, self.transport, settings=client.settings)
  File "/home/noahwork/.local/lib/python3.6/site-packages/zeep/loader.py", line 67, in parse_xml
    "Invalid XML content received (%s)" % exc.msg, content=content
zeep.exceptions.XMLSyntaxError: Invalid XML content received (Start tag expected, '<' not found, line 2, column 13)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in run
  File "/home/noahwork/.local/lib/python3.6/site-packages/zeep/wsdl/bindings/soap.py", line 210, in process_reply
    content=response.content,
zeep.exceptions.TransportError: Server returned response (200) with invalid XML: Invalid XML content received (Start tag expected, '<' not found, line 2, column 13).
Content: '\n Everything exploded! I am not XML at all.\n
Then the behavior when strict is False:
>>> run(strict=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in run
  File "/home/noahwork/.local/lib/python3.6/site-packages/zeep/wsdl/bindings/soap.py", line 204, in process_reply
    doc = parse_xml(content, self.transport, settings=client.settings)
  File "/home/noahwork/.local/lib/python3.6/site-packages/zeep/loader.py", line 51, in parse_xml
    docinfo = elementtree.getroottree().docinfo
AttributeError: 'NoneType' object has no attribute 'getroottree'
This happens because zeep.loader.parse_xml calls defusedxml.lxml.fromstring, which in turn has the lines
When strict is True, the first line raises an etree.XMLSyntaxError, subsequently caught in parse_xml and reraised as a zeep.exceptions.XMLSyntaxError (which is in turn caught outside and turned into a TransportError). When strict is False, the first line returns None, resulting in the generic AttributeError which is NOT caught in parse_xml.
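The control flow described here can be modelled without zeep or lxml. The sketch below is only an illustration: lenient_parse, parse_xml_model, outcome, and the local XMLSyntaxError class are hypothetical stand-ins, not zeep's or defusedxml's actual code, and the standard-library xml.etree.ElementTree parser plays the role of lxml. It assumes, as this report describes, that the non-strict parse yields None instead of raising.

```python
import xml.etree.ElementTree as ET


class XMLSyntaxError(Exception):
    """Local stand-in for zeep.exceptions.XMLSyntaxError."""


def lenient_parse(content, strict):
    """Hypothetical parser: raises in strict mode, returns None otherwise."""
    try:
        return ET.fromstring(content)
    except ET.ParseError:
        if strict:
            raise
        return None  # mimics a recovering parser that found no root element


def parse_xml_model(content, strict):
    """Simplified model of the try/except block described in this report."""
    try:
        root = lenient_parse(content, strict)
        # When root is None, this attribute access raises AttributeError.
        return root.tag
    except ET.ParseError as exc:
        # Only syntax errors are translated; AttributeError passes through.
        raise XMLSyntaxError("Invalid XML content received (%s)" % exc) from exc


def outcome(content, strict):
    """Return the name of the exception type that escapes, or 'ok'."""
    try:
        parse_xml_model(content, strict)
    except Exception as exc:
        return type(exc).__name__
    return "ok"


bad = "Everything exploded! I am not XML at all."
print(outcome(bad, strict=True))
print(outcome(bad, strict=False))
```

In this model the strict call ends in the translated XMLSyntaxError, while the non-strict call lets the untranslated AttributeError escape, which mirrors the two tracebacks in this report.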
I propose either of the following fixes:
Change the try...except block in parse_xml to also catch AttributeError exceptions. (Simplest, though I don't know if there are other situations which can raise this error).
Replace the call to defusedxml.lxml.fromstring in parse_xml with a local version of the function that does exactly the same thing except that it also checks whether the rootelement is None, and if so, raises a zeep.exceptions.XMLSyntaxError.
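As a rough illustration of the second option, here is a minimal sketch of a wrapper that treats a missing root element as a syntax error. The names checked_fromstring and recovering_parse are hypothetical, the XMLSyntaxError class is a local stand-in for zeep.exceptions.XMLSyntaxError, and the standard-library parser substitutes for the lxml call, so this is not the actual patch.

```python
import xml.etree.ElementTree as ET


class XMLSyntaxError(Exception):
    """Local stand-in for zeep.exceptions.XMLSyntaxError."""

    def __init__(self, message, content=None):
        super().__init__(message)
        self.content = content


def recovering_parse(content):
    """Hypothetical stand-in for a non-strict parse: None when nothing parses."""
    try:
        return ET.fromstring(content)
    except ET.ParseError:
        return None


def checked_fromstring(content, parse=recovering_parse):
    """Parse content, treating a missing root element as a syntax error."""
    root = parse(content)
    if root is None:
        raise XMLSyntaxError(
            "Invalid XML content received (no root element)", content=content
        )
    return root
```

With a wrapper along these lines, the existing except clause in parse_xml would see the same exception type in both modes, so the outer conversion to TransportError would presumably apply unchanged. Compared with catching AttributeError, the explicit None check avoids masking unrelated attribute errors.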
If either of the above is acceptable, I can provide a patch.