Strategies for dealing with broken clients? (invalid documents)

I have a client I can't change, which submits a malformed SOAP doc. The client assumes that there's already a namespace called soap with the uri http://schemas.xmlsoap.org/soap/. Who knows why the original server accepted this. I'm not in a position to judge or fix, only make something bug for bug compatible.

Let's demonstrate the behavior in a simplified way.

from lxml import etree
s = "<soapenv:Header><soap:authentication><soap:username>some_user</soap:username><soap:password>some_pass</soap:password></soap:authentication></soapenv:Header>" 
etree.fromstring(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "src/lxml/etree.pyx", line 3237, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1896, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1777, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 1
etree.XMLSyntaxError: Namespace prefix soapenv on Header is not defined, line 1, column 16

I had originally gone down the path of trying to use the listeners, like before_deserialize, but all of those seem to occur after lxml.XMLID tries to parse the object and throws a fault (https://github.com/arskom/spyne/blob/354723434db322e87a37b98a8a62b731954651e7/spyne/protocol/soap/soap11.py#L112).

I ended up monkey patching Soap11's _parse_xml_string to do some nasty string manipulation with some regex's to find out if the namespace was specified in the envelope, and if not, modify the incoming string to include it.

Will this fail on super large payloads? Probably (my payloads are less than 4k, it's not going to be an issue)
Do I think this was the right thing to do? Not really, but the POC works.

It makes sense that an invalid XML doc would cause a failure, but I'm wondering if there are other strategies I should be considering instead of this approach. Or perhaps it was the intent of those listeners in the Soap11 object to help cope with bad document and the functionality was lost over time. If it's the former, I'm all ears, if it's the later, I'm not totally sure how I would fix it.

I could see a path where create_in_document fires an event that allows you to manipulate the xml string before ctx.in_document gets a parsed XML object, and I would take a swing a PR for that functionality if anyone sees merit in it.

Thoughts?

Dealing with broken clients is just normal business for a soap server.

I always thought that stuff that needs to happen before create_in_document() needs to happen in the transport layer, because at that stage of request processing, you're effectively dealing with just a byte stream, and juggling bytes around is the job of the transports. So anything you do is going to be transport-specific.

You could add a one-liner to fire an event before create_in_document but you'd have to be doing transport-specific work in it anyway. So I'm not sure that's the best thing to do. Would you agree?

You can have a look at examples/events.py to see how the 'wsgi_call' hook could be used to modify docs before xml parsing takes place.

I may be approaching my limitations and I don't want to turn this into a help forum, but I do see value in sharing what I learned.

I'm having trouble working out how I'd manipulate the context object. wsgi_call gets you an object that looks like

WsgiMethodContext(
    call_start=1629217054.327723, 
    call_end=None, 
    is_closed=False, 
    app=<spyne.application.Application object at 0x10ae31d30>, 
    udc=None, 
    transport=<spyne.server.wsgi.WsgiTransportContext object at 0x10aed62b0>, 
    outprot_ctx=<spyne.context.ProtocolContext object at 0x10aec5f40>, 
    inprot_ctx=<spyne.context.ProtocolContext object at 0x10aec5fa0>, 
    protocol=<spyne.context.ProtocolContext object at 0x10aec5fa0>, 
    event=<spyne.context.EventContext object at 0x10aed6250>, 
    aux=None, 
    method_request_string=None, 
    files=[], 
    active=False, 
    _MethodContext__descriptor=None, 
    in_string=None, 
    in_document=None, 
    in_header_doc=None, 
    in_body_doc=None, 
    in_error=None, 
    in_header=None, 
    in_object=None, 
    out_object=None, 
    out_header=None, 
    out_error=None, 
    out_body_doc=None, 
    out_header_doc=None, 
    out_document=None, 
    out_string=None, 
    out_stream=None, 
    function=None, 
    locale=None, 
    _in_protocol=<spyne.protocol.soap.soap11.Soap11 object at 0x10a5bafd0>, 
    _out_protocol=<spyne.protocol.soap.soap11.Soap11 object at 0x10ae311f0>, 
    pusher_stack=[], 
    frozen=True, 
))

If I dig into the transport object, there is a req object, which has the byte stream of the payload in wsgi.input, but that's a BufferedReader. I've had trouble "materializing" the buffered reader without causing hangs, but this is where I'd point back to my python limitations. Code like this hangs

def _on_wsgi_call(ctx):
    print("Start _on_wsgi_call")

    istream = ctx.transport.req.get('wsgi.input')

    length = str(ctx.transport.req.get('CONTENT_LENGTH', 0))
    if len(length) == 0:
        logging.warning("Empty Content")
        length = 0
    else:
        length = int(length)
        logging.warning("Content Length %s", length)

    print(istream.read(length))
    print("End _on_wsgi_call")

The server doesn't hang, but the client connection stays open and nothing is returned to the client. You may recognize some of that from __wsgi_input_to_iterable in ./spyne/server/wsgi.py. Maybe I have to jam istream.read(length) into the ctx.in_string and call another method to get it to continue.

arskom / spyne

Strategies for dealing with broken clients? (invalid documents) #664