Closed packplusplus closed 3 years ago
Dealing with broken clients is just normal business for a soap server.
I always thought that stuff that needs to happen before create_in_document()
needs to happen in the transport layer, because at that stage of request processing, you're effectively dealing with just a byte stream, and juggling bytes around is the job of the transports. So anything you do is going to be transport-specific.
You could add a one-liner to fire an event before create_in_document but you'd have to be doing transport-specific work in it anyway. So I'm not sure that's the best thing to do. Would you agree?
You can have a look at examples/events.py
to see how the 'wsgi_call'
hook could be used to modify docs before xml parsing takes place.
I may be approaching my limitations and I don't want to turn this into a help forum, but I do see value in sharing what I learned.
I'm having trouble working out how I'd manipulate the context object. wsgi_call gets you an object that looks like
WsgiMethodContext(
call_start=1629217054.327723,
call_end=None,
is_closed=False,
app=<spyne.application.Application object at 0x10ae31d30>,
udc=None,
transport=<spyne.server.wsgi.WsgiTransportContext object at 0x10aed62b0>,
outprot_ctx=<spyne.context.ProtocolContext object at 0x10aec5f40>,
inprot_ctx=<spyne.context.ProtocolContext object at 0x10aec5fa0>,
protocol=<spyne.context.ProtocolContext object at 0x10aec5fa0>,
event=<spyne.context.EventContext object at 0x10aed6250>,
aux=None,
method_request_string=None,
files=[],
active=False,
_MethodContext__descriptor=None,
in_string=None,
in_document=None,
in_header_doc=None,
in_body_doc=None,
in_error=None,
in_header=None,
in_object=None,
out_object=None,
out_header=None,
out_error=None,
out_body_doc=None,
out_header_doc=None,
out_document=None,
out_string=None,
out_stream=None,
function=None,
locale=None,
_in_protocol=<spyne.protocol.soap.soap11.Soap11 object at 0x10a5bafd0>,
_out_protocol=<spyne.protocol.soap.soap11.Soap11 object at 0x10ae311f0>,
pusher_stack=[],
frozen=True,
))
If I dig into the transport object, there is a req object, which has the byte stream of the payload in wsgi.input
, but that's a BufferedReader. I've had trouble "materializing" the buffered reader without causing hangs, but this is where I'd point back to my python limitations. Code like this hangs
def _on_wsgi_call(ctx):
print("Start _on_wsgi_call")
istream = ctx.transport.req.get('wsgi.input')
length = str(ctx.transport.req.get('CONTENT_LENGTH', 0))
if len(length) == 0:
logging.warning("Empty Content")
length = 0
else:
length = int(length)
logging.warning("Content Length %s", length)
print(istream.read(length))
print("End _on_wsgi_call")
The server doesn't hang, but the client connection stays open and nothing is returned to the client. You may recognize some of that from __wsgi_input_to_iterable
in ./spyne/server/wsgi.py
. Maybe I have to jam istream.read(length)
into the ctx.in_string and call another method to get it to continue.
I have a client I can't change, which submits a malformed SOAP doc. The client assumes that there's already a namespace called
soap
with the urihttp://schemas.xmlsoap.org/soap/
. Who knows why the original server accepted this. I'm not in a position to judge or fix, only make something bug for bug compatible.Let's demonstrate the behavior in a simplified way.
I had originally gone down the path of trying to use the listeners, like
before_deserialize
, but all of those seem to occur after lxml.XMLID tries to parse the object and throws a fault (https://github.com/arskom/spyne/blob/354723434db322e87a37b98a8a62b731954651e7/spyne/protocol/soap/soap11.py#L112).I ended up monkey patching Soap11's
_parse_xml_string
to do some nasty string manipulation with some regex's to find out if the namespace was specified in the envelope, and if not, modify the incoming string to include it.It makes sense that an invalid XML doc would cause a failure, but I'm wondering if there are other strategies I should be considering instead of this approach. Or perhaps it was the intent of those listeners in the Soap11 object to help cope with bad document and the functionality was lost over time. If it's the former, I'm all ears, if it's the later, I'm not totally sure how I would fix it.
I could see a path where
create_in_document
fires an event that allows you to manipulate the xml string beforectx.in_document
gets a parsed XML object, and I would take a swing a PR for that functionality if anyone sees merit in it.Thoughts?