christgau / wsdd

A Web Service Discovery host daemon.
MIT License
814 stars 98 forks source link

xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 547 #81

Closed coder543 closed 2 years ago

coder543 commented 3 years ago

Full backtrace reproduced on latest master revision:

2021-01-25 07:39:30,045:wsdd ERROR(pid 16911): error in main loop
Traceback (most recent call last):
  File "/usr/local/bin/wsdd.py", line 1678, in main
    key.data.handle_request(key)
  File "/usr/local/bin/wsdd.py", line 184, in handle_request
    handler.handle_request(msg, address)
  File "/usr/local/bin/wsdd.py", line 699, in handle_request
    reply = self.handle_message(msg, self.mch, address)
  File "/usr/local/bin/wsdd.py", line 344, in handle_message
    tree = ElementTree.fromstring(msg)
  File "/usr/local/lib/python3.8/xml/etree/ElementTree.py", line 1320, in XML
    parser.feed(text)
  File "<string>", line None
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 547

I'm running a TrueNAS machine with SMB enabled, and I noticed that /var/log/messages was full of the error mentioned in the title. I poked around a little and discovered what seems to be the underlying issue...

b'<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/addressing" xmlns:wsd="http://schemas.xmlsoap.org/ws/2005/04/discovery"><soap:Header><wsa:To>urn:schemas-xmlsoap-org:ws:2005:04:discovery</wsa:To><wsa:Action>http://schemas.xmlsoap.org/ws/2005/04/discovery/Probe</wsa:Action><wsa:MessageID>urn:uuid:92A72C16-DD1C-4E9F-BA31-66B0F948103F</wsa:MessageID></soap:Header><soap:Body><wsd:Probe><wsd:Types>wsdp:Device wscn:ScanDeviceType</wsd:Types></wsd:Probe></soap:Body></soap:Envelope>\x00'

There is a trailing null character at the end of the XML, which doesn't seem to be handled correctly.

On the original version of wsdd.py that came with TrueNAS (some older version), I was able to fix the problem by simply removing the trailing null character from the msg (returned by recvfrom(WSD_MAX_LEN)) before it was handed into the wsd_handle_message function. Once I confirmed the issue still exists with the latest master, I just put my modified version of the original script file back in place and didn't attempt to reproduce my fix in the new code, but I assume the same fix would work here.

christgau commented 3 years ago

Thanks for sharing your observation

There is a trailing null character at the end of the XML, which doesn't seem to be handled correctly.

I would put it the other way: A trailing null character is not supposed to be part of XML. Thus, it is absolutely valid to raise an exception when a buffer containing such a character is received and an attempt is made to interpret it as XML. In that sense the behavior is correct. On the other hand, someone must be sending such an invalid message. Do you have other WSD (client) implementations in the network, such as wsdd in discovery mode (😮). I haven't seen Windows doing such invalid things. Maybe you could give wireshark/tcpdump a chance to find out the misbehaving client.

What might be discussed is to try to lower the verbosity of the log when such a garbage message is received.

coder543 commented 3 years ago

I honestly don't know where that invalid message is coming from at the moment. I haven't had time to try to dig into anything -- this is just a normal home network, with a Windows 10 desktop, a Raspberry Pi, and a few Macs (excluding the aforementioned TrueNAS box and various phones / tablets). There isn't any intentionally experimental software running anywhere, and I wasn't even aware of WSD as a protocol until today -- so as far as I know, WSDD isn't running anywhere else.

I agree in concept that it seems like a problem with the message sender, but it also seems like this code being written to robustly ignore trailing nulls after the message contents wouldn't necessarily be a bad thing. Null characters are always invalid in XML.

Pure speculation, but I wonder if a C-like string buffer (with a trailing null to indicate the end of the string) is being converted into a Python bytestring in a weird way that has an off-by-one error sometimes, which would lead to the inclusion of that trailing null into the Python bytestring? I haven't spent any time studying this code beyond the initial diagnosis that led to the opening of this ticket, so I really don't know.

christgau commented 3 years ago

WSDD isn't running anywhere else.

Phew. Then it's at least not wsdd that is sending this invalid message. 😉

I agree in concept that it seems like a problem with the message sender, but it also seems like this code being written to robustly ignore trailing nulls after the message contents wouldn't necessarily be a bad thing.

True, but where does robustness start and where does it end? I mean we could have a trailing 'x' instead of null. Should wsdd be responsible of removing that one as well? In the end it is working around errors of some other wrong code. That one should be fixed first. At least the used XML parser is smart enough to detect the erroneous data. From my point of view that is robust enough already.

Null characters are always invalid in XML.

True.

Pure speculation, but I wonder if a C-like string buffer (with a trailing null to indicate the end of the string) is being converted into a Python bytestring in a weird way that has an off-by-one error sometimes, which would lead to the inclusion of that trailing null into the Python bytestring?

Concerning wsdd, upon UDP packet reception the whole packet's content is received as bytes: https://github.com/christgau/wsdd/blob/231a7b95e2f004bd42e3f1f14d508b91e93a4346/src/wsdd.py#L181-L185 and then passed through to ElementTree's fromstring https://github.com/christgau/wsdd/blob/231a7b95e2f004bd42e3f1f14d508b91e93a4346/src/wsdd.py#L340-L344 Thus there is no explicit interpretation of the received data by wsdd. Consequently, the null character appears to be part of the original message.

I suspect it is the other way around: Somewhere an XML message is constructed and stored in a C string. The network message then erroneously contains the terminating null character due to an off-by-one error you mentioned. That's my hypothesis on this observation and therefore I suggested to have a look into a network traffic dump. This may verify whether the trailing null is already within an UDP packet. From the log you posted I do not think the data comes from a TCP connection. In addition, the traffic dump could also identify the ("malicious") sending host and, ultimately, help to spot the sending software with the suspected off-by-one error.

christgau commented 2 years ago

Since cffedb2b, a possible exception raised during string parsing is handled (eaten) by wsdd. As a consequence, no exception trace is printed anymore. That change does not solve this particular issue, but wsdd can hardly do anything against invalid inputs, even if they look very close too valid input. I am therefore closing this issue.