xoseperez / xbee2mqtt

GNU General Public License v3.0
23 stars 16 forks source link

Xbee "port" packets with null bytes cause trouble #2

Open nbertram opened 8 years ago

nbertram commented 8 years ago

Hi,

Love this project! Saved me a bit of time and does everything I need.

I've just finished tracking down a bug that caused it to only detect a custom "port" parameter from a remote Arduino on the first packet, due to sending a null-terminated char string through the XBee using API mode to build a TX packet. Because this null comes after the newline, it stays in the buffer until another newline is received, becoming prepended to the next string to come in (causing a null-byte-prefixed port name to be parsed), so suddenly the port names stop matching the routes dict in a hard to diagnose way.

While I've fixed my Arduino to omit the null byte terminating the string, a fix that could be incorporated at the receiver end is:

diff --git a/libs/xbee_wrapper.py b/libs/xbee_wrapper.py
index 172a194..50c2adf 100755
--- a/libs/xbee_wrapper.py
+++ b/libs/xbee_wrapper.py
@@ -85,7 +85,7 @@ class XBeeWrapper(object):

             # Some streams arrive split in different packets
             # we buffer the data until we get an EOL
-            self.buffer[address] = self.buffer.get(address,'') + packet['data']
+            self.buffer[address] = self.buffer.get(address,'') + packet['data'].rstrip('\0')
             count = self.buffer[address].count('\n')
             if (count):
                 lines = self.buffer[address].splitlines()

Up to you whether you think this is a reasonable thing to do, but it certainly shouldn't break any ASCII payloads that the program supports, even on a multi-packet payload.

Alternatively, the parser could throw out any extraneous stuff left in the buffer after extracting the lines, though I can see how that might throw away the next payload if it's still receiving, so not ideal.

Thanks again

xoseperez commented 8 years ago

Hey! Thanks for debugging that! But, wouldn't it be better to lstrip null-bytes before line processing?

diff --git a/libs/xbee_wrapper.py b/libs/xbee_wrapper.py
index 172a194..bf95387 100755
--- a/libs/xbee_wrapper.py
+++ b/libs/xbee_wrapper.py
@@ -94,7 +94,7 @@ class XBeeWrapper(object):
                 except:
                     self.buffer[address] = ''
                 for line in lines[:count]:
-                    line = line.rstrip()
+                    line = line.rstrip().lstrip('\0')
                     try:
                         port, value = line.split(':', 1)
                     except:
@@ -162,4 +162,3 @@ class XBeeWrapper(object):
             pass

         return devices
-
nbertram commented 8 years ago

I think the right approach is to not allow the null-byte into the buffer in the first place actually, since it's really part of the packet decoding more than the buffer parser. The null would only ever occur as the last byte of each packet (well, unless someone used the xbee in transparent mode and happened to send a long null-terminated string, but even then it would probably be the last byte of a payload).

By moving it into the parsing part, it's acknowledging that parts of the previous payload have formed part of the next payload, which is a bit dirty. There's no real way to know when the payload has finished, unless you linked up the last newline with the end of a packet, then cleared the buffer at that point. That again would only work for transmitting xbees in API mode, where the next payload would be guaranteed to be in a different packet...

Of course you could just go nuclear and remove null bytes anywhere in the incoming data, since they're a C-ism that means nothing to python anyway. The payload is never binary, so you're not going to break anything by doing that. In fact, any non-ASCII character should probably be banned from the payload (excluding the newline) if you were going to take that approach, as other characters like a \r could also potentially get into the buffer. You'd have to make a call on whether to support utf-8 data explicitly though...