Closed carstenandrich closed 6 months ago
Hi @carstenandrich,
So the purpose of the UBXReader.read() method is to read entire UBX, NMEA or RTCM3 messages from any binary data stream supporting a read(n) method. If it is unable to read such a message, it will return a value of (None, None)
. The circumstances in which it does this very much depend on the behaviour of the stream class itself and whether it provides native buffering or timeout functionality.
Your SlowReader
issue can be simply resolved with the addition of an io.BufferedReader
wrapper around the datastream:
print("# parsing silently fails, because SlowReader returns less bytes than requested")
for raw_data, parsed_data in pyubx2.UBXReader(SlowReader(ubx)):
print(parsed_data)
print()
print("# ... but parsing works if the slow data stream is buffered...")
for raw_data, parsed_data in pyubx2.UBXReader(io.BufferedReader(SlowReader(ubx))):
print(parsed_data)
# parsing silently fails, because SlowReader returns less bytes than requested
read(1) -> b'\xb5'
read(1) -> b'b'
read(4) -> b'\x06\x8b4\x00'
read(54) -> b'\x01\x07\x00\x00\x05\x00R\x10\x01\x06\x00R\x10\x00\x07\x00'
# ... but parsing works if the slow data stream is buffered...
<UBX(CFG-VALGET, version=1, layer=7, position=0, CFG_UART1_ENABLED=1, CFG_UART1_REMAP=0, CFG_0x10520007=b'\x00', CFG_UART1_STOPBITS=1, CFG_UART1_DATABITS=0, CFG_UART1_PARITY=0, CFG_0x20520008=b'\x00', CFG_0x20520009=b'\x01', CFG_UART1_BAUDRATE=38400)>
You would, of course, see the same behaviour using a standard Serial stream with a GNSS device outputting messages at the default rate of around 1Hz, e.g.
from serial import Serial
from pyubx2 import UBXReader
with Serial("/dev/ttyACM1", 38400, timeout=3 as stream:
ubr = UBXReader(stream)
i = 0
for raw_data, parsed_data in ubr:
print(i, parsed_data)
i += 1
Setting timeout=3
will produce a continuous data stream; setting timeout=0.1
will terminate silently after the initial input buffer is cleared. The resolution is this case is simply to increase the timeout value.
More than happy to update the documentation to clarify this, but not convinced an automatic Exception
is called for in these circumstances.
By way of illustration of an alternative approach (where I'm dealing with multiple data stream classes), have a look at the stream_handler._readloop()
function in PyGPSClient:
https://github.com/semuconsulting/PyGPSClient/blob/master/src/pygpsclient/stream_handler.py#L219
Have a look at https://github.com/semuconsulting/pyubx2/commit/99e9013004c3ba127bfc1c55b78c10ced5f5c772 and let me know if it addresses your concerns. You can use the quitonerror
flag to either ignore the error (0), log the error and continue (1) or terminate with an exception (2).
Traceback (most recent call last):
File "/Users/steve/Library/Python/3.12/lib/python/site-packages/pyubx2/ubxreader.py", line 157, in read
(raw_data, parsed_data) = self._parse_ubx(bytehdr)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/steve/Library/Python/3.12/lib/python/site-packages/pyubx2/ubxreader.py", line 228, in _parse_ubx
byten = self._read_bytes(leni + 2)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/steve/Library/Python/3.12/lib/python/site-packages/pyubx2/ubxreader.py", line 312, in _read_bytes
raise UBXStreamError(
pyubx2.exceptions.UBXStreamError: Serial stream terminated unexpectedly. 54 bytes requested, 16 bytes returned.
Thanks! Just tested with the RC-1.2.41 branch and tt does address my concerns. When you're new to the pyubx2 API this really is a footgun when not using pyserial or parsing from a file. I personally strongly disagree with pyserial's choice of read()
only returning when the buffer is full or the timeout expires, which enforces inefficient patterns like reading a single byte per syscall to parse variable length data like NMEA sentences.
The following is a minimal working example for interfacing a ZED-F9P on Linux:
import io
import os
import pyubx2
import termios
def open_tty(path: str) -> io.FileIO:
fd = os.open(path, os.O_RDWR | os.O_CLOEXEC | os.O_NOCTTY)
attr = termios.tcgetattr(fd)
#[iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
attr[0] = 0
attr[1] = 0
attr[2] = termios.CS8 | termios.CLOCAL | termios.CREAD
attr[3] = 0
attr[4] = termios.B115200
attr[5] = termios.B115200
# configure blocking read
attr[6] = [0] * len(attr[6])
attr[6][termios.VMIN] = 1
attr[6][termios.VTIME] = 0
termios.tcsetattr(fd, termios.TCSAFLUSH, attr)
return open(fd, "r+b", buffering=0)
with open_tty("/dev/ttyACM0") as rtk:
for offset in range(0, 256, 64):
rtk.write(pyubx2.UBXMessage.config_poll(7, offset, [0x0fff0000]).serialize())
while buf := rtk.read(4096):
# do additional stuff with buf
pass
# parse UBX messages
for raw, msg in pyubx2.UBXReader(io.BytesIO(buf), protfilter=7):
print(msg)
Without your fix it yielded me this, causing significant head scratching until I figured out the root cause:
<UBX(ACK-ACK, clsID=CFG, msgID=CFG-VALGET)>
<UNKNOWN PROTOCOL(header=b'$t')>
<UBX(ACK-ACK, clsID=CFG, msgID=CFG-VALGET)>
<UNKNOWN PROTOCOL(header=b'$\x00')>
<UBX(ACK-ACK, clsID=CFG, msgID=CFG-VALGET)>
<UBX(ACK-ACK, clsID=CFG, msgID=CFG-VALGET)>
With the optionally raised Exception I can now simply read from the serial port again to append to the buffer and subsequently try UBXReader() again.
The main reason I cannot simply use
for raw, msg in pyubx2.UBXReader(io.BufferedReader(rtk), protfilter=7):
print(msg)
is that I have to embed UBXReader
in an asyncio coroutine.
Hi @carstenandrich,
Ok glad it was a result. You'll appreciate that pyubx2 and its companion libraries are solely concerned with platform-agnostic protocol parsing, so I deliberately leave the intricacies of serial and socket I/O to external libraries, of which pySerial (for all its foibles) is overwhelmingly the most popular. So it's probably fair to say my streaming test cases are somewhat blinkered in this direction.
Quick observation on your code snippet above, which I appreciate may only be illustrative. Sending CFG POLL requests to a UBX receiver and then reading the input stream in a linear sequence is not going to be bomb-proof in all circumstances, though you may 'get away with it' at 115200 baud and a nominal navigation interval. A more robust model would be to set up a concurrent read process (using whichever concurrent/parallel model you want to work with) first and then send the polls. Likewise dealing with CFG SET commands and their acknowledgements.
You'll appreciate that pyubx2 and its companion libraries are solely concerned with platform-agnostic protocol parsing, so I deliberately leave the intricacies of serial and socket I/O to external libraries [...]
Absolutely agree with that. Just wanted to get this fixed upstream for the sake of potential other non-pyserial-users, so they won't stumble over the same pitfall.
Quick observation on your code snippet above, which I appreciate may only be illustrative.
Thanks. Of course this is just an MWE to illustrate the issue at hand.
Reopening, because I just stumbled over a regression. UBXReader(stream, quitonerror=0|1)
now returns a str
on error:
with open_tty("/dev/ttyACM0") as rtk:
for offset in range(0, 256, 64):
rtk.write(pyubx2.UBXMessage.config_poll(7, offset, [0x0fff0000]).serialize())
for raw, msg in pyubx2.UBXReader(rtk, quitonerror=0, protfilter=7):
print(type(msg), msg)
print(msg.identity)
<class 'pyubx2.ubxmessage.UBXMessage'> <UBX(ACK-ACK, clsID=CFG, msgID=CFG-VALGET)>
ACK-ACK
<class 'str'> Serial stream terminated unexpectedly. 326 bytes requested, 122 bytes returned.
Traceback (most recent call last):
File ".../pyubx2_issue.py", line 33, in <module>
print(msg.identity)
^^^^^^^^^^^^
AttributeError: 'str' object has no attribute 'identity'
As per README.md, it should only ever return a pyubx2.*Message
:
Individual input UBX, NMEA or RTCM3 messages can then be read using the UBXReader.read() function, which returns both the raw binary data (as bytes) and the parsed data (as a UBXMessage, NMEAMessage or RTCMMessage object, via the parse() method).
Reopening, because I just stumbled over a regression.
UBXReader(stream, quitonerror=0|1)
now returns astr
on error:
This is actually expected (though not documented!) behaviour.
Easily fixed, but I'll need to do a quick impact assessment on this as I suspect there may be some lazy programming in PyGPSClient which utilises the str
in the parsed field so it can display 'ignored' errors on the console. In the short term, check raw is not None
before any further processing as a quick workaround.
Alright. I'll add a bunch of type checks. Just wanted to let you know that #150 possibly breaks API compatibility, again for the sake of other users out there. I'd suggest to update the README and docstring, because neither mention that as of right now.
Alright. I'll add a bunch of type checks. Just wanted to let you know that #150 possibly breaks API compatibility, again for the sake of other users out there. I'd suggest to update the README and docstring, because neither mention that as of right now.
No, it needs fixing as it was a lazy solution (and this was what the ErrorHandler callback was for). There are no other downstream impacts. Assume behaviour as per docs once RC 1.2.41 is merged.
EDIT - should be sorted in https://github.com/semuconsulting/pyubx2/pull/150/commits/8fa0bcfd27c7e3f4edb7d6805520bc9482ae6fac
Works for me, but as commented in 8fa0bcf, I'd strongly recommend to use Python's built-in logging module instead of printing to stdout.
Describe the bug
Using
pyubx2.UBXReader(stream)
with astream.read(n)
that returns less thann
bytes will silently terminate parsing, even if subsequentstream.read()
calls will eventually cough up the bytes.To Reproduce
Run this minimal working example:
Reading from the
FastReader
works as expected:But reading from the
SlowReader
fails:Expected Behaviour
As per the README.md
there are no requirements for the
stream.read()
method. ThereforeUBXReader
should not expectread(n)
to return exactlyn
bytes. Returning less bytes than requested does not indicate EOF. Onlyread(n) -> 0
withn>0
indicates EOF.If this behavior is expected and required by
UBXReader
, then the documentation should be amended accordingly and an exception should be raised if the stream class misbehaves anyway.Additional context
Took me a while to get to the root of this (missing some messages in my parsing results), because I was looking at
UBXReader.__next__()
'sraw_data
return value instead of the actual raw data I was feeding into theUBXReader
. Raising an exception would really have helped me to figure it out immediately.