Closed asherikov closed 1 year ago
@asherikov
Could you provide a copy of the complete script or command you executed that resulted in this issue please? Thanks
Sorry, cannot give you the exact parameters, but it is a pretty standard NTRIP setup, which works for other servers. Reproducible both with pygpsclient, and a simple cli utility based on gnssntripclient. It looks like the server closes connection and the client doesnt handle this situation: note that recv() does not throw on closed connections, but simply returns an empty message. So it should be fixed in pyubx2.
Sorry, cannot give you the exact parameters, but it is a pretty standard NTRIP setup, which works for other servers. Reproducible both with pygpsclient, and a simple cli utility based on gnssntripclient. It looks like the server closes connection and the client doesnt handle this situation: note that recv() does not throw on closed connections, but simply returns an empty message. So it should be fixed in pyubx2.
OK at the moment I'm not able to reproduce with any of the 5 or 6 NTRIP casters I routinely test with, and surprised this hasn't come up before. Happy to beef up exception handling in the SocketStream class (which is in pyubx2, pynmeagps and pyrtcm) - not a biggie - but you may need to assist with the retesting if you're not comfortable sharing your configuration (you can send it to be directly by email if you don't want it in the public domain).
Does this look like a workable fix?:
def _recv(self) -> bool:
"""
Read bytes from socket into internal buffer.
:return: return code (0 = failure, 1 = success)
:rtype: bool
"""
try:
data = self._socket.recv(self._bufsize)
if len(data) == 0:
return False
self._buffer += data
except (OSError, TimeoutError):
return False
return True
Hi @asherikov
Could you try installing and testing the fix-socketstream-thrash
branch and let me know if it resolves the problem - thanks.
pip3 install git+https://github.com/semuconsulting/pyubx2.git@fix-socketstream-thrash
Hi, thanks, this breaks loop in pyubx2, but now it starts looping in gnssntripclient, since it simply retries on failure:
[...]te-packages/pyubx2/socket_stream.py:50 line if len(data) == 0:
[...]te-packages/pyubx2/socket_stream.py:51 line return False
[...]te-packages/pyubx2/socket_stream.py:51 return <= _recv: False
[...]te-packages/pyubx2/socket_stream.py:81 line return b""
[...]te-packages/pyubx2/socket_stream.py:81 return <= read: b''
[...]0/site-packages/pyubx2/ubxreader.py:307 line if len(data) < size: # EOF
[...]0/site-packages/pyubx2/ubxreader.py:308 line raise EOFError()
[...]0/site-packages/pyubx2/ubxreader.py:308 exception ! _read_bytes: (<class 'EOFError'>, EOFError(), <traceback object at 0x7f1dcc9e1880>)
[...]0/site-packages/pyubx2/ubxreader.py:308 return <= _read_bytes: None
[...]0/site-packages/pyubx2/ubxreader.py:145 exception ! read: (<class 'EOFError'>, EOFError(), <traceback object at 0x7f1dcb2bd7c0>)
[...]0/site-packages/pyubx2/ubxreader.py:187 line except EOFError:
[...]0/site-packages/pyubx2/ubxreader.py:188 line return (None, None)
[...]0/site-packages/pyubx2/ubxreader.py:188 return <= read: (None, None)
[...]ages/pygnssutils/gnssntripclient.py:557 line if raw_data is not None:
[...]ages/pygnssutils/gnssntripclient.py:559 line self._send_GGA(ggainterval, output)
[...]ages/pygnssutils/gnssntripclient.py:358 call => _send_GGA(self=<btk_ntrip_client.btk_ntrip_client.SerialNTRIP object at 0x7f1dcb283d00>, ggainterval=2, output=None)
[...]ages/pygnssutils/gnssntripclient.py:367 line if ggainterval != NOGGA:
[...]ages/pygnssutils/gnssntripclient.py:368 line if datetime.now() > self._last_gga + timedelta(seconds=ggainterval):
[...]ages/pygnssutils/gnssntripclient.py:368 return <= _send_GGA: None
[...]ages/pygnssutils/gnssntripclient.py:554 line while not stopevent.is_set():
[...]ages/pygnssutils/gnssntripclient.py:555 line try:
[...]ages/pygnssutils/gnssntripclient.py:556 line raw_data, parsed_data = ubr.read()
[...]0/site-packages/pyubx2/ubxreader.py:126 call => read(self=<pyubx2.ubxreader.UBXReader object at 0x7f1dcb283970>)
[...]0/site-packages/pyubx2/ubxreader.py:139 line flag = True
[...]0/site-packages/pyubx2/ubxreader.py:141 line try:
[...]0/site-packages/pyubx2/ubxreader.py:142 line while flag: # loop until end of valid message or EOF
[...]0/site-packages/pyubx2/ubxreader.py:143 line raw_data = None
[...]0/site-packages/pyubx2/ubxreader.py:144 line parsed_data = None
[...]0/site-packages/pyubx2/ubxreader.py:145 line byte1 = self._read_bytes(1) # read the first byte
[...]0/site-packages/pyubx2/ubxreader.py:296 call => _read_bytes(self=<pyubx2.ubxreader.UBXReader object at 0x7f1dcb283970>, size=1)
[...]0/site-packages/pyubx2/ubxreader.py:306 line data = self._stream.read(size)
[...]te-packages/pyubx2/socket_stream.py:68 call => read(self=<pyubx2.socket_stream.SocketStream object at 0x7f1dcb283ee0>, num=1)
[...]te-packages/pyubx2/socket_stream.py:79 line while len(self._buffer) < num:
[...]te-packages/pyubx2/socket_stream.py:80 line if not self._recv():
[...]te-packages/pyubx2/socket_stream.py:40 call => _recv(self=<pyubx2.socket_stream.SocketStream object at 0x7f1dcb283ee0>)
[...]te-packages/pyubx2/socket_stream.py:48 line try:
[...]te-packages/pyubx2/socket_stream.py:49 line data = self._socket.recv(self._bufsize)
[...]te-packages/pyubx2/socket_stream.py:50 line if len(data) == 0:
[...]te-packages/pyubx2/socket_stream.py:51 line return False
Can I just clarify what, from your perspective, the expected behaviour is here?
gnssntripclient is designed to loop indefinitely until the user explicitly stops it (via Ctrl-C), even if the NTRIP caster is sending no data. Are you looking for some kind of arbitrary 'inactivity timeout' here which halts the program if no data is received after a specified period?
Not quite:
Not quite:
* I can implement inactivity timeout by myself when implementing a client based on gnssntripclient library. * My concern is that in the case of a closed connection: * CPU usage is going nuts until such timeout expires * detection of closed connections would enable me to reconnect without waiting for timeout to expire.
Problem is AFAIK there's no foolproof, cross-platform way of distinguishing between a closed socket connection and an inactive server - in both cases, as you say, socket.recv() simply returns zero data but no exception. One could, for example, attempt to write some arbitrary packets using socket.sendall() at set intervals and see if that fails, but that may cause issues at the server end. And an automated reconnect would obviously need a retry interval and max retry count to avoid getting into a different thrashing situation.
In your specific case, do you know what's causing the NTRIP server to drop the connection?
Ok, thanks, unfortunately I dont have feedback from the other side. I guess the most reasonable thing for me would be to check socket status by myself.
Ok, thanks, unfortunately I dont have feedback from the other side. I guess the most reasonable thing for me would be to check socket status by myself.
ok up entirely to you. happy to pursue further if you can't find a satisfactory solution at your end.
Describe the bug
May get stuck in SocketStream.read() and consume 100% CPU when used from gnssntripclient:
Version: 1.2.29, Ubuntu 22.04
To Reproduce
I suspect this is a server issue, but some sort of failure detection would be nice.