Open g7gpr opened 1 month ago
I think the log is doing its job and capturing some sort of error. Any chance you could share the whole log? Is it associated with a video stream disconnect or something?
2024/09/25 09:10:15-INFO-EventMonitor-line:2144 - Next EventMonitor run : 01:40:15 UTC; 30.0 minutes from now
2024/09/25 09:10:15-INFO-EventMonitor-line:2148 - Next Capture start : 10:38:49 UTC
2024/09/25 09:40:16-INFO-EventMonitor-line:2144 - Next EventMonitor run : 02:10:16 UTC; 30.0 minutes from now
2024/09/25 09:40:16-INFO-EventMonitor-line:2148 - Next Capture start : 10:38:49 UTC
2024/09/25 09:52:11-INFO-Logger-line:44 - Socket exception: Connection reset by peer (104)
2024/09/25 10:10:18-INFO-EventMonitor-line:2144 - Next EventMonitor run : 02:40:18 UTC; 30.0 minutes from now
2024/09/25 10:10:18-INFO-EventMonitor-line:2148 - Next Capture start : 10:38:49 UTC
2024/09/25 10:40:20-INFO-EventMonitor-line:2144 - Next EventMonitor run : 03:10:20 UTC; 30.0 minutes from now
2024/09/25 10:40:20-INFO-EventMonitor-line:2148 - Next Capture start : 10:38:49 UTC
Not capturing at the moment, unfortunately I can't download the whole log at the moment
oh, interesting. It's no even capturing...
It could be this bit: https://github.com/CroatianMeteorNetwork/RMS/blob/4ab05b037ee5ada27cd7df1797e812b7d47b3a53/RMS/Formats/ObservationSummary.py#L194-L217
Maybe, wrap it in a try-except block and add a timeout?
def timestampFromNTP(addr='0.us.pool.ntp.org'):
REF_TIME_1970 = 2208988800 # Reference time
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Set the socket timeout (e.g., 5 seconds)
client.settimeout(5)
try:
data = b'\x1b' + 47 * b'\0'
client.sendto(data, (addr, 123))
data, address = client.recvfrom(1024)
if data:
t = struct.unpack('!12I', data)[10]
t -= REF_TIME_1970
return t
except socket.error as e:
print("Socket error: {}".format(e))
return None
finally:
client.close()
Unlikely, that code has no business running at this stage.
Yes, you are right. I see that now. Sorry.
Maybe something around her?: https://github.com/CroatianMeteorNetwork/RMS/blob/4ab05b037ee5ada27cd7df1797e812b7d47b3a53/RMS/EventMonitor.py#L1292
That code executes at the start of an EventMonitor run, which is not yet due.
So the error is occurring while the program is in this loop: https://github.com/CroatianMeteorNetwork/RMS/blob/4ab05b037ee5ada27cd7df1797e812b7d47b3a53/RMS/EventMonitor.py#L2127
Which calls this function: https://github.com/CroatianMeteorNetwork/RMS/blob/4ab05b037ee5ada27cd7df1797e812b7d47b3a53/RMS/EventMonitor.py#L2084
Which calls: https://github.com/CroatianMeteorNetwork/RMS/blob/4ab05b037ee5ada27cd7df1797e812b7d47b3a53/RMS/EventMonitor.py#L1292
Maybe warp it in a with
and catch socket error?:
try:
if sys.version_info[0] < 3:
with urllib2.urlopen(self.syscon.event_monitor_webpage) as response:
web_page = response.read().splitlines()
else:
with urllib.request.urlopen(self.syscon.event_monitor_webpage) as response:
web_page = response.read().decode("utf-8").splitlines()
except (socket.error, urllib2.URLError, urllib.error.URLError) as e:
log.error("Socket or URL error in getEventsfromWebPage: {}".format(e))
return events
except Exception as e:
log.error("Unexpected error in getEventsfromWebPage: {}".format(e))
return events
Error was raised at 09:52:11, that code ran successfully at 09:40:15 and was not scheduled to run again until 10:10:18. Why do you think it is that code that had the exception?
A likely cause of the error is that a socket is being opened but not properly closed by the client. If the client fails to close the socket after completing its operation, the server may eventually close the connection after a period of inactivity, which is what the “Connection reset by peer” message suggests. This typically happens when the server forcefully terminates an idle or incomplete connection.
It’s also important to note that there can be a delay between when the socket is opened and when the server closes it. The message may not appear immediately after the socket is opened but rather after the server detects inactivity or hits a timeout, which can occur several minutes later.
It'd be worth trying using a with statement when opening the socket ensures that the connection is properly closed on the client side once the operation is complete.
Also, I think the code as it is keeps opening a new socket every 30 minutes without ever closing them.
https://github.com/CroatianMeteorNetwork/RMS/blob/4ab05b037ee5ada27cd7df1797e812b7d47b3a53/RMS/Logger.py#L44
On prerelease Pi4, seeing sporadic messages referring to this line.
2024/09/25 09:52:11-INFO-Logger-line:44 - Socket exception: Connection reset by peer (104)
I don't understand what is happening here, apologies for this not being a very helpful submission.