TheThingsArchive / packet_forwarder

Packet forwarder for Linux based gateways
MIT License

Packet forwarder stops sending packets after >12 hours of run #28

Closed egourlao closed 6 years ago

egourlao commented 7 years ago

@h0l0gram noticed that after more than 12 hours (~740 minutes) of running, the packet forwarder build for the Raspberry Pi stopped sending messages to the backend, although the process was still running. Restarting the packet forwarder fixed the issue. This needs investigating, but it could be linked to the fact that tokens for the account server (and the router?) might have to be refreshed.

h0l0gram commented 7 years ago

It happened again. I started gateway g1-test at 10:54 today, and everything seemed to work fine. Then, after 13:50, the Linux terminal showed only status messages:

  INFO Sending status to the network server     CpuPercentage=0.39516914 FrequencyPlan=EU_863_870 Load1=0.27 Load15=0.17 Load5=0.2 MemoryPercentage=13.045552 RXPacketsReceived=365 RXPacketsValid=365 TXPacketsReceived=0 TXPacketsValid=0 Uptime=27m8.613931s
  INFO Sending status to the network server     CpuPercentage=0.39517403 FrequencyPlan=EU_863_870 Load1=0.21 Load15=0.16 Load5=0.19 MemoryPercentage=13.064544 RXPacketsReceived=365 RXPacketsValid=365 TXPacketsReceived=0 TXPacketsValid=0 Uptime=27m23.677096s
  INFO Received uplink packets                  NbPackets=1
  WARN Packets received, but with invalid CRC - ignoring
  INFO Sending status to the network server     CpuPercentage=0.39518353 FrequencyPlan=EU_863_870 Load1=0.16 Load15=0.16 Load5=0.18 MemoryPercentage=13.057792 RXPacketsReceived=366 RXPacketsValid=366 TXPacketsReceived=0 TXPacketsValid=0 Uptime=27m38.738269s
  INFO Received uplink packets                  NbPackets=1
  INFO Received valid packets - sending them to the back-end NbValidPackets=1
  INFO Uplink message transmission successful.  CodingRate=4/5 DataRate=SF12BW125 Frequency=868100000 GatewayID=g1-test Modulation=LORA PayloadSize=22 RSSI=-114 SNR=-15.75
  INFO Sending status to the network server     CpuPercentage=0.3951885 FrequencyPlan=EU_863_870 Load1=0.13 Load15=0.16 Load5=0.17 MemoryPercentage=13.057792 RXPacketsReceived=367 RXPacketsValid=367 TXPacketsReceived=0 TXPacketsValid=0 Uptime=27m54.079429s
  INFO Sending status to the network server     CpuPercentage=0.3951936 FrequencyPlan=EU_863_870 Load1=0.1 Load15=0.16 Load5=0.16 MemoryPercentage=13.057792 RXPacketsReceived=367 RXPacketsValid=367 TXPacketsReceived=0 TXPacketsValid=0 Uptime=28m9.142213s
  INFO Sending status to the network server     CpuPercentage=0.39519858 FrequencyPlan=EU_863_870 Load1=0.08 Load15=0.15 Load5=0.16 MemoryPercentage=13.057792 RXPacketsReceived=367 RXPacketsValid=367 TXPacketsReceived=0 TXPacketsValid=0 Uptime=28m24.199578s
  INFO Sending status to the network server     CpuPercentage=0.39520383 FrequencyPlan=EU_863_870 Load1=0.34 Load15=0.17 Load5=0.21 MemoryPercentage=13.057792 RXPacketsReceived=367 RXPacketsValid=367 TXPacketsReceived=0 TXPacketsValid=0 Uptime=28m39.257196s
  INFO Received uplink packets                  NbPackets=1
  INFO Received valid packets - sending them to the back-end NbValidPackets=1
  INFO Uplink message transmission successful.  CodingRate=4/5 DataRate=SF12BW125 Frequency=868300000 GatewayID=g1-test Modulation=LORA PayloadSize=22 RSSI=-115 SNR=-13.5
  INFO Sending status to the network server     CpuPercentage=0.39520878 FrequencyPlan=EU_863_870 Load1=0.27 Load15=0.17 Load5=0.2 MemoryPercentage=13.061168 RXPacketsReceived=368 RXPacketsValid=368 TXPacketsReceived=0 TXPacketsValid=0 Uptime=28m54.325628s
  INFO Sending status to the network server     CpuPercentage=0.3952143 FrequencyPlan=EU_863_870 Load1=0.37 Load15=0.18 Load5=0.23 MemoryPercentage=13.061168 RXPacketsReceived=368 RXPacketsValid=368 TXPacketsReceived=0 TXPacketsValid=0 Uptime=29m9.388816s
  INFO Sending status to the network server     CpuPercentage=0.3952199 FrequencyPlan=EU_863_870 Load1=0.29 Load15=0.17 Load5=0.22 MemoryPercentage=13.061168 RXPacketsReceived=368 RXPacketsValid=368 TXPacketsReceived=0 TXPacketsValid=0 Uptime=29m24.454637s
  INFO Sending status to the network server     CpuPercentage=0.3952253 FrequencyPlan=EU_863_870 Load1=0.22 Load15=0.17 Load5=0.21 MemoryPercentage=13.061168 RXPacketsReceived=368 RXPacketsValid=368 TXPacketsReceived=0 TXPacketsValid=0 Uptime=29m39.510323s

After that, no more messages of the kind "INFO Received valid packets - sending them to the back-end NbValidPackets=1" appear.

No traffic is visible in the TTN console anymore.
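For anyone who wants to spot this condition from captured logs, here is a minimal sketch in Go. It assumes the status lines keep the `key=value` layout shown above and that `RXPacketsReceived` only ever increases while the gateway is healthy. The function names `parseFields` and `rxStalled` are hypothetical and are not part of the packet forwarder.

```go
package main

import (
	"strconv"
	"strings"
)

// parseFields extracts key=value pairs from one log line.
// Tokens without '=' (such as "INFO" or "Sending") are skipped.
func parseFields(line string) map[string]string {
	fields := make(map[string]string)
	for _, tok := range strings.Fields(line) {
		if i := strings.IndexByte(tok, '='); i > 0 {
			fields[tok[:i]] = tok[i+1:]
		}
	}
	return fields
}

// rxStalled reports whether RXPacketsReceived failed to increase between
// the first and last status line that carry the counter. It returns false
// when fewer than two lines contain a parsable counter.
func rxStalled(statusLines []string) bool {
	var counters []int
	for _, line := range statusLines {
		raw, ok := parseFields(line)["RXPacketsReceived"]
		if !ok {
			continue
		}
		n, err := strconv.Atoi(raw)
		if err != nil {
			continue
		}
		counters = append(counters, n)
	}
	if len(counters) < 2 {
		return false
	}
	return counters[len(counters)-1] <= counters[0]
}
```

Feeding `rxStalled` the status lines from a window long enough to normally contain uplinks (how long depends on local traffic) gives a rough signal that the gateway has gone quiet. A short window will also trigger on ordinary idle periods.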

egourlao commented 7 years ago

@tftelkamp also reported a similar issue with Multitech Conduits after the E&A fair. There is probably indeed an issue worth looking into with hours-long runs of the packet forwarder.

h0l0gram commented 7 years ago

Unfortunately, despite fix #34, this issue is still present in v2.0.2. Gateway g1-test stopped sending packets sometime in the last few hours.

egourlao commented 7 years ago

After a bit of investigation with @h0l0gram, using a build with extra logging, it appears that after a period of time (sometimes 8-9 hours, sometimes a few days), the packet forwarder stops getting new packets from the HAL. The packet forwarder itself doesn't freeze; it just doesn't receive any new messages:

  INFO Sending status to the network server     Altitude=450 CpuPercentage=1.5659744 FrequencyPlan=EU_863_870 Latitude=47.023174 Load1=0.06 Load15=0.09 Load5=0.1 Longitude=8.308729 MemoryPercentage=15.945021 RTT=58 RXPacketsReceived=4209 RXPacketsValid=7489 TXPacketsReceived=0 TXPacketsValid=0 Uptime=44m32.714006s
 DEBUG No packet received from the concentrator for 120 seconds ; uplink routine still safe

...and this repeats in a loop. We'll have to investigate what difference in behaviour in the SX1301 chip or the HAL causes the packet forwarder to receive no messages.

kruisdraad commented 7 years ago

+1, same issue

fanyujiang commented 6 years ago

It's the same problem