cyoung / stratux

Aviation weather and traffic receiver based on RTL-SDR.
BSD 3-Clause "New" or "Revised" License

message queue overflow - Traffic Disappearing. Stratux not respecting expired leases #814

Open dylanlive opened 3 years ago

dylanlive commented 3 years ago
  1. Stratux version: 1.6r1

  2. Stratux config:

    SDR

    • [ ] single
    • [x] dual

    GPS

    • [x] yes
    • [ ] no

    type: Internal - Stratux GPYes 2.0 u-blox 8 GPS unit

    AHRS

    • [ ] yes
    • [x] no

    power source: Anker PowerCore II Slim 10000

    usb cable: Anker included USB

  3. EFB app and version: FltPlan Go 4.7.26

    EFB platform: iOS 12.4.5

    EFB hardware: iPad Mini 2

  4. Description of your issue:

Several times I've noticed all traffic disappear for ~2 minutes and then reappear in FltPlan Go, usually about 60 minutes after booting. Reviewing the logs, I see a large number of message queue overflow errors across various IPs around that time. In reality, at most 3 devices should be connected.

I noticed in the dhcpd.leases file, some of the IPs had an expired lease. My hypothesis is Stratux is queuing messages for devices that haven't been connected for quite some time. Over a certain amount of time, perhaps the queue interferes with delivering messages to the actual connected devices?

I noticed getDHCPLeases() does not look at the ends property. It seems if it's in the dhcp lease file, it gets added. Perhaps the last line of defense is in refreshConnectedClients() when it tries to dialUDP() but I'm not sure if that returns an err if the client is disconnected.
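A minimal sketch of the kind of filter being suggested here, assuming the lease file has the layout shown in the excerpt below. It is not Stratux's actual `getDHCPLeases()`; the `lease` type and the `parseLeases` and `activeLeases` helpers are hypothetical names. It keeps the last entry written for each IP and then selects on `binding state` rather than on `ends`. This is only a heuristic, since the file records the state dhcpd last wrote, which may itself be stale.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// lease holds the few fields of a dhcpd.leases entry this sketch cares about.
type lease struct {
	IP       string
	Hostname string
	State    string // value of "binding state", e.g. "active" or "free"
}

// parseLeases reads dhcpd.leases-style text and returns the last entry seen
// for each IP, in order of first appearance.
func parseLeases(text string) []lease {
	latest := map[string]lease{}
	var order []string
	var cur *lease
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSuffix(strings.TrimSpace(sc.Text()), ";")
		f := strings.Fields(line)
		switch {
		case len(f) == 3 && f[0] == "lease" && f[2] == "{":
			cur = &lease{IP: f[1]}
		case cur == nil:
			// Outside any lease block (e.g. server-duid): ignore.
		case line == "}":
			if _, seen := latest[cur.IP]; !seen {
				order = append(order, cur.IP)
			}
			latest[cur.IP] = *cur
			cur = nil
		case len(f) == 3 && f[0] == "binding" && f[1] == "state":
			cur.State = f[2]
		case len(f) == 2 && f[0] == "client-hostname":
			cur.Hostname = strings.Trim(f[1], "\"")
		}
	}
	out := make([]lease, 0, len(order))
	for _, ip := range order {
		out = append(out, latest[ip])
	}
	return out
}

// activeLeases keeps only entries whose binding state is "active".
func activeLeases(all []lease) []lease {
	var out []lease
	for _, l := range all {
		if l.State == "active" {
			out = append(out, l)
		}
	}
	return out
}

func main() {
	sample := "lease 192.168.10.13 {\n  binding state free;\n}\n" +
		"lease 192.168.10.10 {\n  binding state active;\n  client-hostname \"Dylans-Air\";\n}\n"
	for _, l := range activeLeases(parseLeases(sample)) {
		fmt.Printf("%s (%s)\n", l.IP, l.Hostname)
	}
}
```

Applied to the excerpt below, a filter like this would keep only 192.168.10.10 and 192.168.10.21, the two entries marked `binding state active`.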

I've seen this issue comment https://github.com/cyoung/stratux/issues/536#issuecomment-263689784, which suggests this is normal. However, I wanted to confirm, because as best I can tell this is when traffic disappeared on my iPad.

stratux.log excerpt around the time of the issue: https://gist.github.com/dylanlive/5e63ff55132f88896ab10ddd59c48adf

stratux.log excerpt showing many connected IPs, far more than the devices in the cockpit:

2020/06/23 22:17:05 client connected: 192.168.10.13:4000 (iPad-3).
2020/06/23 22:17:05 client connected: 192.168.10.14:4000 ().
2020/06/23 22:17:05 client connected: 192.168.10.12:4000 ().
2020/06/23 22:17:05 client connected: 192.168.10.10:4000 ().
2020/06/23 22:17:05 client connected: 192.168.10.20:4000 ().
2020/06/23 22:17:05 client connected: 192.168.10.17:4000 ().
2020/06/23 22:17:05 client connected: 192.168.10.15:4000 ().
2020/06/23 22:17:05 client connected: 192.168.10.16:4000 ().
2020/06/23 22:17:05 client connected: 192.168.10.19:4000 ().
2020/06/23 22:17:05 client connected: 192.168.10.18:4000 ().
2020/06/23 22:17:05 client connected: 192.168.10.11:4000 (Dylans-iPad).

dhcpd.leases excerpt

lease 192.168.10.13 {
  starts 4 2020/06/04 02:17:45;
  ends 4 2020/06/04 02:19:45;
  tstp 4 2020/06/04 02:19:45;
  cltt 4 2020/06/04 02:17:45;
  binding state free;
  hardware ethernet <removed>;
  uid <removed>;
  client-hostname "iPad-3";
}
lease 192.168.10.17 {
  starts 4 2020/06/04 21:19:07;
  ends 5 2020/06/05 00:39:07;
  tstp 5 2020/06/05 00:39:07;
  cltt 4 2020/06/04 21:19:07;
  binding state free;
  hardware ethernet <removed>
  uid <removed>
}
lease 192.168.10.14 {
  starts 4 2020/06/04 21:25:31;
  ends 5 2020/06/05 00:45:31;
  tstp 5 2020/06/05 00:45:31;
  cltt 4 2020/06/04 21:25:31;
  binding state free;
  hardware ethernet <removed>
  uid <removed>
}
lease 192.168.10.15 {
  starts 4 2020/06/04 21:30:49;
  ends 5 2020/06/05 00:50:49;
  tstp 5 2020/06/05 00:50:49;
  cltt 4 2020/06/04 21:30:49;
  binding state free;
  hardware ethernet <removed>
  uid <removed>
}
lease 192.168.10.16 {
  starts 4 2020/06/04 21:32:37;
  ends 5 2020/06/05 00:52:37;
  tstp 5 2020/06/05 00:52:37;
  cltt 4 2020/06/04 21:32:37;
  binding state free;
  hardware ethernet <removed>
  uid <removed>
}
lease 192.168.10.19 {
  starts 4 2020/06/11 19:03:35;
  ends 4 2020/06/11 22:23:35;
  tstp 4 2020/06/11 22:23:35;
  cltt 4 2020/06/11 19:30:33;
  binding state free;
  hardware ethernet <removed>
  uid <removed>
}
lease 192.168.10.12 {
  starts 4 2020/06/11 22:36:29;
  ends 5 2020/06/12 01:56:29;
  tstp 5 2020/06/12 01:56:29;
  cltt 4 2020/06/11 22:36:29;
  binding state free;
  hardware ethernet <removed>
  uid <removed>
}
lease 192.168.10.20 {
  starts 4 2020/06/18 21:20:57;
  ends 5 2020/06/19 00:40:57;
  tstp 5 2020/06/19 00:40:57;
  cltt 4 2020/06/18 21:20:57;
  binding state free;
  hardware ethernet <removed>;
  uid <removed>;
}
lease 192.168.10.18 {
  starts 2 2020/06/23 21:59:03;
  ends 3 2020/06/24 01:06:18;
  tstp 3 2020/06/24 01:06:18;
  cltt 2 2020/06/23 21:59:05;
  binding state free;
  hardware ethernet <removed>
  uid <removed>
}
lease 192.168.10.11 {
  starts 2 2020/06/23 23:07:55;
  ends 3 2020/06/24 02:27:55;
  tstp 3 2020/06/24 02:27:55;
  cltt 2 2020/06/23 23:07:55;
  binding state free;
  hardware ethernet <removed>
  uid <removed>
}
lease 192.168.10.10 {
  starts 3 2020/06/24 03:44:33;
  ends 3 2020/06/24 07:04:33;
  tstp 3 2020/06/24 07:04:33;
  cltt 3 2020/06/24 03:44:33;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet <removed>
  uid <removed>
  client-hostname "Dylans-Air";
}
lease 192.168.10.21 {
  starts 3 2020/06/24 03:48:05;
  ends 3 2020/06/24 07:08:05;
  tstp 3 2020/06/24 07:08:05;
  cltt 3 2020/06/24 03:48:05;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet <removed>
  uid <removed>
}
server-duid <removed>
}
cyoung commented 3 years ago

@dylanlive -

> I noticed in the dhcpd.leases file, some of the IPs had an expired lease. My hypothesis is Stratux is queuing messages for devices that haven't been connected for quite some time. Over a certain amount of time, perhaps the queue interferes with delivering messages to the actual connected devices?

> I noticed getDHCPLeases() does not look at the ends property. It seems if it's in the dhcp lease file, it gets added. Perhaps the last line of defense is in refreshConnectedClients() when it tries to dialUDP() but I'm not sure if that returns an err if the client is disconnected.

That's right. The ends property may or may not be valid relative to the system time, since the system time depends on obtaining the real time from the GPS unit.

The message queues are allocated on a per-client (DHCP lease) basis. So if traffic is disappearing on you, it's likely not related to message queues overflowing for the dead leases. Best bet would be to modify the leases file and repeat the test.
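The isolation described above can be sketched as follows. This is a simplified model, not Stratux's implementation: the `client` type, `enqueue`, `broadcast`, and the queue depth of 2 are all hypothetical. Each client has its own bounded queue, so an undrained queue for a dead lease overflows and drops its own messages without delaying delivery to a live client.

```go
package main

import "fmt"

// client models one destination with its own bounded outgoing queue.
type client struct {
	addr    string
	queue   chan []byte
	dropped int // messages discarded because this client's queue was full
}

func newClient(addr string, depth int) *client {
	return &client{addr: addr, queue: make(chan []byte, depth)}
}

// enqueue performs a non-blocking send. When the queue is full the message
// is dropped for this client only, and the caller is never blocked.
func (c *client) enqueue(msg []byte) bool {
	select {
	case c.queue <- msg:
		return true
	default:
		c.dropped++
		return false
	}
}

// broadcast offers the same message to every client independently.
func broadcast(clients []*client, msg []byte) {
	for _, c := range clients {
		c.enqueue(msg)
	}
}

func main() {
	dead := newClient("192.168.10.13:4000", 2) // never drained
	live := newClient("192.168.10.11:4000", 2) // drained after each send
	clients := []*client{dead, live}

	for i := 0; i < 5; i++ {
		broadcast(clients, []byte("traffic report"))
		<-live.queue // simulate the live client's sender goroutine
	}
	fmt.Printf("%s dropped %d\n", dead.addr, dead.dropped)
	fmt.Printf("%s dropped %d\n", live.addr, live.dropped)
}
```

In this model the overflow count grows only for the client that is never drained, which matches the claim that overflow messages for dead leases should not by themselves explain traffic disappearing on a connected device.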

dylanlive commented 3 years ago

@cyoung Will do and report back. I'm still suspicious because my traffic issues seemed to start when those message queue overflow log messages started. I might try recording the Stratux dashboard to rule out a third-party app issue. Maybe it's GPS-related too.