genodelabs / genode

Genode OS Framework
https://genode.org/
Other
1.07k stars 252 forks source link

nic_router.run: sporadic failures due to UDP-socket creation getting stuck #3992

Closed m-stein closed 1 year ago

m-stein commented 3 years ago

Sometimes, the nic_router test fails without errors. It turned out that normally, the failing test is test 2 with one UDP client/server pair. The test seems to get stuck during or shortly after initialization of the two libc components. I've created a recipe for the depot autopilot and stripped it down to test 2 in order to achieve better reproducability. This is the log output when it fails:

Genode 20.11-81-g34fd2c59f9 <local changes>
720 MiB RAM and 63253 caps assigned to init
[init -> vfs] tar archive 'depot.tar' local at 0x162000, size is 5345280
[init -> depot_query] Warning: attempt to handle the same signal context twice (nested)
[init -> depot_autopilot] 
[init -> depot_autopilot] --- Run "test-nic_router" (max 10 sec) ---
[init -> depot_autopilot] 
[init -> depot_autopilot] 1.665 [init -> nic_router] [uplink] static IP config: interface 10.0.2.55/24, gateway 10.0.2.1 P2P 0
[init -> depot_autopilot] 1.666 [init -> nic_router] [uplink] NIC sessions: 0
[init -> depot_autopilot] 1.670 [init -> nic_router] [t1_d1] static IP config: interface 10.0.98.33/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.706 [init -> nic_bridge] --- NIC bridge started (mac=01:02:03:04:05:06) ---
[init -> depot_autopilot] 1.707 [init -> nic_router] [t1_d1] NIC sessions: 0
[init -> depot_autopilot] 1.710 [init -> nic_router] [t1_d2] static IP config: interface 10.0.99.33/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.711 [init -> nic_router] [t1_d2] NIC sessions: 0
[init -> depot_autopilot] 1.711 [init -> nic_router] [t2_d1] static IP config: interface 18.17.16.14/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.712 [init -> nic_router] [t2_d1] NIC sessions: 0
[init -> depot_autopilot] 1.714 [init -> nic_router] [t3_d1] static IP config: interface 217.13.192.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.714 [init -> nic_router] [t3_d1] NIC sessions: 0
[init -> depot_autopilot] 1.716 [init -> nic_router] [t4_d1] static IP config: interface 192.168.1.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.717 [init -> nic_router] [t4_d1] NIC sessions: 0
[init -> depot_autopilot] 1.748 [init -> nic_router] [t5_d1] static IP config: interface 192.168.2.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.749 [init -> nic_router] [t5_d1] NIC sessions: 0
[init -> depot_autopilot] 1.750 [init -> nic_router] [t6_d1] static IP config: interface 100.200.0.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.750 [init -> nic_router] [t6_d1] NIC sessions: 0
[init -> depot_autopilot] 1.751 [init -> nic_router] [t7_d1] static IP config: interface 100.200.0.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.751 [init -> nic_router] [t7_d1] NIC sessions: 0
[init -> depot_autopilot] 1.752 [init -> nic_router] [t7_d2] static IP config: interface 10.13.78.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.753 [init -> nic_router] [t7_d2] NIC sessions: 0
[init -> depot_autopilot] 1.938 [init -> nic_bridge] vmac = 02:02:02:02:02:00 ip = 10.0.2.55
[init -> depot_autopilot] 1.967 [init -> nic_bridge] vmac = 02:02:02:02:02:01 ip = 10.0.2.212
[init -> depot_autopilot] 1.978 [init -> nic_router] [uplink] NIC sessions: 1
[init -> depot_autopilot] 1.999 [init -> t2_d0_c1_udp] lwIP Nic interface down
[init -> depot_autopilot] 2.015 [init -> t2_d0_c1_udp] lwIP Nic interface up address=10.0.2.212 netmask=0.0.0.0 gateway=0.0.0.0
[init -> depot_autopilot] 2.062 [init -> nic_router] [t2_d1] NIC sessions: 1
[init -> depot_autopilot] 2.075 [init -> report_rom] report 'nic_router -> state'
[init -> depot_autopilot] 2.076 [init -> report_rom]   <state>
[init -> depot_autopilot] 2.076 [init -> report_rom]   <domain name="t1_d1" rx_bytes="0" tx_bytes="0" ipv4="10.0.98.33/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 2.077 [init -> report_rom]   <domain name="t1_d2" rx_bytes="0" tx_bytes="0" ipv4="10.0.99.33/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 2.077 [init -> report_rom]   <domain name="t2_d1" rx_bytes="0" tx_bytes="0" ipv4="18.17.16.14/24" gw="0.0.0.0">
[init -> depot_autopilot] 2.078 [init -> report_rom]   <interface label="t2_d1_s1_udp -> lwip" link_state="true"/>
[init -> depot_autopilot] 2.078 [init -> report_rom]   </domain>
[init -> depot_autopilot] 2.079 [init -> report_rom]   <domain name="t3_d1" rx_bytes="0" tx_bytes="0" ipv4="217.13.192.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 2.079 [init -> report_rom]   <domain name="t4_d1" rx_bytes="0" tx_bytes="0" ipv4="192.168.1.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 2.080 [init -> report_rom]   <domain name="t5_d1" rx_bytes="0" tx_bytes="0" ipv4="192.168.2.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 2.081 [init -> report_rom]   <domain name="t6_d1" rx_bytes="0" tx_bytes="0" ipv4="100.200.0.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 2.081 [init -> report_rom]   <domain name="t7_d1" rx_bytes="0" tx_bytes="0" ipv4="100.200.0.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 2.082 [init -> report_rom]   <domain name="t7_d2" rx_bytes="0" tx_bytes="0" ipv4="10.13.78.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 2.096 [init -> report_rom]   <domain name="uplink" rx_bytes="0" tx_bytes="0" ipv4="10.0.2.55/24" gw="10.0.2.1">
[init -> depot_autopilot] 2.097 [init -> report_rom]   <interface label="" link_state="true"/>
[init -> depot_autopilot] 2.098 [init -> report_rom]   </domain>
[init -> depot_autopilot] 2.099 [init -> report_rom]   </state>
[init -> depot_autopilot] 2.103 [init -> t2_d1_s1_udp] lwIP Nic interface down
[init -> depot_autopilot] 2.105 [init -> t2_d1_s1_udp] lwIP Nic interface up address=18.17.16.15 netmask=0.0.0.0 gateway=0.0.0.0
[init -> depot_autopilot] 2.109 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:01 18.17.16.15 > 00:00:00:00:00:00 18.17.16.15 cmd 1
[init -> depot_autopilot] 2.135 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > ff:ff:ff:ff:ff:ff IPV4 18.17.16.15 > 255.255.255.255 UDP 68 > 67 DHCP 02:02:02:02:02:01 1
[init -> depot_autopilot] 2.194 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:00 18.17.16.14 > ff:ff:ff:ff:ff:ff 18.17.16.15 cmd 1
[init -> depot_autopilot] 2.207 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:00 18.17.16.14 > ff:ff:ff:ff:ff:ff 18.17.16.15 cmd 1
[init -> depot_autopilot] 2.208 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 ARP 02:02:02:02:02:01 18.17.16.15 > 02:02:02:02:02:00 18.17.16.14 cmd 2
[init -> depot_autopilot] 2.210 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49218 > 1 
[init -> depot_autopilot] 2.211 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49218 > 1 
[init -> depot_autopilot] 2.223 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 ARP 02:02:02:02:02:01 18.17.16.15 > 02:02:02:02:02:00 18.17.16.14 cmd 2
[init -> depot_autopilot] 2.224 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 ICMP 3 3
[init -> depot_autopilot] 2.225 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 ICMP 3 3
[init -> depot_autopilot] 2.228 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 ICMP 3 3
[init -> depot_autopilot] 2.233 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 ICMP 3 3

A successful run, at the other hand, would look like this:

Genode 20.11-81-g34fd2c59f9 <local changes>
720 MiB RAM and 63253 caps assigned to init
[init -> vfs] tar archive 'depot.tar' local at 0x162000, size is 5345280
[init -> depot_autopilot] 
[init -> depot_autopilot] --- Run "test-nic_router" (max 360 sec) ---
[init -> depot_autopilot] 
[init -> depot_autopilot] 1.215 [init -> nic_router] [uplink] static IP config: interface 10.0.2.55/24, gateway 10.0.2.1 P2P 0
[init -> depot_autopilot] 1.216 [init -> nic_router] [uplink] NIC sessions: 0
[init -> depot_autopilot] 1.220 [init -> nic_router] [t1_d1] static IP config: interface 10.0.98.33/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.220 [init -> nic_router] [t1_d1] NIC sessions: 0
[init -> depot_autopilot] 1.245 [init -> nic_router] [t1_d2] static IP config: interface 10.0.99.33/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.246 [init -> nic_router] [t1_d2] NIC sessions: 0
[init -> depot_autopilot] 1.249 [init -> nic_router] [t2_d1] static IP config: interface 18.17.16.14/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.249 [init -> nic_router] [t2_d1] NIC sessions: 0
[init -> depot_autopilot] 1.252 [init -> nic_router] [t3_d1] static IP config: interface 217.13.192.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.252 [init -> nic_router] [t3_d1] NIC sessions: 0
[init -> depot_autopilot] 1.254 [init -> nic_router] [t4_d1] static IP config: interface 192.168.1.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.255 [init -> nic_router] [t4_d1] NIC sessions: 0
[init -> depot_autopilot] 1.308 [init -> nic_router] [t5_d1] static IP config: interface 192.168.2.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.308 [init -> nic_router] [t5_d1] NIC sessions: 0
[init -> depot_autopilot] 1.309 [init -> nic_router] [t6_d1] static IP config: interface 100.200.0.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.310 [init -> nic_router] [t6_d1] NIC sessions: 0
[init -> depot_autopilot] 1.311 [init -> nic_router] [t7_d1] static IP config: interface 100.200.0.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.311 [init -> nic_router] [t7_d1] NIC sessions: 0
[init -> depot_autopilot] 1.312 [init -> nic_router] [t7_d2] static IP config: interface 10.13.78.1/24, gateway 0.0.0.0 P2P 0
[init -> depot_autopilot] 1.313 [init -> nic_router] [t7_d2] NIC sessions: 0
[init -> depot_autopilot] 1.319 [init -> nic_bridge] --- NIC bridge started (mac=01:02:03:04:05:06) ---
[init -> depot_autopilot] 1.478 [init -> nic_bridge] vmac = 02:02:02:02:02:00 ip = 10.0.2.55
[init -> depot_autopilot] 1.488 [init -> nic_bridge] vmac = 02:02:02:02:02:01 ip = 10.0.2.212
[init -> depot_autopilot] 1.501 [init -> t2_d0_c1_udp] lwIP Nic interface down
[init -> depot_autopilot] 1.514 [init -> t2_d0_c1_udp] lwIP Nic interface up address=10.0.2.212 netmask=0.0.0.0 gateway=0.0.0.0
[init -> depot_autopilot] 1.539 [init -> nic_router] [uplink] NIC sessions: 1
[init -> depot_autopilot] 1.602 [init -> report_rom] report 'nic_router -> state'
[init -> depot_autopilot] 1.615 [init -> report_rom]   <state>
[init -> depot_autopilot] 1.616 [init -> report_rom]   <domain name="t1_d1" rx_bytes="0" tx_bytes="0" ipv4="10.0.98.33/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 1.616 [init -> report_rom]   <domain name="t1_d2" rx_bytes="0" tx_bytes="0" ipv4="10.0.99.33/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 1.617 [init -> report_rom]   <domain name="t2_d1" rx_bytes="0" tx_bytes="0" ipv4="18.17.16.14/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 1.617 [init -> report_rom]   <domain name="t3_d1" rx_bytes="0" tx_bytes="0" ipv4="217.13.192.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 1.618 [init -> report_rom]   <domain name="t4_d1" rx_bytes="0" tx_bytes="0" ipv4="192.168.1.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 1.619 [init -> report_rom]   <domain name="t5_d1" rx_bytes="0" tx_bytes="0" ipv4="192.168.2.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 1.619 [init -> report_rom]   <domain name="t6_d1" rx_bytes="0" tx_bytes="0" ipv4="100.200.0.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 1.620 [init -> report_rom]   <domain name="t7_d1" rx_bytes="0" tx_bytes="0" ipv4="100.200.0.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 1.620 [init -> report_rom]   <domain name="t7_d2" rx_bytes="0" tx_bytes="0" ipv4="10.13.78.1/24" gw="0.0.0.0"/>
[init -> depot_autopilot] 1.621 [init -> report_rom]   <domain name="uplink" rx_bytes="0" tx_bytes="0" ipv4="10.0.2.55/24" gw="10.0.2.1">
[init -> depot_autopilot] 1.621 [init -> report_rom]   <interface label="" link_state="true"/>
[init -> depot_autopilot] 1.622 [init -> report_rom]   </domain>
[init -> depot_autopilot] 1.622 [init -> report_rom]   </state>
[init -> depot_autopilot] 1.665 [init -> nic_router] [t2_d1] NIC sessions: 1
[init -> depot_autopilot] 1.674 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:00 18.17.16.14 > ff:ff:ff:ff:ff:ff 181
[init -> depot_autopilot] 1.676 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:00 18.17.16.14 > ff:ff:ff:ff:ff:ff 181
[init -> depot_autopilot] 1.683 [init -> t2_d1_s1_udp] lwIP Nic interface down
[init -> depot_autopilot] 1.686 [init -> t2_d1_s1_udp] lwIP Nic interface up address=18.17.16.15 netmask=0.0.0.0 gateway=0.0.0.0
[init -> depot_autopilot] 1.689 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:01 18.17.16.15 > 00:00:00:00:00:00 181
[init -> depot_autopilot] 1.690 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > ff:ff:ff:ff:ff:ff IPV4 18.17.16.15 > 255.255.255.255 UDP 68 > 67 DHCP 02:01
[init -> depot_autopilot] 1.734 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 ARP 02:02:02:02:02:01 18.17.16.15 > 02:02:02:02:02:00 182
[init -> depot_autopilot] 1.746 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49211 > 1 
[init -> depot_autopilot] 1.748 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49211 > 1 
[init -> depot_autopilot] 1.749 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 ARP 02:02:02:02:02:01 18.17.16.15 > 02:02:02:02:02:00 182
[init -> depot_autopilot] 1.760 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49211 
[init -> depot_autopilot] 1.763 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49211 
[init -> depot_autopilot] 1.771 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49211 
[init -> depot_autopilot] 1.775 [init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> depot_autopilot] 1.794 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49212 > 1 
[init -> depot_autopilot] 1.797 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49212 
[init -> depot_autopilot] 1.803 [init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> depot_autopilot] 1.816 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49213 > 1 
[init -> depot_autopilot] 1.818 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49213 
[init -> depot_autopilot] 1.820 [init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> depot_autopilot] 1.836 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49214 > 1 
[init -> depot_autopilot] 1.838 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49214 
[init -> depot_autopilot] 1.840 [init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> depot_autopilot] 1.854 [init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49215 > 1 
[init -> depot_autopilot] 1.855 [init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49215 
[init -> depot_autopilot] 1.858 [init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> depot_autopilot] 1.859 [init -> t2_d0_c1_udp] Test done
m-stein commented 3 years ago

On my 3992_nic_router_run_sporadic_failures branch I've created a test that triggers the failure pretty fast. I found that simply re-creating the server socket after each successful transmission causes the system to get stuck at some point, given that the client just keeps sending endlessly. In contrast, when not re-creating the server socket while the client sends endlessly doesn't trigger the fault except it was triggered before the first transmission. This indicates that the problem is related to socket creation at the server side.

Furthermore, the log shows that the server eventually is responding with an ICMP 3 3 packet which means "destination unreachable -> port unreachable" although the dst port remained the same:

Genode 20.11-99-gf97bb2bd17
467 MiB RAM and 63253 caps assigned to init
[init -> acpi_drv] Found MADT
[init -> acpi_drv] MADT IRQ 0 -> GSI 2 flags: 0
[init -> acpi_drv] MADT IRQ 5 -> GSI 5 flags: 13
[init -> acpi_drv] MADT IRQ 9 -> GSI 9 flags: 13
[init -> acpi_drv] MADT IRQ 10 -> GSI 10 flags: 13
[init -> acpi_drv] MADT IRQ 11 -> GSI 11 flags: 13
[init -> acpi_drv] Found MCFG
[init -> acpi_drv] MCFG BASE 0xb0000000 seg 0x0 bus 0x0-0xff
[init -> acpi_drv] RSDT OEM 'BOCHS ', table id 'BXPCRSDT', revision 1, creator 'BXPC' (1)
[init -> acpi_drv] SMBIOS table (entry point: 0x175b00 structures: 0xf5b20)
[init -> platform_drv] ECAM/MMCONF range 00000000:00000000.0-000000ff:0000001f.7 - addr [00000000b0000000,00000000c0000000)
[init -> platform_drv] Root bridge: 00000000:00000000.0
[init -> nic_router] [uplink] static IP config: interface 10.0.2.55/24, gateway 10.0.2.1 P2P 0
[init -> nic_router] [uplink] NIC sessions: 0
[init -> nic_router] [t2_d1] static IP config: interface 18.17.16.14/24, gateway 0.0.0.0 P2P 0
[init -> nic_router] [t2_d1] NIC sessions: 0
[init -> nic_bridge] --- NIC bridge started (mac=01:02:03:04:05:06) ---
[init -> nic_bridge] vmac = 02:02:02:02:02:00 ip = 10.0.2.55
[init -> nic_router] [uplink] NIC sessions: 1
[init -> nic_bridge] vmac = 02:02:02:02:02:01 ip = 10.0.2.212
[init -> t2_d0_c1_udp] lwIP Nic interface down
[init -> t2_d0_c1_udp] lwIP Nic interface up address=10.0.2.212 netmask=0.0.0.0 gateway=0.0.0.0
[init -> nic_router] [t2_d1] NIC sessions: 1
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > ff:ff:ff:ff:ff:ff IPV4 10.0.2.212 > 255.255.255.255 UDP 68 > 67 DHCP 02:02:02:02:02:01 > 01
[init -> nic_router] [uplink] rcv ETH 01:02:03:04:05:06 > ff:ff:ff:ff:ff:ff IPV4 10.0.2.212 > 255.255.255.255 UDP 68 > 67 DHCP 02:02:02:02:02:01 > 01
[init -> t2_d1_s1_udp] lwIP Nic interface down
[init -> t2_d1_s1_udp] lwIP Nic interface up address=18.17.16.15 netmask=0.0.0.0 gateway=0.0.0.0
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:01 18.17.16.15 > 00:00:00:00:00:00 18.17.16.15 cmd 1
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > ff:ff:ff:ff:ff:ff IPV4 18.17.16.15 > 255.255.255.255 UDP 68 > 67 DHCP 02:02:02:02:02:01 > 01
[init -> t2_d1_s1_udp] Error: starting server loop #1
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:01 10.0.2.212 > 00:00:00:00:00:00 10.0.2.55 cmd 1
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 ARP 02:02:02:02:02:00 10.0.2.55 > 02:02:02:02:02:01 10.0.2.212 cmd 2
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 01:02:03:04:05:06 IPV4 10.0.2.212 > 10.0.2.55 UDP 49188 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:00 18.17.16.14 > ff:ff:ff:ff:ff:ff 18.17.16.15 cmd 1
[init -> nic_router] [uplink] rcv ETH 01:02:03:04:05:06 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49188 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:00 18.17.16.14 > ff:ff:ff:ff:ff:ff 18.17.16.15 cmd 1
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 ARP 02:02:02:02:02:01 18.17.16.15 > 02:02:02:02:02:00 18.17.16.14 cmd 2
[init -> nic_router] [uplink] rcv ETH 01:02:03:04:05:06 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49188 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49188 > 1 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 01:02:03:04:05:06 IPV4 10.0.2.212 > 10.0.2.55 UDP 49188 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49188 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #2
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 ARP 02:02:02:02:02:01 18.17.16.15 > 02:02:02:02:02:00 18.17.16.14 cmd 2
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49188 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:00 10.0.2.55 > ff:ff:ff:ff:ff:ff 10.0.2.212 cmd 1
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > ff:ff:ff:ff:ff:ff ARP 02:02:02:02:02:00 10.0.2.55 > ff:ff:ff:ff:ff:ff 10.0.2.212 cmd 1
[init -> nic_router] [uplink] rcv ETH 01:02:03:04:05:06 > 02:02:02:02:02:00 ARP 01:02:03:04:05:06 10.0.2.212 > 02:02:02:02:02:00 10.0.2.55 cmd 2
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49188 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49188 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49188 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 ARP 02:02:02:02:02:01 10.0.2.212 > 02:02:02:02:02:00 10.0.2.55 cmd 2
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 ICMP 3 3
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:01 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 ICMP 3 3
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 ICMP 3 3
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49189 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49189 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #3
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49189 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49189 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49189 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49190 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49190 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #4
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49190 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49190 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49190 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49191 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49191 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #5
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49191 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49191 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49191 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49192 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49192 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #6
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49192 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49192 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49192 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49193 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49193 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #7
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49193 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49193 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49193 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49194 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49194 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #8
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49194 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49194 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49194 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49195 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49195 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #9
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49195 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49195 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49195 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49196 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49196 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #10
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49196 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49196 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49196 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49197 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49197 > 1 
[init -> t2_d1_s1_udp] Error: starting server loop #11
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 UDP 1 > 49197 
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49197 
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:00 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 UDP 1 > 49197 
[init -> t2_d0_c1_udp] Received "UDP server at 10.0.2.55:1 ..."
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 10.0.2.212 > 10.0.2.55 UDP 49198 > 1 
[init -> nic_router] [t2_d1] snd ETH 02:02:02:02:02:00 > 02:02:02:02:02:01 IPV4 10.0.2.212 > 18.17.16.15 UDP 49198 > 1 
[init -> nic_router] [t2_d1] rcv ETH 02:02:02:02:02:01 > 02:02:02:02:02:00 IPV4 18.17.16.15 > 10.0.2.212 ICMP 3 3
[init -> nic_router] [uplink] snd ETH 02:02:02:02:02:01 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 ICMP 3 3
[init -> nic_router] [uplink] rcv ETH 02:02:02:02:02:01 > 01:02:03:04:05:06 IPV4 10.0.2.55 > 10.0.2.212 ICMP 3 3
m-stein commented 3 years ago

Btw., the number of socket re-creations at the server side (i.e. number of server loops / transmissions) at which the error occurs always differs but is normally below 100 AFAIS.

chelmuth commented 3 years ago

Thanks, let's sum this up. It's seems in some cases UDP socket creation or port binding gets stuck in the libc. So, the IP stack is still responsive (see the ICMP packets) but the application makes no progress. I think we can pick up this issue with your findings later.

chelmuth commented 3 years ago

This issue may be related to the complex PCB handling mentioned in #3835.

nfeske commented 1 year ago

The test looks good now.

m-stein commented 1 year ago

@nfeske I re-enabled test 2 in the run script ...

proc enable_test_2 { } { return 1 }

... and called this ...

x86_64$ while make run/nic_router KERNEL=nova BOARD=pc; do :; done

... and it still triggers this issue after a hand full of iterations. Should I re-open the issue?

nfeske commented 1 year ago

Thank you @m-stein for pointing this out.

I must admit that I'm divided. On the one hand, the failed test points at a possible deficiency, which would - in principle - call for investigation. On the other hand, the issue remained stale for two years. Apparently nobody was bothered enough by it during that time. By being present yet unattended, it merely remains as a distraction for those who try to keep track of the open issues and monitor the results of our automated tests.

Should I re-open the issue?

If you are going to attack it, this would be the best way. If you don't, and nobody else does, I'd remove it from our view to keep the noise at bay. How would you decide?

m-stein commented 1 year ago

Thanks for your detailed feedback! I think we should go with the approach you described. I'll leave the issue closed and re-open it as soon as I want to tackle it. The problem remains documented in the run script and isn't related to the main objective of the test anyway.