kytos-ng / topology

Kytos Main Topology Network Application (NApp)
https://kytos-ng.github.io/api/topology.html
MIT License
0 stars 8 forks source link

[2022.3] bug: `kytos/topology.link_down` has been sent twice #195

Closed viniarck closed 8 months ago

viniarck commented 8 months ago

@italovalcy has observed this issue in prod running on 2022.3:

2024-03-21 12:13:18,066 - INFO [kytos.napps.kytos/of_core] [main.py:652:update_port_status] (MainThread) PortStatus modified interface 00:00:00:00:00:28:00:01:2 state OFPPS_LINK_DOWN
2024-03-21 12:13:18,093 - INFO [kytos.napps.kytos/of_core] [main.py:652:update_port_status] (MainThread) PortStatus modified interface 00:00:00:00:00:28:00:01:2 state OFPPS_LIVE
2024-03-21 12:13:18,098 - INFO [kytos.napps.kytos/of_core] [main.py:652:update_port_status] (MainThread) PortStatus modified interface 00:00:00:00:00:32:00:04:1 state OFPPS_LINK_DOWN
2024-03-21 12:13:18,127 - INFO [kytos.napps.kytos/of_core] [main.py:652:update_port_status] (MainThread) PortStatus modified interface 00:00:00:00:00:32:00:04:1 state OFPPS_LIVE
2024-03-21 12:13:22,309 - INFO [kytos.napps.kytos/mef_eline] [main.py:674:handle_link_down] (thread_pool_app_781) Event handle_link_down Link(Interface('novi_port_1', 1, Switch('00:00:00:00:00:32:00:04')), Interface('novi_port_2', 2, Switch('00:00:00:00:00:28:00:01')))
2024-03-21 12:13:22,325 - INFO [kytos.napps.kytos/mef_eline] [main.py:674:handle_link_down] (thread_pool_app_664) Event handle_link_down Link(Interface('novi_port_1', 1, Switch('00:00:00:00:00:32:00:04')), Interface('novi_port_2', 2, Switch('00:00:00:00:00:28:00:01')))
2024-03-21 12:13:57,962 - INFO [kytos.napps.kytos/mef_eline] [main.py:660:handle_link_up] (thread_pool_app_935) Event handle_link_up Link(Interface('novi_port_1', 1, Switch('00:00:00:00:00:32:00:04')), Interface('novi_port_2', 2, Switch('00:00:00:00:00:28:00:01')))

We've had some race condition fixes on 2023.1, but let's make sure to see if this still can happen or not. I'll look into it.

viniarck commented 8 months ago

@italovalcy, I think on 2022.3.0 this was indeed just a consequence of the fact that we didn't notify only once stronger guarantees and here's what happened from what I can follow with the propvided logs:

I've also explored locally and tried to reproduce it on 2023.2 (trying to follow a similar pattern), but since we have stronger notify only once guarantees there it doesn't happen a we'd expect:

sudo ip link set dev s1-eth4 down
sudo ip link set dev s1-eth4 up
sudo ip link set dev s3-eth3 down
sudo ip link set dev s3-eth3 up
kytos $> 2024-03-22 12:23:18,191 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:01:4 state 0                                         
2024-03-22 12:23:18,191 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:01:4 state OFPPS_LINK_DOWN
2024-03-22 12:23:18,192 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:03:3 state 0
2024-03-22 12:23:18,192 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:03:3 state OFPPS_LINK_DOWN
2024-03-22 12:23:18,195 - INFO [kytos.napps.kytos/mef_eline] (dynamic_single_0) Event handle_link_down Link(Interface('s3-eth3', 3, Switch('00:00:00:00:00:00:00:03')), Interface('s1-eth
4', 4, Switch('00:00:00:00:00:00:00:01')), c8b55359990f89a5849813dc348d30e9e1f991bad1dcb7f82112bd35429d9b07)
2024-03-22 12:23:18,202 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:01:4 state OFPPS_LIVE
2024-03-22 12:23:18,203 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:03:3 state OFPPS_LIVE
2024-03-22 12:23:18,212 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:01:4 state 0
2024-03-22 12:23:18,213 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:01:4 state OFPPS_LINK_DOWN
2024-03-22 12:23:18,214 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:03:3 state 0
2024-03-22 12:23:18,215 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:03:3 state OFPPS_LINK_DOWN
2024-03-22 12:23:18,221 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:01:4 state OFPPS_LIVE
2024-03-22 12:23:18,222 - INFO [kytos.napps.kytos/of_core] (MainThread) PortStatus modified interface 00:00:00:00:00:00:00:03:3 state OFPPS_LIVE
kytos $>                                                                                                                                                                                 

kytos $> 2024-03-22 12:23:28,231 - INFO [kytos.napps.kytos/mef_eline] (thread_pool_app_2) Event handle_link_up Link(Interface('s3-eth3', 3, Switch('00:00:00:00:00:00:00:03')), Interface
('s1-eth4', 4, Switch('00:00:00:00:00:00:00:01')), c8b55359990f89a5849813dc348d30e9e1f991bad1dcb7f82112bd35429d9b07)

@italovalcy one thing that's a bit strange though is that it took almost 4 seconds for the first link down notification to show in the logs, unless the backend was really overloaded here, or maybe some messages might also got filtered out here in the collected logs, so if you could double check, let me know if you also have more logs. But, either way, on 2023.1+ this problem isn't expected to happen.

viniarck commented 8 months ago

Closing this since it won't be fixed on 2022.3.0. Upgrade to 2023.1+ if you end up on this issue here.