While troubleshooting some inconsistencies reported for an EVC, I found a bug: when a Link Down affects both the current_path and the failover_path, the failover_path can be left in an inconsistent state. Please see more details below.
The symptom was that the external consistency-check script, which runs every 30 minutes, reported the following inconsistency:
After investigating what happened at this date/time, I found a link flap that indeed impacted that EVC:
Checking the previously saved state of this EVC, I found that current_path and failover_path share some switches/links:
So, in the case above, part of the failover path was probably removed while handling the redeploy of the current path, because the two paths share some switches. This ultimately left the failover path inconsistent, with some flows missing.
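To make the overlap concrete, here is a minimal sketch of how one could check which switches two paths have in common. It assumes each path is a list of links and each link is a pair of switch IDs; the function names and the sample topology are hypothetical and are not taken from mef_eline or from the affected EVC.

```python
def switches_in(path):
    """Collect the switch IDs touched by a path given as (switch_a, switch_b) links."""
    return {switch for link in path for switch in link}


def shared_switches(current_path, failover_path):
    """Return the switches that appear in both paths."""
    return switches_in(current_path) & switches_in(failover_path)


# Hypothetical topology: two disjoint-link paths between sw1 and sw3.
current = [("sw1", "sw2"), ("sw2", "sw3")]
failover = [("sw1", "sw4"), ("sw4", "sw3")]

overlap = shared_switches(current, failover)
print(sorted(overlap))
```

In this toy case the two paths share no links but still share both endpoint switches, so any flow cleanup on those switches that is not scoped to the current path could also touch failover flows.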
Checking the code, it seems that when an EVC is affected by a Link Down and its failover_path is also affected by the same Link Down, the EVC link down handler goes through the normal process. However, once the EVC is redeployed and the kytos/mef_eline.redeployed_link_down KytosEvent is generated, evc.try_setup_failover_path() returns early because the failover_path is still present: https://github.com/kytos-ng/mef_eline/blob/a47e289c1896b913bf524d4357efc64b94d93ff1/models/evc.py#L898-L902
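The early return described above can be illustrated with a toy model. This is only a sketch under assumptions: ToyEVC is a hypothetical stand-in, not the mef_eline EVC class, and it reproduces only the guard on an existing failover_path, not the rest of the real method.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEVC:
    """Hypothetical stand-in for an EVC; not the mef_eline model."""

    current_path: list = field(default_factory=list)
    failover_path: list = field(default_factory=list)

    def try_setup_failover_path(self) -> bool:
        """Return True if a failover path was (re)built, False if skipped."""
        # Assumed guard: a non-empty failover_path short-circuits the setup,
        # even if that path has lost flows or crosses a link that is down.
        if self.failover_path:
            return False
        self.failover_path = ["recomputed-link"]  # placeholder for a new path
        return True


# Failover path still recorded after the Link Down: setup is skipped.
stale = ToyEVC(current_path=["link-a"], failover_path=["link-b"])
stale_rebuilt = stale.try_setup_failover_path()

# Failover path cleared beforehand: setup runs and installs a new path.
cleared = ToyEVC(current_path=["link-a"], failover_path=[])
cleared_rebuilt = cleared.try_setup_failover_path()

print(stale_rebuilt, cleared_rebuilt)
```

In this model the stale failover_path is never refreshed, which matches the symptom of a failover path left behind with missing flows.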