kytos-ng / mef_eline

Kytos NApp to create and manage point-to-point L2 circuits
https://kytos-ng.github.io/api/mef_eline.html
MIT License

Link maintenance should reflect on the service being deactivated if no alternative path is found #519

Open italovalcy opened 2 weeks ago

italovalcy commented 2 weeks ago

According to the Maintenance blueprint, when a Link is under maintenance and the impacted EVC has no alternative path, the service should be deactivated. This should be done not only from the control plane perspective but also from the data plane perspective. That is, once the maintenance window (MW) starts, we should remove the flows of each affected EVC for which no alternative path is found.

It seems that only the control plane part is being done now.
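
The expected behavior described above can be sketched roughly as follows. This is only an illustration, not the mef_eline implementation: the `Evc` class, the `on_maintenance_start` function, and the `find_alternative` callback are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Evc:
    """Hypothetical stand-in for an EVC; not the mef_eline class."""
    name: str
    current_path: list[str]          # link IDs the EVC currently traverses
    active: bool = True              # control plane state
    flows_installed: bool = True     # data plane state


def on_maintenance_start(
    evc: Evc,
    link_id: str,
    find_alternative: Callable[[Evc, str], Optional[list[str]]],
) -> str:
    """React to a maintenance window starting on link_id."""
    if link_id not in evc.current_path:
        return "unaffected"

    new_path = find_alternative(evc, link_id)
    if new_path:
        # An alternative exists: move the EVC away from the link.
        evc.current_path = new_path
        return "rerouted"

    # No alternative: deactivate on BOTH planes, so the reported
    # state and the actual forwarding state stay consistent.
    evc.active = False
    evc.flows_installed = False
    evc.current_path = []
    return "deactivated"
```

The point of the sketch is the last branch: `active` and `flows_installed` are changed together, instead of only the control plane flag.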

Cc'ing @RenataFrez and @jab1982 so they can also share their perspective as network operators.

RenataFrez commented 2 weeks ago

I have the same opinion as Italo. From the operations perspective, I see two significant issues with the current approach:

  1. If the data plane continues forwarding traffic through a link in maintenance mode, we lose the primary goal of this mode: avoiding unexpected impact on users. For example, we generally use Maintenance Mode when a link is degraded (flapping, losing packets) or when a planned activity is scheduled. In both scenarios, we want to minimize the user impact by removing the cause of the instability. If some EVCs remain on the affected link, users see their routing tables being recalculated, which could affect their traffic. In this case, it is better to have the link down than to have traffic switching from one side to the other.
  2. For the operator, seeing an EVC as Active = False while the data plane is still forwarding could lead to wrong assumptions when troubleshooting an issue.

jab1982 commented 2 weeks ago

If the data plane configuration is removed, then when it is time to end the maintenance we will have no way of testing the link to make sure it is error-free and ready for production. That was one of the reasons the process is the way it is today.

jab1982 commented 2 weeks ago

Maybe a new metadata attribute could be created to help. If the metadata "never_disable" is set, the EVC should never be removed. We could then use those EVCs for testing, for instance with BERToD.
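
As a rough sketch of how such a flag could be checked when a maintenance window starts: the function name `should_remove_flows` is hypothetical, and "never_disable" is only the key proposed in this comment, not an existing mef_eline option.

```python
def should_remove_flows(metadata: dict, has_alternative_path: bool) -> bool:
    """Decide whether an EVC's flows are removed at maintenance start.

    `metadata` is assumed to be the EVC's free-form metadata dict and
    "never_disable" the proposed (not yet existing) opt-out key.
    """
    if has_alternative_path:
        # The EVC can be rerouted, so nothing needs to be removed.
        return False
    # No alternative path: remove flows unless the EVC opted out.
    return not bool(metadata.get("never_disable", False))


# A test EVC (e.g. one used with BERToD) keeps its flows during the MW.
test_evc_metadata = {"never_disable": True}
# A regular customer EVC has no such key and is removed.
customer_evc_metadata = {}
```

With this check, only EVCs explicitly marked for testing stay on the link under maintenance; every other EVC without an alternative path has its flows removed.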