kytos-ng / mef_eline

Kytos NApp to create and manage point-to-point L2 circuits
https://kytos-ng.github.io/api/mef_eline.html
MIT License

EVC active status should start as unknown or none #341

Open italovalcy opened 1 year ago

italovalcy commented 1 year ago

Hi,

When we restart kytos, mef_eline loads all EVCs with active=False; only after a few consistency check cycles do the EVCs get the correct status (i.e., active=True, or they are redeployed). From a monitoring perspective, having EVCs report active=False right after a server restart is a problem, because it creates false positives in the monitoring tool: we receive alerts that EVCs are down when they actually are not.

We should discuss this further and assess the impact of having status=Unknown, or even status=None, at startup.

viniarck commented 1 year ago

@italovalcy, right, yes, let's evolve this.

Potentially, turning active: bool into active: Optional[bool] could solve it with just a minor breaking change in the response. The drawback is that if we later need to express any other logical state at the EVC level, we'll have to break it again. So this is a good opportunity to think about whether you or any other network operator would also like to have more states at the EVC top-level structure in the short/medium term. Then we can decide whether to go with active: Optional[bool] or make a slightly bigger breaking change and evolve it into status: str.
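
To make the two options concrete, here is a minimal sketch in plain Python. It is not mef_eline code: the class names, the `EVCStatus` values, and the `should_alert` helper are all hypothetical and only illustrate how a consumer such as a monitoring tool could tell "not yet verified" apart from "down".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EVCStatus(str, Enum):
    """Hypothetical top-level states; the names are illustrative only."""

    UNKNOWN = "UNKNOWN"
    UP = "UP"
    DOWN = "DOWN"


@dataclass
class EVCTriState:
    """Option 1: keep `active`, but let it be None until first verified."""

    name: str
    active: Optional[bool] = None  # None right after a restart


@dataclass
class EVCWithStatus:
    """Option 2: replace `active` with a string-valued `status`."""

    name: str
    status: EVCStatus = EVCStatus.UNKNOWN  # room for more states later


def should_alert(active: Optional[bool]) -> bool:
    """Alert only on an explicit False; None means 'not checked yet'."""
    return active is False
```

With option 1, a freshly loaded `EVCTriState("evc1")` has `active is None`, so `should_alert` stays quiet until a consistency check sets it to `False`. Option 2 costs a bigger response change but can grow new states without another break.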

I also thought of considering something similar to status & status_reason, but if we do NOT need to aggregate each path's status_reason at the EVC level, then maybe we can keep just a single attribute at the top level. (Although EVC is a subclass of GenericEntity, it differs from how Interface/Link/Switch are activated/shared, so we don't really need to follow their semantics closely for these two attributes, and we have more freedom here.) A good exercise is to check with you and the Ops team whether, given an interruption, you are all OK with how it's currently being exposed. I think it's very reasonable, and @Ktmi did a great job. It'll be at the path level (for static paths; dynamic paths won't be found, since pathfinder won't include interrupted links in the graph):

                "status": "DOWN",
                "status_reason": [
                    "maintenance"
                ]