kytos-ng / mef_eline

Kytos NApp to create and manage point-to-point L2 circuits
https://kytos-ng.github.io/api/mef_eline.html
MIT License

watchdog: checking for flows of an EVC being updated more frequently than the EVC is redeployed #506

Open italovalcy opened 2 months ago

italovalcy commented 2 months ago

There are scenarios where the flows of an EVC keep being updated, removed, and reinserted on each execution of flow_manager's consistency check. I understand that one has to double-check why those flows are actually being considered inconsistent and fix the source of the problem. However, other cases can happen too, and the flows eventually get reinstalled again. The point is: for the network operator, it is important to see that an EVC is being updated more often than it should be, and then take proper action (including opening issues for further analysis of the root cause).

You are probably interested in steps to reproduce or in more details about the specific failure. However, please just consider that, for some reason, a flow could be reinstalled, removed, or modified by flow_manager, and the EVC won't work from time to time. Ideally, this situation should raise a flag in the EVC web UI to say that the EVC is acting weird.

I added "watchdog" to the subject of the issue because maybe we can delegate those checks to another NApp (a watchdog NApp?).
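
To make the idea concrete, here is a minimal sketch of what such a check could look like: a sliding-window counter that flags an EVC once its flows have been rewritten more than a given number of times within a time window. Everything here is hypothetical (the `FlowChurnWatchdog` class, its `record_update` method, and the default thresholds); none of it exists in mef_eline or flow_manager, and how the updates would actually be observed is left open.

```python
from collections import defaultdict, deque
import time


class FlowChurnWatchdog:
    """Hypothetical helper: flag EVCs whose flows change too often."""

    def __init__(self, max_updates=3, window_seconds=300.0):
        # Assumed thresholds, to be tuned by the operator.
        self.max_updates = max_updates
        self.window_seconds = window_seconds
        # evc_id -> timestamps of recent flow updates, oldest first
        self._updates = defaultdict(deque)

    def record_update(self, evc_id, timestamp=None):
        """Record one flow update; return True if the EVC should be flagged."""
        now = time.monotonic() if timestamp is None else timestamp
        events = self._updates[evc_id]
        events.append(now)
        # Drop updates that fell out of the sliding window.
        cutoff = now - self.window_seconds
        while events and events[0] <= cutoff:
            events.popleft()
        return len(events) > self.max_updates
```

A caller would invoke `record_update(evc_id)` every time flows of that EVC are installed, removed, or modified outside of a regular redeploy, and surface a flag (for example in the web UI) when it returns `True`.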

viniarck commented 2 months ago

We could consider publishing events for missing and alien flows, and then NApps could subscribe and derive which EVCs are being impacted at that time. Another option would be to consider less frequent data plane checks with sdntrace, for which we used to have some automation but then decided not to keep. There's definitely a way to provide what you're looking for, Italo; we'll need to discuss and refine the possibilities and the approach we'll go for.
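
As a rough illustration of the subscriber side, the sketch below groups the missing and alien flows from one event into a per-EVC count. The payload shape (a dict with `missing` and `alien` lists of flows, each carrying a `cookie`) is hypothetical, since no such event is defined in this thread. The cookie layout (a one-byte prefix followed by a 56-bit EVC ID) is also an assumption made for illustration and should be checked against the actual mef_eline code.

```python
from collections import Counter

# Assumed cookie layout: the top byte is a NApp prefix and the low
# 56 bits carry the EVC ID. Not confirmed by this thread.
COOKIE_ID_MASK = (1 << 56) - 1


def evc_id_from_cookie(cookie):
    """Return the assumed EVC ID (14 hex digits) embedded in a flow cookie."""
    return format(cookie & COOKIE_ID_MASK, "014x")


def impacted_evcs(event_content):
    """Count inconsistent flows per EVC in one hypothetical event payload."""
    counts = Counter()
    for kind in ("missing", "alien"):
        for flow in event_content.get(kind, []):
            counts[evc_id_from_cookie(flow["cookie"])] += 1
    return counts
```

A subscribing NApp could feed these counts into whatever threshold or alerting logic it uses to decide that an EVC is being touched too often.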