openBackhaul / MicroWaveDeviceInventory

Physical and logical inventory of the MW SDN Domain
Apache License 2.0

MWDI sliding window priority function causes uneven distribution of requests, resulting in heavy load on Mediators #972

Open ssomasundara opened 2 months ago

ssomasundara commented 2 months ago

Problem Background: MWDI caches all devices in the list of connected devices, which feeds the sliding window. When a set of devices is disconnected during a maintenance window, the realignment process updates the list of connected devices and drops the disconnected ones. Once the maintenance window is over and all device connections are restored, MWDI learns the updated list of connected devices either through NP or during the next realignment process.

During that realignment, MWDI prioritizes the devices that were restored after maintenance, and these devices are placed in the sliding window with priority. As a result, thousands of requests are addressed to devices on the same Mediator, causing heavy load on that Mediator.

Secondly, MWDI does not change the priority sequence in subsequent realignment processes, so the same concentrated load hits the same Mediator every day.
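To make the load concentration concrete, here is a minimal sketch in Python. It is not MWDI code: the device names, the Mediator labels `M1` to `M3`, the window size, and the helpers `window_load` and `interleave_by_mediator` are all assumptions made for illustration. It compares a queue in which the restored devices are simply placed first with a queue that rotates across Mediators.

```python
from collections import Counter
from itertools import zip_longest


def window_load(queue, window_size):
    """Count devices per Mediator in the first window_size slots of the queue."""
    return Counter(mediator for _, mediator in queue[:window_size])


def interleave_by_mediator(queue):
    """Reorder the queue so consecutive slots rotate across Mediators.

    Hypothetical mitigation, not the current MWDI behaviour.
    """
    groups = {}
    for device in queue:
        groups.setdefault(device[1], []).append(device)
    result = []
    # Take one device per Mediator per round until all groups are empty.
    for row in zip_longest(*groups.values()):
        result.extend(d for d in row if d is not None)
    return result


# Hypothetical data: six restored devices behind Mediator "M1" are prioritized
# ahead of six devices behind "M2" and "M3".
restored = [(f"dev-r{i}", "M1") for i in range(6)]
others = [(f"dev-o{i}", m)
          for i, m in enumerate(["M2", "M3", "M2", "M3", "M2", "M3"])]
prioritized = restored + others

print(window_load(prioritized, 4))                          # all 4 slots on M1
print(window_load(interleave_by_mediator(prioritized), 4))  # spread over M1, M2, M3
```

With the prioritized order, every slot of a window of size 4 targets `M1`. With the interleaved order, the same window holds two `M1` devices and one each for `M2` and `M3`. Whether such interleaving fits the actual sliding window design is an open question for the maintainers.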

We expect the same problem once the Notification Proxy is enabled in production, because DCN fluctuations will cause connecting/disconnecting notifications to be received repeatedly.

Current workaround: We are forced to restart the MWDI application after every maintenance activity on the underlying components.

ssomasundara commented 2 months ago

Hi Katharina and Thorsten,

Today we discussed the issue in the OPS call.

A method of procedure shall be defined to build redundancy for the component under maintenance. We will prepare a design.

Open point: In case of an incident (Mediator crashed), MWDI will cause the same problem. We are not sure yet how to handle this.

Repeated notifications from DCN-flapping devices also need a resolution.
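One possible direction for the flapping case, offered only as a sketch and not as an agreed design, is to debounce connection-state notifications so that a change is accepted only after it has held for some time. The class `FlapDebouncer`, the `hold_seconds` parameter, and the state strings below are hypothetical names chosen for illustration. Timestamps are passed in explicitly to keep the example deterministic.

```python
class FlapDebouncer:
    """Accept a connection-state change only after it stayed stable for hold_seconds."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self._pending = {}    # device -> (state, time this state was first seen)
        self._confirmed = {}  # device -> last state that was passed on

    def notify(self, device, state, now):
        """Record a notification; a different state restarts the hold timer."""
        pending = self._pending.get(device)
        if pending is None or pending[0] != state:
            self._pending[device] = (state, now)

    def settled(self, now):
        """Return (device, state) changes that have held long enough."""
        changes = []
        for device, (state, since) in list(self._pending.items()):
            if now - since >= self.hold_seconds:
                del self._pending[device]
                # Only report a change if it differs from the confirmed state.
                if self._confirmed.get(device) != state:
                    self._confirmed[device] = state
                    changes.append((device, state))
        return changes


debouncer = FlapDebouncer(hold_seconds=30)
debouncer.notify("dev-1", "connected", now=0)
debouncer.notify("dev-1", "disconnected", now=5)
debouncer.notify("dev-1", "connected", now=10)
early = debouncer.settled(now=20)  # still flapping, nothing is passed on
late = debouncer.settled(now=40)   # stable since t=10, change is passed on
print(early, late)
```

In this sketch, three notifications within ten seconds lead to a single update of the connected-device list once the state has been stable for 30 seconds. The hold time is a trade-off: a longer value suppresses more flapping but delays legitimate reconnects.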