Icinga / icingaweb2

A lightweight and extensible web interface to keep an eye on your environment. Analyse problems and act on them.
https://icinga.com/get-started/
GNU General Public License v2.0

Service display by service_severity doesn't take care of dependencies #3355

Open yasa1987 opened 6 years ago

yasa1987 commented 6 years ago

Hello, I noticed that dependencies are not taken into account when services are displayed in the dashboard, even if "sort=service_severity" is in the URL.

Services are also not displayed as "handled".

I'm monitoring a network router chassis that has line cards (iom) into which we plug modules (mda) that contain the physical ports. The dependencies between services are configured as follows:

In short: port depends on mda, which depends on iom.

In my scenario, I pulled out the line card (iom) to simulate a failure. The check scripts for the attached modules (mda) and ports then return a CRITICAL state, as they can only work if the line card is present.

Expected Behavior

When the line card (iom) service is CRITICAL, the attached module (mda) services should be HANDLED if they are also in a CRITICAL state, and the port services should also be HANDLED if they are CRITICAL.

The line card (iom) service should be at the top of the list, as other services depend on it.
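To make the expected behavior concrete, here is a minimal Python sketch. It is not how Icinga Web 2 works internally. The `Service` class, the `parent` field, the service names, and both helper functions are hypothetical and only model the port -> mda -> iom chain described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

OK = 0
CRITICAL = 2


@dataclass
class Service:
    name: str
    state: int
    parent: Optional[str] = None  # hypothetical: name of the service this one depends on


def is_handled_by_dependency(svc: Service, services: Dict[str, Service]) -> bool:
    """Return True if any ancestor in the dependency chain is in a problem state."""
    seen = set()
    parent = svc.parent
    while parent is not None and parent not in seen:
        seen.add(parent)  # guard against accidental dependency loops
        ancestor = services[parent]
        if ancestor.state != OK:
            return True
        parent = ancestor.parent
    return False


def problem_list(services: Dict[str, Service]) -> List[Service]:
    """List problem services: unhandled first, then by state, then by name."""
    problems = [s for s in services.values() if s.state != OK]
    return sorted(
        problems,
        key=lambda s: (is_handled_by_dependency(s, services), -s.state, s.name),
    )


services = {
    s.name: s
    for s in [
        Service("port-1/1/1", CRITICAL, parent="mda-1/1"),
        Service("mda-1/1", CRITICAL, parent="iom-1"),
        Service("iom-1", CRITICAL),
    ]
}
ordered = [s.name for s in problem_list(services)]
```

With the line card down, `ordered` starts with `iom-1`, and both `mda-1/1` and `port-1/1/1` count as handled because an ancestor is already CRITICAL.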

Current Behavior

The module (mda) services that depend on the CRITICAL line card are not marked as handled. The port services that depend on the CRITICAL module (mda) services are not marked as handled.

The root cause of the issue (CRITICAL iom service) is not at the top of the service problem list.

The strange thing is that the icingacli monitoring list services command puts the "iom" CRITICAL service at the top of the list.

Context

My router chassis (Nokia 7750 SR-7) is composed of line cards (iom) that contain modules (mda), which in turn contain ports. Each of these elements is monitored by its own service. If a module (mda) fails, all of its ports will be in a CRITICAL state, but only because of the mda failure. The mda failure must be at the top of the problem list. The same applies to the line card (iom) with respect to the mda.

Service-to-service dependencies should do the job (I suppose they are meant for that, right?).

Your Environment

dnsmichi commented 6 years ago

Dependency calculation isn't possible as the IDO backend doesn't provide all the dependencies. The is_reachable attribute might be a possible candidate, although I'm not sure how and with what weight it should influence severity state calculation.
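As a rough illustration of the open question about weighting, the following Python sketch shows one possible way a reachability flag could feed into a severity score. The weights and the `severity` function are assumptions made up for this example and do not reflect the actual calculation in Icinga Web 2 or the IDO backend. Only the state codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the usual plugin convention.

```python
# Hypothetical base weights per state code; higher sorts first.
STATE_WEIGHT = {0: 0, 1: 16, 3: 32, 2: 64}

# Hypothetical bonus for problems that still need attention.
UNHANDLED_BONUS = 8


def severity(state: int, is_handled: bool, is_reachable: bool) -> int:
    """Score a service; an unreachable service is treated like a handled one."""
    score = STATE_WEIGHT[state]
    needs_attention = state != 0 and not is_handled and is_reachable
    if needs_attention:
        score += UNHANDLED_BONUS
    return score


root_cause = severity(2, is_handled=False, is_reachable=True)
dependent = severity(2, is_handled=False, is_reachable=False)
acknowledged = severity(2, is_handled=True, is_reachable=True)
```

Under these assumed weights, a reachable CRITICAL service outranks an unreachable one, and an unreachable service scores the same as an acknowledged one. Whether unreachable problems should instead rank below all reachable problems is exactly the design choice left open here.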

lippserd commented 6 years ago

Hi,

Thanks for the report. We'll evaluate whether it's a good idea to take is_reachable into account for the severity calculation.

Cheers, Eric