centreon / centreon-archived

Centreon is a network, system and application monitoring tool. Centreon is the only AIOps Platform Providing Holistic Visibility to Complex IT Workflows from Cloud to Edge.
https://www.centreon.com
GNU General Public License v2.0

[2.8.2] graph issue (not bug, only bad configuration) #4860

Closed golgoth31 closed 7 years ago

golgoth31 commented 7 years ago

BUG REPORT INFORMATION

Centreon Web version: 2.8.2

Centreon Engine version:

Centreon Broker version:

OS:

Additional environment details (AWS, VirtualBox, physical, etc.):

Steps to reproduce the issue:

  1. Create a service based on the Linux diskIO template.
  2. Leave all parameters at their defaults.
  3. No graph is displayed.

Describe the results you received: graph error

Describe the results you expected: a working graph ....

Additional information you think important (e.g. issue happens only occasionally): (screenshot from 2016-12-29 16-59-35)

The centreon-plugin behind this check has been poorly designed: it returns as many metrics as it can, which is really bad practice. As a result, the plugin has become useless in its default configuration with the new graph library. It would be better to return fewer metrics by default and let users add the others themselves.

florian-asche commented 7 years ago

The graph only fails in mouseover mode. If you follow the link, it works. But that is not a solution yet. I think the default configuration should be changed to display only the disks that are specified in the configuration. A default disk could be "/".

masgo commented 7 years ago

This is not really a template problem; it is caused by the new graph style, which cannot deal with many metrics. The old one worked flawlessly: it could show the per-port traffic graph of a 52-port switch. The new one fails even with a 24-port switch.

florian-asche commented 7 years ago

And the new one is much slower :(

masgo commented 7 years ago

The speed seems to depend on the setup. If the slowest part is the network (e.g., a slow VPN over 3G), then the new one seems to be a little faster. But the new one definitely causes a much higher load on the browser itself.

lpinsivy commented 7 years ago

Yes, we blocked charts with many curves because of the high browser load. But normally a button in the "Performance" menu lets you display the graph if you want it.

The best way is to filter the needed values in the service template instead of collecting all the data.
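To illustrate the idea of keeping only the needed values, here is a minimal Python sketch that filters Nagios-style performance data by metric label. It is not how Centreon or the plugin does it; the function name `filter_perfdata` and the sample labels are hypothetical, and the simple split assumes labels contain no spaces.

```python
import re


def filter_perfdata(perfdata: str, pattern: str) -> str:
    """Keep only the metrics whose label matches the regular expression."""
    keep = re.compile(pattern)
    kept = []
    # Perfdata items are space-separated "label=value..." pairs.
    # Assumption: labels contain no spaces (quoted labels with spaces
    # would need a real parser).
    for item in perfdata.split():
        label = item.split("=", 1)[0].strip("'")
        if keep.search(label):
            kept.append(item)
    return " ".join(kept)


# Hypothetical plugin output with two disks; keep only the metrics for sda.
sample = "read_sda=120B/s write_sda=340B/s read_sdb=5B/s write_sdb=9B/s"
print(filter_perfdata(sample, r"_sda$"))
```

With such a filter, a service returns only the curves someone actually wants to graph.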

masgo commented 7 years ago

This is clearly a step backwards since it worked like a charm before.

For my example of monitoring a switch, all the data is important. If the switch has 52 ports (i.e., 48 ports + 4x SFP), then I want to know the state and traffic of all 52 ports. (Actually, with VLANs and LAGs there are even more values to monitor.) If some port becomes overloaded or causes unusually high traffic, then I want to know which one it is.

The same applies to per-subnet traffic in routers, to disk I/O when I have many disks, to temperature values when I have many sensors, etc. If the thing I am monitoring were small and consisted of only a few items, then I would not need a monitoring solution like Centreon.

What I could live with would be a top-k output. So, when I monitor a switch, I do not care about all ports, only about the high-traffic ports. But I cannot know beforehand which ports will have the high traffic.
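A top-k selection like this could be sketched in Python as follows. This is only an illustration under assumed data: the function name `top_k_ports`, the port names and the traffic values are hypothetical, and nothing here reflects an existing Centreon feature.

```python
import heapq


def top_k_ports(traffic, k):
    """Return the k (port, value) pairs with the highest traffic, highest first."""
    return heapq.nlargest(k, traffic.items(), key=lambda kv: kv[1])


# Hypothetical per-port traffic in bits per second.
traffic_bps = {"port1": 1200, "port2": 98000, "port3": 50, "port4": 45000}
print(top_k_ports(traffic_bps, 2))
```

Because the selection is recomputed from the measured values, the set of displayed ports follows whichever ports are busy in the chosen period.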

julienmathis commented 7 years ago

So having 104 lines on a graph is useful for you? It would be a step backward only if you could read the content of the graph clearly and efficiently... That's not the case.

We decided to stop graphing this kind of data. The other day I saw a graph with 900 metrics... It was overloading the servers...

So for me the only bug here is the command of the plugin.

masgo commented 7 years ago

Yes, it is useful. Here is an example of a traffic graph of a switch with 52 ports. (image: switch-traffic-example)

I just need one quick look to recognize that something happened on Thursday. (In this case it was a misconfiguration which led to traffic being sent over the wrong path.)

You can also easily see the maximum bandwidth of this period. You can see that the "normal" usage is very low (most connected devices are VoIP phones = low traffic). In other scenarios you could easily see if two ports have traffic spikes at the same time. etc. etc. - Do not underestimate the power of humans when it comes to pattern recognition.

Just take a look at the RRDtool example gallery to see graphs with many lines which are highly useful. https://oss.oetiker.ch/rrdtool/gallery/index.en.html

And since it is possible to filter out some graphs, I can then go ahead and focus on the ports which are of interest.

btw: how is this data stored? If I have a measurement every minute, then I have 60 * 24 * 7 (measurements per week) * 52 ports * 4 bytes (integer) = ~2 MiB of data. 2 MiB of data should not be too much to plot.
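The estimate can be checked with a few lines of Python. The 4-byte value size is the assumption made above, not a statement about how Centreon actually stores the data.

```python
MINUTES_PER_WEEK = 60 * 24 * 7   # one measurement per minute
PORTS = 52
BYTES_PER_VALUE = 4              # assumed 4-byte integer per measurement

total_bytes = MINUTES_PER_WEEK * PORTS * BYTES_PER_VALUE
total_mib = total_bytes / 2**20

print(total_bytes, round(total_mib, 2))  # prints: 2096640 2.0
```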

And since the image is only about 800 px wide, we can aggregate the data before plotting and end up with fewer points to plot.
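As a rough sketch of that aggregation, the following Python function averages a series into at most one value per pixel column. The function name `downsample` is hypothetical and averaging is just one possible choice; it is not how Centreon or RRDtool aggregates data.

```python
def downsample(values, width):
    """Reduce a series to at most `width` points by averaging equal-sized buckets."""
    n = len(values)
    if n <= width:
        return list(values)
    out = []
    for i in range(width):
        # Integer bucket bounds; every bucket holds at least one value when n > width.
        start = i * n // width
        end = (i + 1) * n // width
        bucket = values[start:end]
        out.append(sum(bucket) / len(bucket))
    return out


# One week of per-minute measurements for one port, reduced to 800 columns.
week = list(range(60 * 24 * 7))
reduced = downsample(week, 800)
print(len(week), "->", len(reduced))
```

Averaging smooths out short spikes; taking the maximum of each bucket instead would preserve them, which may matter when looking for overloaded ports.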

lpinsivy commented 7 years ago

We added an option to force the display of curves.