munin-monitoring / munin

Main repository for munin master / node / plugins
http://munin-monitoring.org

How to display and make a graph consisting of data from other graphs #1580

Open nonono147 opened 9 months ago

nonono147 commented 9 months ago

Hello,

I would like to create an "architecture" with one server that only handles the display, plus several collection servers that act as a bridge between the nodes and the display server. (800+ nodes)

(The reason for this architecture is performance with so many nodes.)

How can I do that? Is it possible? I heard about virtual nodes; could this do the trick?

Thanks ; ) Mathieu

niclan commented 9 months ago

The munin collection and web interface code is quite solidly made to run on one node. Also there is a hard requirement for data updates every 5 minutes - but spoolfetching might be a way around this.

At work I run a munin 2.999 git snapshot with ~200 munin-nodes to collect from. Its main problem is that the update process takes about 4 minutes and 30 seconds, so it only barely completes in time for the 5 minute deadline/collection cycle. As far as I recall munin 2.0 also needed about the same time.
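To put the numbers above in perspective for 800+ nodes, here is a rough back-of-the-envelope sketch. It assumes update time scales linearly with the number of nodes (an assumption, not something I have measured; parallel workers, slow plugins and network latency will all shift it), and the `headroom` factor and the function name are hypothetical, only there for illustration.

```python
import math

def masters_needed(total_nodes, measured_nodes=200, measured_seconds=270,
                   cycle_seconds=300, headroom=0.8):
    """Estimate how many independent munin masters would be needed.

    ASSUMPTION: update time grows linearly with node count, based on the
    single data point of ~200 nodes taking ~4 min 30 s (270 s).
    `headroom` is a hypothetical safety factor: the fraction of the
    5 minute cycle we are willing to spend on collection.
    """
    seconds_per_node = measured_seconds / measured_nodes
    budget = cycle_seconds * headroom
    nodes_per_master = math.floor(budget / seconds_per_node)
    masters = math.ceil(total_nodes / nodes_per_master)
    return masters, nodes_per_master

if __name__ == "__main__":
    masters, per_master = masters_needed(800)
    print(f"{masters} masters at up to {per_master} nodes each")
```

Under those assumptions a single master polling 800 nodes synchronously would not fit in one 5 minute cycle, which is why the options below are about either relaxing the deadline or splitting the load.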

Some possibilities:

Collection hierarchy: probably not

You might be able to build a collection hierarchy in various ways, but the limiting factor remains that all of the data needs to get back to the RRD files on one central server every 5 minutes. On the plus side, RRD files are very performant compared to influxdb and prometheus (I estimated that RRD was 10x to 100x faster in a quick experiment I ran quite a few years ago).

Spoolfetch: quite probably

If the munin nodes are set up for asynchronous spoolfetch of data, you should be able to run the munin-update process less often than every 5 minutes and still get all of the data collected. This gives delayed graphs, but the data will be there eventually. I've not set this up myself and I'm not even sure which version of munin you'll need to run.
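The idea behind spooling can be sketched in a few lines. This is only a conceptual model, not munin's actual implementation or wire protocol: the class and method names are hypothetical, and the point is just that the node keeps timestamped samples locally so a master that polls less often can still pick up every 5 minute data point.

```python
from collections import deque

class NodeSpool:
    """Hypothetical stand-in for a node-side spool of timestamped samples."""

    def __init__(self):
        self.samples = deque()

    def record(self, timestamp, value):
        # The node samples its plugins on its own schedule (every 300 s here).
        self.samples.append((timestamp, value))

    def fetch_since(self, last_seen):
        # The master asks for everything newer than what it already has.
        return [(t, v) for t, v in self.samples if t > last_seen]

if __name__ == "__main__":
    spool = NodeSpool()
    for i in range(6):                 # six samples, 5 minutes apart
        spool.record(i * 300, float(i))

    last_seen = 599                    # master already has samples up to t=300
    batch = spool.fetch_since(last_seen)
    print(len(batch), "samples fetched in one late poll")
```

The graphs lag behind by however long the master waits between polls, but no sample is lost as long as the spool keeps it around.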

Split collection and web interface on two servers: definite maybe

If you have a very good and performant network filesystem (NFS on a Netapp should do it; an NFS server on Linux probably will not) you might be able to do collection on one node and the web interface on another node. The web interface will have to be CGI, not cron driven. To make flushing of writes from rrdcached on the collection node available to the web node you'll have to use a network socket instead of a unix: socket.

The danger here is that NFS is not a great filesystem for databases, especially not across client nodes, which might lead to an incomplete or inoperative web interface from time to time. I don't know if this will be practical or not. If you run munin 2.999 you should put the database on a third server running mysql/mariadb instead of on sqlite in a local file.

But: Munin/rrd is easily capable of collecting data into many tens of thousands of rrd files, quite possibly hundreds of thousands. One plugin will produce at least one rrd file. Some plugins, like diskstats or mysql, will produce N rrd files per disk or database in a system.

Multiple independent munin servers: yes

The surest and easiest way to split this, if needed, is to have separate servers for separate parts of your infra.
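As a minimal sketch of what that split could look like in practice: group your node inventory by part of the infra and generate one munin.conf fragment per independent master. The inventory, the group names and the helper functions are hypothetical; the `[group;host]` section header and the `address` directive are the usual munin.conf host definition, but check the output against the munin.conf documentation for your version before using it.

```python
from collections import defaultdict

def split_by_group(inventory):
    """Group (group_name, hostname) pairs; one group per independent master."""
    groups = defaultdict(list)
    for group, host in inventory:
        groups[group].append(host)
    return dict(groups)

def render_munin_conf(group, hostnames):
    """Render host sections for one master's munin.conf fragment."""
    lines = []
    for host in sorted(hostnames):
        lines.append(f"[{group};{host}]")
        lines.append(f"    address {host}")
        lines.append("")
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical inventory; in practice this would come from your CMDB.
    inventory = [
        ("web", "web01.example.org"),
        ("db", "db01.example.org"),
        ("web", "web02.example.org"),
    ]
    for group, hosts in sorted(split_by_group(inventory).items()):
        print(f"# fragment for the '{group}' master")
        print(render_munin_conf(group, hosts))
```

Each master then only polls its own slice, so no single update run has to cover all 800+ nodes. The trade-off is that you get several web interfaces instead of one.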