munin-monitoring / munin

Main repository for munin master / node / plugins
http://munin-monitoring.org
Other
1.98k stars 473 forks source link

munin: calculates multigraph field widths wrongly #394

Open ssm opened 9 years ago

ssm commented 9 years ago

Imported from https://bugs.debian.org/781790

Source: munin Version: 2.0.25-1~bpo70+1 Severity: normal

Hi,

So I finally got annoyed enough by this to dig into the source enough to make a reasonable bug report about it :) I've been writing a multigraph plugin for a new package, and I'd seen that sometimes it would make the field label far wider than the text was, crowding the Cur/Min/Avg/Max values hard to the right, in the worst cases even wrapping them to a new line, but just generally rendering them horribly.

I've now figured out where it's getting the wrong field width from at least, and there's probably a couple of easy fixes for that, but there might be something subtle I'm missing, so we should probably run this past upstream rather than me just attaching a naive patch for it ...

The (simplified) problem case looks something like this:

multigraph foo
foo_field.label Foo

multigraph foo.bar
bar_long_descriptive_field_name.label Bar

With the result being that when the top level 'foo' graph is generated, the width of the "Foo" label, rather than being 3, will instead be strlen("bar_long_descriptive_field_name") ...

The reason for this is that in the GraphOld.pm process_service() function we have:

$max_field_len = munin_get_max_label_length($service, \@field_order);

and @field_order for the foo graph also contains all the fields for the foo.bar subgraph (and any others it may also have).

The munin_get_max_label_length() function (in Utils.pm), in turn does this:

my $len = length (munin_get ($service->{$f}, "label") || $f);

and so when it processes the bar_long_descriptive_field_name field, the $service hash doesn't contain that member, $len instead becomes the length of the field name. Hilarity ensues.

As far as I can gather so far, the reason for the '|| $f' clause there is to support use of munin_get_max_label_length in munin_field_status() (also in Utils.pm) - so basically we have two (unintended?) side effects that collide to make this behave badly.

In process_service, the subgraph fields are skipped from further processing for the top level graph later in the loop, but the damage to $len is already done.

This could be fixed by either removing those fields before passing the array to munin_get_max_label_length, or by ignoring fields where $service->{$f} is undefined in that function when computing the length. (but I think the latter might break the munin_field_status() use). Or this might also be an indication of some other bug in the multigraph handling that they even in that array in the first place ...

That should hopefully be enough for someone familiar with how this was originally intended to work to see the right place to really fix this. AFAICS, the current upstream git HEAD doesn't look significantly different to the code in 2.0.25-1~bpo70+1 -- so hopefully any fix will also be backportable to that brance too :)

Cheers, Ron

sumpfralle commented 6 years ago

What a well-researched and detailed bug report!

Branch stable-2.0 still contains the problematic code. master is not affected.