Open adamparker opened 4 weeks ago
Thanks for using Icinga DB and reporting your crash.
The issue seems to be rooted in your long argument key. The current schema defines: https://github.com/Icinga/icingadb/blob/27d27d45396a249cfb75fe1993d6360540e2cc0d/schema/pgsql/schema.sql#L838
However, your key exceeds 64 chars:
>>> len("java.class.that.was.used.as.an.argument.that.was.eighty.seven.characters.long.and.broke")
87
Edit: My brain shut down after reading the first parts of your java class path.. My char counting was not necessary. Sorry :)
Thanks for responding and highlighting the issue. Our brains all shut down reading Java classes because we're so conditioned to Java's verbosity.
What would be the suggested workaround for argumentless parameters that are over 64 chars long?
I feel my bash script approach is a bit flaky and was curious whether there is a better solution.
First, I will compare how the "old" Icinga 2 IDO handled this. Then I will try to find out why it was limited to 64 chars and look for any platform-specific limitations. If there are none, I will increase the column size. Please allow me some time to look at this.
In the meantime, I would suggest that you continue with your mitigation of the wrapper script. I would advise against manually altering the schema on your end, as this could have unintended side effects and break future schema upgrades.
I have integrated your configuration into my test setup and was immediately able to reproduce the Icinga DB crash. However, Icinga 2's IDO continued to work.
object CheckCommand "icingadb-i791" {
  import "plugin-check-command"
  command = [ "/bin/true" ]
  arguments = {
    "java.class.that.was.used.as.an.argument.that.was.eighty.seven.characters.long.and.broke" = {
      value = "F"
    }
  }
}
apply Service "crash icingadb i791" {
  import "generic-service"
  check_command = "icingadb-i791"
  assign where host.address
}
As I was unable to find a direct match for Icinga DB's eventcommand_argument table in the IDO schema, I just dumped the IDO database. As it turns out, the classpath-like argument appeared nowhere: it is not stored in the IDO database at all.
Then there was Icinga/icinga2#9887, which has some similarity to this issue. There, the argument length was made to fit the platform's limit, which I will look at next.
As always, the limits are platform- and architecture-specific. tl;dr: The reviewed operating systems have limits far beyond 64 characters.
Based on this very detailed Stack Exchange post, on a non-ancient Linux system, both ARG_MAX and MAX_ARG_STRLEN limit the parameter length for strings being passed to execve. It appears that there has been no change in the ten years between that post and today, resulting in a length of 131072 chars.
$ uname -srvm
Linux 6.10.4 #1-NixOS SMP PREEMPT_DYNAMIC Sun Aug 11 10:58:04 UTC 2024 x86_64
$ getconf ARG_MAX
2097152
$ echo $(($(getconf PAGE_SIZE) * 32))
131072
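The same two limits can also be read programmatically. The following is a minimal Python sketch, assuming a Linux system where the per-string limit MAX_ARG_STRLEN is 32 pages, as in the getconf calculation above. The function name and the SCHEMA_KEY_LIMIT constant are made up for illustration.

```python
import os

# Width of the argument_key column in the Icinga DB schema discussed here.
SCHEMA_KEY_LIMIT = 64


def linux_arg_limits():
    """Return (ARG_MAX, MAX_ARG_STRLEN) as seen by this process.

    Assumption: MAX_ARG_STRLEN is 32 pages, mirroring the
    `getconf PAGE_SIZE` calculation above (Linux-specific).
    """
    arg_max = os.sysconf("SC_ARG_MAX")
    max_arg_strlen = os.sysconf("SC_PAGE_SIZE") * 32
    return arg_max, max_arg_strlen


if __name__ == "__main__":
    arg_max, max_arg_strlen = linux_arg_limits()
    print(f"ARG_MAX={arg_max} MAX_ARG_STRLEN={max_arg_strlen}")
    print(f"schema limit: {SCHEMA_KEY_LIMIT} characters")
```

On the system shown above, this should report the same values as the two getconf calls, both several orders of magnitude larger than the 64 characters allowed by the schema.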
Similar limits are enforced on other Unix-like operating systems, as this document explains. For example, this is the limit I get on a current OpenBSD:
$ uname -srvm
OpenBSD 7.6 GENERIC#237 amd64
$ sysctl kern.argmax
kern.argmax=524288
While I am not very familiar with Windows, its documentation notes a limit of 8191 characters.
As written above, the Icinga DB schema limits eventcommand_argument.argument_key as varchar(64). In the Go code, the struct field is a string.
https://github.com/Icinga/icingadb/blob/27d27d45396a249cfb75fe1993d6360540e2cc0d/schema/pgsql/schema.sql#L944 https://github.com/Icinga/icingadb/blob/7c068d4adf169181b4d68dcd6222a18b6990b583/pkg/icingadb/v1/command.go#L21
It seems like there is no usage on the Go side, but it's available there mostly for Icinga DB Web.
However, it should be noted that there are multiple tables with an argument_key: checkcommand_argument (the table in question), eventcommand_argument and notificationcommand_argument. Then, there is also the argument_key_override column and even the {check,event,notification}command_envvar tables, which potentially have identical issues.
Going back to the first schema, introduced in 05d5e97dd5b73d0792e3416fd66b9257b752fd0c, this field has always been of type varchar(64).
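Until the column is widened, affected keys can be spotted before they reach the database. Here is a minimal Python sketch of such a check, assuming the arguments are available as a plain dict. The helper name find_overlong_keys and the dict layout are hypothetical and not part of Icinga 2 or Icinga DB; the "-cp" entry is only an illustrative short key.

```python
# Width of the argument_key column in the schema discussed here.
ARGUMENT_KEY_LIMIT = 64


def find_overlong_keys(arguments, limit=ARGUMENT_KEY_LIMIT):
    """Return the argument keys whose length exceeds `limit` characters."""
    return [key for key in arguments if len(key) > limit]


# The arguments from the reproduction above, modelled as a plain dict.
arguments = {
    "java.class.that.was.used.as.an.argument.that.was.eighty.seven.characters.long.and.broke": {
        "value": "F",
    },
    "-cp": {"value": "lib/*"},
}

if __name__ == "__main__":
    for key in find_overlong_keys(arguments):
        print(f"{len(key)} characters, limit is {ARGUMENT_KEY_LIMIT}: {key}")
```

The check counts characters rather than bytes, which matches how varchar(n) is measured in PostgreSQL and MySQL.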
As a first test, I have just altered the type from varchar(64) to text just for eventcommand_argument.argument_key, which resolved this specific error. I am going to create a PR addressing this and the similar issues outlined above.
What would be the suggested workaround for argumentless parameters that are over 64 chars long?
I just had another idea in this regard. Since the parameter is argument-less, you can invert the logic and make it key-less. To be precise, set skip_key for your CheckCommand argument and set the value accordingly. Something like:
object CheckCommand "icingadb-i791" {
  import "plugin-check-command"
  command = [ "/bin/true" ]
  arguments = {
    // "java.class.that.was.used.as.an.argument.that.was.eighty.seven.characters.long.and.broke" = {
    //   value = "oops."
    // }
    "(icingadb-i791)" = {
      skip_key = true
      value = "java.class.that.was.used.as.an.argument.that.was.eighty.seven.characters.long.and.broke"
    }
  }
}
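To illustrate why this works, here is a minimal Python sketch of how the command line changes with skip_key. It is a simplified model written for this thread, not Icinga 2's actual implementation, which also handles attributes such as order, set_if and repeat_key. The function render_command_line and the dict layout are hypothetical.

```python
LONG_NAME = (
    "java.class.that.was.used.as.an.argument.that.was."
    "eighty.seven.characters.long.and.broke"
)


def render_command_line(command, arguments):
    """Build an argv list from a command and its arguments.

    Simplified model: each argument contributes its key followed by its
    value, unless skip_key is set, in which case only the value is used.
    """
    argv = list(command)
    for key, spec in arguments.items():
        if not spec.get("skip_key", False):
            argv.append(key)
        argv.append(spec["value"])
    return argv


# Original definition: the long class name is the argument key.
original = {LONG_NAME: {"value": "F"}}

# Workaround: a short placeholder key, with the class name as the value.
workaround = {"(icingadb-i791)": {"skip_key": True, "value": LONG_NAME}}

if __name__ == "__main__":
    print(render_command_line(["/bin/true"], original))
    print(render_command_line(["/bin/true"], workaround))
```

With the workaround, the key stored in the database is the 15-character placeholder, while the long class name still ends up on the command line as a plain value.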
Hi,
I can confirm that the skip_key workaround works fine.
Thanks for your help.
Describe the bug
I am uncertain as to whether this is a bug or something that needs more warnings in Icinga.
I was running a check command with a java library that was loaded like this:
java -cp "lib/*" java.class.that.was.used.as.an.argument.that.was.eighty.seven.characters.long.and.broke ....omitted
The class parameter is not preceded by any flag/option/argument, so this may be part of the issue.
In the command template, I put this as an argument, which worked fine from the check's perspective.
However, Icinga DB didn't like it:
Expected behavior
A warning that argument keys have a limit, or this is a bug.
I have resolved this by using a bash script wrapper. What advice would you give for anyone else who has a similar issue?
Your Environment
Using the following docker images: "icinga/icingaweb2:2.12.1" "icinga/icinga2:2.14.2" "icinga/icingadb:1.2.0" "redis:7.2.5"
Additional context
I ran this in a docker environment with MySQL.