Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

Problem with daemonize (init scripts, -d) on Debian 8 / CentOS 6 / Ubuntu 14 / SLES 11 in 2.9 #6445

Closed. yoshi314 closed this issue 6 years ago

yoshi314 commented 6 years ago

I just upgraded from 2.8.4 to 2.9.0 on jessie. The majority of changes were in the features-available/*.conf files, where I had to remove the "library (...)" lines, as the update suggested. The rest of the config is unchanged.

Icingaweb cannot connect to the icinga2 instance. Icinga2 seems to run, but it does not execute any checks whatsoever.

This was tested on master and satellite.

dnsmichi commented 6 years ago

Thanks everyone for the rich details and motivating help.

peteeckel commented 6 years ago

I just updated both my test agent and, after that went smoothly (and being really courageous), my master server to the current snapshot release 2.9.0.9.g15a8f87 on CentOS 6.

For the time being, both are alive and doing fine. Thanks for your great work fixing this so quickly!

nbuchwitz commented 6 years ago

Just installed the fresh packages (icinga2-2.9.0.9.g15a8f87) from the build system. Everything seems to be working on CentOS 6. Thanks a lot!

matthewsaum commented 6 years ago

Installed the icinga2-2.9.0.9.g15a8f87 packages on my SLES 11 machine; looks like everything is good! Thanks for the quick fix.

chrostek commented 6 years ago

The Ubuntu build failed :-( so I cannot test yet.

dnsmichi commented 6 years ago

I see. That seems to stem from yesterday's release (all Debian-based packages fail now); we'll investigate the infrastructure.

stearz commented 6 years ago

Thanks for the quick fix. I can confirm that the fresh package solved the issue on Amazon Linux as well.

Icebird2000 commented 6 years ago

It now works on Gentoo as well.

But the client TLS handshake error still exists, as already noted.

Context: (0) Handling new API client connection

An in-place downgrade to icinga2 2.8.4 works as expected.

widhalmt commented 6 years ago

I was brave and updated to the latest packages.

My master is on CentOS 6. For now it's a single master system.

So far everything works as expected. I'll keep you updated if problems arise.

Al2Klimov commented 6 years ago

@widhalmt However, you don't need our snapshot packages if Icinga 2 runs either via systemd (with our service file) or in the foreground (e.g. in Docker).

BarbUk commented 6 years ago

Thanks for the fantastic work, guys! I did not test the Debian stretch snapshot build; it's still in a failed status.

But version 2.9.0-1.stretch has been working nicely since yesterday, after reloading the daemon.

dnsmichi commented 6 years ago

Snapshot packages depend on the unit tests running successfully. With the change to how the timer thread is initialized in this issue, the setup of those tests fails once it terminates (i.e. it lets us know that we didn't join the remaining thread; not really a problem, but it still stops the snapshot builds). We are investigating this in #6461. This only affects Debian/Ubuntu builds; for some reason CentOS/SLES "don't care".
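
To illustrate the "didn't join the remaining thread" situation in general terms, here is a minimal Python sketch. It is not Icinga 2's code (which is C++), and the `TimerThread` class and its method names are hypothetical. It only shows the pattern of starting a timer thread lazily and joining it on shutdown so that nothing is left running when a test or process terminates.

```python
import threading


class TimerThread:
    """Hypothetical periodic timer; an illustration, not Icinga 2's Timer."""

    def __init__(self, interval, callback):
        self._interval = interval
        self._callback = callback
        self._stop_event = threading.Event()
        self._thread = None

    def start(self):
        # Start lazily, so no thread exists before the caller is ready for it.
        if self._thread is None:
            self._thread = threading.Thread(target=self._run)
            self._thread.start()

    def _run(self):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop_event.wait(self._interval):
            self._callback()

    def stop(self):
        # Signal the loop to exit, then join so no thread outlives shutdown.
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join()
            self._thread = None

    def is_running(self):
        return self._thread is not None and self._thread.is_alive()
```

A test harness that checks for leftover threads at exit would be satisfied as long as `stop()` is called before teardown.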

@Icebird2000 I don't think that the connection problem is related to this issue; please create a new one and collect all the details (zones.conf, connection logs, etc.) there.

dnsmichi commented 6 years ago

We've revised the current quick fix; thanks to @Al2Klimov for the patch and @gunnarbeutner for the offline discussions. #6467 fixes the unit tests too.

I've triggered new snapshot builds for all distributions, you can watch their status here: https://build.icinga.com/view/Icinga%202/job/icinga2-snapshot/

Please test them once available and add your results here. We plan to release 2.9.1 soon after your tests.

chrostek commented 6 years ago

Snapshot 2.9.0+11.g95d46f5.2018.07.23+1.trusty-0 works as expected.

draeklae commented 6 years ago

Snapshot version 2.9.0.11.g95d46f5-0.2018.07.23+1.icinga is working like a charm on SLES 11.4.

peteeckel commented 6 years ago

2.9.0.11.g95d46f5-0.2018.07.23 on CentOS 6.10, Master and one Agent.

Runs without problems, just the startup log gets truncated again:

[2018-07-23 14:31:49 +0000] information/ApiListener: Adding new listener on port '5665'
[2018-07-23 14:31:49 +0000] information/ConfigItem: Committing config item(s).
[2018-07-23 14:31:49 +0000] information/ConfigItem: Committing config item(s).
[2018-07-23 14:31:49 +0000] information/ConfigItem: Committing config item(s).
[2018-07-23 14:31:49 +0000] information/ConfigItem: Committing config item(s).
[2018-07-23 14:31:49 +0000] information/GraphiteWriter: 'graphite' started.
[2018-07-23 14:31:49 +0000] information/ConfigItem: Instantiated 1 Downtime.
[2018-07-23 14:31:49 +0000] i

... that's all. It worked in the last snapshot; at least the last line was complete the two times I ventured a restart.

Update: It appears to have been pure coincidence. My 2.8.4 agent on CentOS 6.10 also truncates the end of the startup log:

[2018-07-10 17:59:34 +0000] information/ConfigItem: Instantiated 1 ApiListener.
[2018-07-10 17:59:34 +0000] information/ConfigItem: Instantiated 3 Zones.
[2018-07-10 17:59:34 +0000] information/ConfigItem: Instantiated 2 Endpoints.
[2018-0

Is this expected behaviour or should I open another issue?

vladimir-mencl-eresearch commented 6 years ago

Hi,

Cannot install the snapshot on Debian stretch: libicinga2 has not been rebuilt, and icinga-stretch-snapshots only has libicinga2 2.9.0.2018.07.18+1.stretch-0, but 2.9.0+11.g95d46f57c.2018.07.23+1.stretch-0 is needed.

But snapshots aside, when is this fix likely to be released in the main channel?

Cheers, Vlad

dnsmichi commented 6 years ago

@peteeckel I'd assume this is a different problem with flushing the log on restart, e.g. when file handles are closed prior to writing the entire line. Please create a follow-up issue.
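
As a generic illustration of why unflushed output can go missing (a Python sketch of ordinary user-space buffering, not a description of Icinga 2's logger, whose behaviour here is only assumed), the hypothetical helper `bytes_on_disk_before_close` below writes lines through a buffered handle and reports how much has actually reached the file before it is closed.

```python
import os
import tempfile


def bytes_on_disk_before_close(lines, flush_each_line):
    """Write lines through a buffered handle; return the file size before close."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        log = open(path, "wb", buffering=4096)
        try:
            for line in lines:
                log.write(line)
                if flush_each_line:
                    # Push the buffered bytes down to the file right away.
                    log.flush()
            # This is what would be on disk if the process stopped here
            # without flushing its buffer.
            return os.path.getsize(path)
        finally:
            log.close()
    finally:
        os.remove(path)
```

With a few short lines and no explicit flush, the data still sits in the process's buffer, so the reported size is smaller than what was written; flushing after each line makes the file complete at every step.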

@vladimir-mencl-eresearch Right, apologies. The build system sometimes produces weird integrity problems with aptly and naming schemes; I've poked @Crunsher to have a look.

In terms of 2.9.1, I am likely going to work on this after a chat with @lippserd today.

dnsmichi commented 6 years ago

Release is in progress. Waiting for the package builds.