Icinga / icinga2

The core of our monitoring platform with a powerful configuration language and REST API.
https://icinga.com/docs/icinga2/latest
GNU General Public License v2.0

icinga2 object list --type downtime returns 0 #8687

Closed btrnka63 closed 3 years ago

btrnka63 commented 3 years ago

Hello,

Situation: we would like to migrate the master to another location. To do so, we performed the necessary installation steps and connected the new node as a secondary to the existing master. As the next step, we would like to switch the secondary to be the master node. During the switch, however, we noticed an issue with downtime objects. Downtimes can be created manually via the web UI and are visible there, but the command "icinga2 object list --type downtime" returns an empty list.

Running an HTTP request against "https://localhost:5665/v1/objects/downtimes" returns the expected long list.

To Reproduce

With the above-mentioned setup (master with two nodes), we manually created a downtime using the web UI. Checking the output of the icinga2 CLI command "icinga2 object list --type downtime" returns "0". We also tested with the ScheduledDowntime feature:

  1. Create an apply ScheduledDowntime rule using the web UI and deploy the changes.
  2. Remove the above apply rule and deploy the change (in this step we can see the Downtime objects when running the icinga2 CLI command, even though the apply rule has been removed and no ScheduledDowntime object is present).
  3. Re-create the apply ScheduledDowntime rule via the web UI and deploy the changes (here no Downtime objects are visible using the icinga2 CLI).

Expected behavior

The icinga2 CLI and the API (HTTP request to port 5665) should provide the same output (content-wise). The icinga2 CLI should not return an empty list when a Downtime has been created (manually or using a ScheduledDowntime apply rule).

Wintermute2k6 commented 3 years ago

ref/NC/704757

julianbrost commented 3 years ago

Do the downtimes show up after you either reload Icinga 2 or run icinga2 daemon -C? icinga2 object list gets its information from a cache file (/var/cache/icinga2/icinga2.debug) that's only updated when one of these actions happens.

Querying the API will give you more up-to-date information as this just returns the information of the currently running process from memory.
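As a rough illustration of the API route described above, here is a minimal Python sketch that requests the downtimes endpoint quoted earlier in this thread and lists the object names. It assumes HTTP basic auth with an ApiUser; the host, credentials and CA file are placeholders supplied by the caller, and the function names are made up for this example.

```python
import base64
import json
import ssl
import urllib.request


def build_downtime_request(host, user, password, port=5665):
    """Build a GET request for the /v1/objects/downtimes endpoint."""
    url = f"https://{host}:{port}/v1/objects/downtimes"
    # Assumption: the ApiUser authenticates with HTTP basic auth.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        headers={"Accept": "application/json", "Authorization": f"Basic {token}"},
    )


def downtime_names(payload):
    """Extract object names from an already decoded API response body."""
    return [obj["name"] for obj in payload.get("results", [])]


def fetch_downtime_names(host, user, password, ca_file=None):
    """Query the running daemon and return the names of its downtimes."""
    # ca_file should point at the CA that signed the API certificate.
    context = ssl.create_default_context(cafile=ca_file)
    request = build_downtime_request(host, user, password)
    with urllib.request.urlopen(request, context=context) as response:
        return downtime_names(json.load(response))
```

Calling len(fetch_downtime_names(...)) gives a number that can be compared with what "icinga2 object list --type downtime" prints.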

btrnka63 commented 3 years ago

Hi,

No, the downtime objects do not appear after a reload, but they do after a restart (reload = after deployment; restart = after an icinga2 service restart). Thanks for the additional info; we will use the API to get the "active" object list.

(details clarified as part of the ref/NC/704757 ticket; closing)

btrnka63 commented 3 years ago

Hi,

I reopened the topic as it's still not clear why no objects are returned by "icinga2 object list --type downtime" while there are 100-150 thousand files within the "../_api/../conf.d/downtimes" directory.

If "icinga2 object list" reads "/var/cache/icinga2/icinga2.debug", why are the downtimes not updated there? What could be the reason?

I can list hosts, services, endpoints ... everything but downtimes.

Thanks for any advice!

julianbrost commented 3 years ago

When you run the command icinga2 daemon -C, it should output a line like information/ConfigItem: Instantiated 2 Downtimes.. What's the number you get there? Does that match what you expect?

If you get the correct number there, please also run grep -o '"type":"Downtime"' /var/cache/icinga2/icinga2.debug | wc -l to see if all of them were correctly written to the icinga2.debug file. The number you see there should be twice the number of downtimes as that string appears in each object twice.
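For anyone who prefers a script over the grep pipeline, the same count can be sketched in Python. This only mirrors the check described above: it counts occurrences of the marker string and halves the result, relying on the statement that the string appears twice per object. The function names are made up for this example, and the file is read into memory in one go, which may be heavy for a very large cache file.

```python
MARKER = b'"type":"Downtime"'


def count_marker(data, marker=MARKER):
    """Count non-overlapping occurrences of the marker in the raw bytes."""
    return data.count(marker)


def estimated_downtimes(data):
    """Halve the marker count, since each object is said to contain it twice."""
    return count_marker(data) // 2


def estimate_from_file(path="/var/cache/icinga2/icinga2.debug"):
    """Read the cache file named in this thread and estimate its downtimes."""
    with open(path, "rb") as handle:
        return estimated_downtimes(handle.read())
```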

If that number is not correct, what's the number that was logged on the last start? You should be able to find that line in journalctl -u icinga2.
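The "Instantiated ... Downtimes." line can also be pulled out of captured output programmatically, whether that output comes from icinga2 daemon -C or from the journal. A small sketch, assuming the line looks like the one quoted above; the optional singular form ("Downtime.") is a guess, and the function name is made up for this example.

```python
import re

# Matches the log line quoted in this thread; the singular form is an assumption.
INSTANTIATED = re.compile(r"information/ConfigItem: Instantiated (\d+) Downtimes?\.")


def instantiated_downtimes(log_text):
    """Return the count from the last matching line, or None if none is found."""
    matches = INSTANTIATED.findall(log_text)
    return int(matches[-1]) if matches else None
```

Taking the last match means that, for journal output spanning several starts, the number from the most recent start is reported.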

btrnka63 commented 3 years ago

Hi,

Thanks for your input. The issue was solved in the meantime during a support session with the developer. For some reason the "include.conf" file was missing within the "_api/<stage number>/" directory and the <stage number> had not been changed. We solved it by dropping the whole "_api" directory (all downtimes and comments were dropped as well; we recreated them from the history tables).
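
For readers who hit the same symptom, a quick way to spot a stage directory without "include.conf" might look like the sketch below. It assumes that every subdirectory of the "_api" package directory is a stage; the full path to that directory is not given in this thread, so the caller has to pass it in. The function name is made up for this example.

```python
from pathlib import Path


def stages_missing_include(package_dir):
    """List subdirectories of package_dir that have no include.conf file."""
    package = Path(package_dir)
    return sorted(
        entry.name
        for entry in package.iterdir()
        # Assumption: each subdirectory is a stage; plain files are skipped.
        if entry.is_dir() and not (entry / "include.conf").is_file()
    )
```

An empty list means every stage directory found has its "include.conf"; any name returned is worth inspecting before deleting anything.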