Closed dirtyren closed 5 years ago
Can you increase the log level to get details on what exactly takes so long? With that number of services it usually shouldn't take more than a few seconds.
First of all, tks @sni for the quick response, and sorry that it took me so long to get back, but I found out what was happening. This particular installation, the one showing this long cache load time on LMD, has used nagios since version 1.0b6, so you can imagine how many comments there were in the retention file. Check out the log:
[2019-01-17 09:49:45][Info][peer.go:222] [Naemon] starting connection
[2019-01-17 09:50:45][Debug][peer.go:1077] [Naemon] got status answer: size: 0 kB
[2019-01-17 09:50:45][Debug][peer.go:1397] [Naemon] fetched 1 initial status objects
[2019-01-17 09:50:45][Debug][peer.go:1516] [Naemon] remote connection Naemon flag set
[2019-01-17 09:50:45][Debug][peer.go:1077] [Naemon] got timeperiods answer: size: 5 kB
[2019-01-17 09:50:45][Debug][peer.go:1397] [Naemon] fetched 61 initial timeperiods objects
[2019-01-17 09:50:45][Debug][peer.go:1077] [Naemon] got contacts answer: size: 30 kB
[2019-01-17 09:50:45][Debug][peer.go:1397] [Naemon] fetched 461 initial contacts objects
[2019-01-17 09:50:45][Debug][peer.go:1077] [Naemon] got contactgroups answer: size: 6 kB
[2019-01-17 09:50:45][Debug][peer.go:1397] [Naemon] fetched 18 initial contactgroups objects
[2019-01-17 09:50:45][Debug][peer.go:1077] [Naemon] got commands answer: size: 70 kB
[2019-01-17 09:50:45][Debug][peer.go:1397] [Naemon] fetched 541 initial commands objects
[2019-01-17 09:50:57][Debug][peer.go:1077] [Naemon] got hosts answer: size: 5107 kB
[2019-01-17 09:50:57][Debug][peer.go:1397] [Naemon] fetched 1326 initial hosts objects
[2019-01-17 09:50:57][Debug][peer.go:1077] [Naemon] got hostgroups answer: size: 11 kB
[2019-01-17 09:50:57][Debug][peer.go:1397] [Naemon] fetched 14 initial hostgroups objects
[2019-01-17 09:55:42][Debug][peer.go:1077] [Naemon] got services answer: size: 112547 kB
[2019-01-17 09:55:45][Debug][peer.go:1397] [Naemon] fetched 31287 initial services objects
[2019-01-17 09:55:45][Debug][peer.go:1077] [Naemon] got servicegroups answer: size: 3 kB
[2019-01-17 09:55:45][Debug][peer.go:1397] [Naemon] fetched 8 initial servicegroups objects
[2019-01-17 09:56:12][Debug][peer.go:1077] [Naemon] got comments answer: size: 200058 kB
It took 5min to retrieve the services and around a minute for the comments. Once I removed all comments from the retention file and restarted naemon, LMD created the cache in 10s instead of 8min. Any idea why this would take so long with comments? Tks.
Alessandro.
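To see at a glance which table LMD was waiting on, the gap before each "got ... answer" line can be computed from the timestamps. The following is only a rough sketch: it assumes the debug log format shown above, the function name `table_wait_times` is made up for illustration, and sub-second timestamps (as printed by newer LMD versions) are truncated to whole seconds.

```python
import re
from datetime import datetime

# Matches "got <table> answer" debug lines in the format shown in the log above.
ANSWER_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.\d+)?\]"
    r"\[\w+\]\[[^\]]+\] \[[^\]]+\] got (\w+) answer: size: (\d+) kB"
)
# Matches the leading timestamp of any log line (fractional seconds ignored).
ANY_TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def table_wait_times(log_text):
    """Return (table, seconds since previous log line, size in kB) per answer line."""
    results = []
    previous = None
    for line in log_text.splitlines():
        line = line.strip()
        ts_match = ANY_TS_RE.match(line)
        if not ts_match:
            continue  # skip anything that is not a timestamped log line
        now = datetime.strptime(ts_match.group(1), "%Y-%m-%d %H:%M:%S")
        answer = ANSWER_RE.search(line)
        if answer and previous is not None:
            waited = (now - previous).total_seconds()
            results.append((answer.group(2), waited, int(answer.group(3))))
        previous = now
    return results


# A few lines copied from the log above.
SAMPLE = """\
[2019-01-17 09:50:57][Debug][peer.go:1397] [Naemon] fetched 14 initial hostgroups objects
[2019-01-17 09:55:42][Debug][peer.go:1077] [Naemon] got services answer: size: 112547 kB
[2019-01-17 09:55:45][Debug][peer.go:1397] [Naemon] fetched 31287 initial services objects
[2019-01-17 09:55:45][Debug][peer.go:1077] [Naemon] got servicegroups answer: size: 3 kB
[2019-01-17 09:55:45][Debug][peer.go:1397] [Naemon] fetched 8 initial servicegroups objects
[2019-01-17 09:56:12][Debug][peer.go:1077] [Naemon] got comments answer: size: 200058 kB
"""

for table, waited, size_kb in table_wait_times(SAMPLE):
    print(f"{table:15s} waited {waited:6.0f}s for {size_kb} kB")
```

On these sample lines the services answer arrives 285 seconds after the previous log entry and the comments answer 27 seconds after, which points at the wait for the response rather than the parsing step that follows it.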
How many comments have there been? I am also a bit concerned about the amount of data. 200MB of comment data sounds quite impressive, as does over 100MB of services data. Can you tell if that is mostly because of large plugin output, or maybe because of a large number of contacts? How much of that time is used for receiving the actual data? Anyway, it shouldn't take longer than a few seconds.
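One way to check how much time goes into receiving the raw data is to send the same kind of query to livestatus directly, bypassing LMD, and time it. This is only a sketch under assumptions: the host name and port in the usage note are placeholders for wherever xinetd exposes livestatus, the helper names are invented here, and the query LMD actually sends (its column list in particular) may differ.

```python
import socket
import time


def build_query(table, columns=None):
    """Build a Livestatus query; an empty line terminates the request."""
    lines = ["GET " + table]
    if columns:
        lines.append("Columns: " + " ".join(columns))
    lines.append("OutputFormat: json")
    return "\n".join(lines) + "\n\n"


def timed_fetch(sock, query):
    """Send the query on a connected socket and read until the peer closes.

    Returns (bytes received, elapsed seconds). No KeepAlive header is sent,
    so the server is expected to close the connection after the response.
    """
    start = time.monotonic()
    sock.sendall(query.encode("utf-8"))
    sock.shutdown(socket.SHUT_WR)  # tell the server the request is complete
    total = 0
    while True:
        chunk = sock.recv(65536)
        if not chunk:
            break
        total += len(chunk)
    return total, time.monotonic() - start
```

Usage would look something like `sock = socket.create_connection(("naemon.example.org", 6557))` followed by `timed_fetch(sock, build_query("comments"))`, with both the host and the port being assumptions. If this raw fetch is fast while LMD is slow, the time is being spent on the LMD side.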
Here are the stats for the setup that took 8m to load in LMD. I just removed the comments from the retention file and restarted naemon to bring the time down to 10s.
24817 hostcomments
76948 servicecomments

Checking objects...
Checked 31404 services.
Checked 1331 hosts.
Checked 463 contacts.
Checked 14 host groups.
Checked 8 service groups.
Checked 18 contact groups.
Checked 543 commands.
Checked 61 time periods.
Checked 0 host escalations.
Checked 0 service escalations.

Checking for circular paths...
Checked 1331 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 61 timeperiods
I just tried adding about 100k comments to a test instance, but LMD starts and synchronizes them in less than 1 second. Is this still an issue? Could you please try the latest master?
I just tested it and the latest version is working all right. Below are the logs comparing version 1.3.3 with 1.5.0.
lmd 1.3.3 82986 comments - 9m39.675114476s
[2019-05-06 10:47:28][Debug][peer.go:1391] [OpMon] fetched 1 initial status objects
[2019-05-06 10:47:28][Debug][peer.go:1508] [OpMon] remote connection Naemon flag set
[2019-05-06 10:47:28][Debug][peer.go:1071] [OpMon] got timeperiods answer: size: 5 kB
[2019-05-06 10:47:28][Debug][peer.go:1391] [OpMon] fetched 61 initial timeperiods objects
[2019-05-06 10:47:28][Debug][peer.go:1071] [OpMon] got contacts answer: size: 30 kB
[2019-05-06 10:47:28][Debug][peer.go:1391] [OpMon] fetched 470 initial contacts objects
[2019-05-06 10:47:28][Debug][peer.go:1071] [OpMon] got contactgroups answer: size: 6 kB
[2019-05-06 10:47:28][Debug][peer.go:1391] [OpMon] fetched 18 initial contactgroups objects
[2019-05-06 10:47:28][Debug][peer.go:1071] [OpMon] got commands answer: size: 71 kB
[2019-05-06 10:47:28][Debug][peer.go:1391] [OpMon] fetched 546 initial commands objects
[2019-05-06 10:47:50][Debug][peer.go:1071] [OpMon] got hosts answer: size: 5136 kB
[2019-05-06 10:47:50][Debug][peer.go:1391] [OpMon] fetched 1343 initial hosts objects
[2019-05-06 10:47:50][Debug][peer.go:1071] [OpMon] got hostgroups answer: size: 11 kB
[2019-05-06 10:47:50][Debug][peer.go:1391] [OpMon] fetched 13 initial hostgroups objects
[2019-05-06 10:56:17][Debug][peer.go:1071] [OpMon] got services answer: size: 117993 kB
[2019-05-06 10:56:21][Debug][peer.go:1391] [OpMon] fetched 31793 initial services objects
[2019-05-06 10:56:21][Debug][peer.go:1071] [OpMon] got servicegroups answer: size: 3 kB
[2019-05-06 10:56:21][Debug][peer.go:1391] [OpMon] fetched 8 initial servicegroups objects
[2019-05-06 10:56:55][Debug][peer.go:1071] [OpMon] got comments answer: size: 297716 kB
[2019-05-06 10:57:08][Debug][peer.go:1391] [OpMon] fetched 53473 initial comments objects
[2019-05-06 10:57:08][Debug][peer.go:1071] [OpMon] got downtimes answer: size: 0 kB
[2019-05-06 10:57:08][Debug][peer.go:1391] [OpMon] fetched 0 initial downtimes objects
[2019-05-06 10:57:08][Info][peer.go:589] [OpMon] objects created in: 9m39.675114476s
lmd 1.5.0 82986 comments - 19.433867867s
[2019-05-06 10:57:44.172][Debug][peer.go:1231] [OpMon] got contacts answer: size: 30 kB
[2019-05-06 10:57:44.174][Debug][peer.go:1640] [OpMon] fetched 470 initial contacts objects
[2019-05-06 10:57:44.177][Debug][peer.go:1231] [OpMon] got contactgroups answer: size: 6 kB
[2019-05-06 10:57:44.178][Debug][peer.go:1640] [OpMon] fetched 18 initial contactgroups objects
[2019-05-06 10:57:44.181][Debug][peer.go:1231] [OpMon] got commands answer: size: 71 kB
[2019-05-06 10:57:44.183][Debug][peer.go:1640] [OpMon] fetched 546 initial commands objects
[2019-05-06 10:57:44.823][Debug][peer.go:1231] [OpMon] got hosts answer: size: 5687 kB
[2019-05-06 10:57:45.051][Debug][peer.go:1640] [OpMon] fetched 1343 initial hosts objects
[2019-05-06 10:57:45.068][Debug][peer.go:1231] [OpMon] got hostgroups answer: size: 11 kB
[2019-05-06 10:57:45.069][Debug][peer.go:1640] [OpMon] fetched 13 initial hostgroups objects
[2019-05-06 10:57:57.758][Debug][peer.go:1231] [OpMon] got services answer: size: 117358 kB
[2019-05-06 10:58:02.078][Debug][peer.go:1640] [OpMon] fetched 31793 initial services objects
[2019-05-06 10:58:02.287][Debug][peer.go:1231] [OpMon] got servicegroups answer: size: 3 kB
[2019-05-06 10:58:02.288][Debug][peer.go:1640] [OpMon] fetched 8 initial servicegroups objects
[2019-05-06 10:58:02.732][Debug][peer.go:1231] [OpMon] got comments answer: size: 12484 kB
[2019-05-06 10:58:03.075][Debug][peer.go:1640] [OpMon] fetched 88359 initial comments objects
[2019-05-06 10:58:03.490][Debug][peer.go:1231] [OpMon] got downtimes answer: size: 0 kB
[2019-05-06 10:58:03.490][Debug][peer.go:1640] [OpMon] fetched 0 initial downtimes objects
[2019-05-06 10:58:03.596][Info][peer.go:721] [OpMon] objects created in: 19.433867867s
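To put the two runs side by side, the "objects created in" durations and the comments answer sizes reported above can be compared directly. This is a small sketch: `go_duration_seconds` is a helper invented here and handles only the hour, minute and second units seen in these logs.

```python
import re


def go_duration_seconds(text):
    """Convert a Go-style duration such as '9m39.675114476s' to seconds.

    Only h, m and s units are supported; anything else raises ValueError.
    """
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?", text)
    if not text or not match:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


# Values copied from the two logs above.
old_total = go_duration_seconds("9m39.675114476s")  # lmd 1.3.3
new_total = go_duration_seconds("19.433867867s")    # lmd 1.5.0
old_comments_kb = 297716
new_comments_kb = 12484

print(f"startup: {old_total:.1f}s vs {new_total:.1f}s "
      f"(ratio {old_total / new_total:.1f})")
print(f"comments answer: {old_comments_kb} kB vs {new_comments_kb} kB "
      f"(ratio {old_comments_kb / new_comments_kb:.1f})")
```

Both ratios come out well above 20, so the shorter startup goes hand in hand with a much smaller comments response, although the logs alone do not show what changed between the two versions.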
great :-)
Hey,
I have a naemon 1.0.8 + livestatus running with 1500 hosts and 32k services. Livestatus is fast, the interface via livestatus is very fast, but LMD takes 8m to read all objects on startup. Both the interface and LMD are using livestatus via xinetd. Any tips on how I could debug this or improve its performance?
Tks.