opinkerfi / adagios

Adagios - Web Based Nagios Configuration
GNU Affero General Public License v3.0

BUG Report: Downtime and Comments not able to parse livestatus output #643

Open tjyang opened 6 years ago

tjyang commented 6 years ago
  File "/usr/lib/python2.7/site-packages/adagios/views.py", line 43, in wrapper
    result = view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/adagios/status/views.py", line 971, in downtime_list
    c['downtimes'] = l.query('GET downtimes', *args)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/multisite.py", line 80, in query
    query_result = backend_instance.query(query, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/livestatus.py", line 996, in query
    raise InvalidResponseFromLivestatus(query=livestatus_query, response=response_data)
InvalidResponseFromLivestatus: Could not parse response from livestatus.
Query:GET downtimes
ResponseHeader: fixed16
OutputFormat: python
ColumnHeaders: on

Response: [[u"author",u"comment",u"duration",u"end_time",u"entry_time",u"fixed",u"host_accept_passive_checks",u"host_acknowledged",u"host_acknowledgement_type",u"host_action_url",u"host_action_url_expanded",u"host_active_checks_enabled",u"host_address",u"host_alias",u"host_check_command",u"host_check_command_expanded",u"host_check_flapping_recovery_notification",u"host_check_freshness",u"host_check_interval",u"host_check_options",u"host_check_period",u"host_check_type",u"host_checks_enabled",u"host_childs",u"host_comments",u"host_comments_with_extra_info",u"host_comments_with_info",u"host_contact_groups",u"host_contacts",u"host_current_attempt",u"host_current_notification_number",u"host_custom_variable_names",u"host_custom_variable_values",u"host_custom_variables",u"host_display_name",u"host_downtimes",u"host_downtimes_with_info",u"host_event_handler",u"host_event_handler_enabled",u"host_execution_time",u"host_filename",u"host_first_notification_delay",u"host_flap_detection_enabled",u"host_groups",u"host_hard_state",u"host_has_been_checked",u"host_high_flap_threshold",u"host_icon_image",u"host_icon_image_alt",u"host_icon_image_expanded",u"host_in_check_period",u"host_in_notification_period",u"host_in_service_period",u"host_initial_state",u"host_is_executing",u"host_is_flapping",u"host_last_check",u"host_last_hard_state",u"host_last_hard_state_change",u"host_last_notification",u"host_last_state",u"host_last_state_change",u"host_last_time_down",u"host_last_time_unreachable",u"host_last_time_up",u"host_latency",u"host_long_plugin_output",u"host_low_flap_threshold",u"host_max_check_attempts",u"host_metrics",u"host_mk_inventory",u"host_mk_inventory_gz",u"host_mk_inventory_last",u"host_modified_attributes",u"host_modified_attributes_list",u"host_name",u"host_next_check",u"host_next_notification",u"host_no_more_notifications",u"host_notes",u"host_notes_expanded",u"host_notes_url",u"host_notes_url_expanded",u"host_notification_interval",u"host_notification_period",u"host_no
tifications_enabled",u"host_num_services",u"host_num_services_crit",u"host_num_services_hard_crit",u"host_num_services_hard_ok",u"host_num_services_hard_unknown",u"host_num_services_hard_warn",u"host_num_services_ok",u"host_num_services_pending",u"host_num_services_unknown",u"host_num_services_warn",u"host_obsess_over_host",u"host_parents",u"host_pending_flex_downtime",u"host_percent_state_change",u"host_perf_data",u"host_plugin_output",u"host_pnpgraph_present",u"host_process_performance_data",u"host_retry_interval",u"host_scheduled_downtime_depth",u"host_service_period"
Mjolinir commented 6 years ago

I can confirm this issue still exists and seems to be on the Adagios side:

CentOS 7.4 (and CentOS 7.5)
adagios-1.6.3-1
nagios-4.3.4-5
check-mk-livestatus-1.2.8p26-1

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/adagios/views.py", line 43, in wrapper
    result = view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/adagios/status/views.py", line 971, in downtime_list
    c['downtimes'] = l.query('GET downtimes', *args)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/multisite.py", line 80, in query
    query_result = backend_instance.query(query, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/livestatus.py", line 996, in query
    raise InvalidResponseFromLivestatus(query=livestatus_query, response=response_data)
InvalidResponseFromLivestatus: Could not parse response from livestatus.
Query:GET downtimes
ResponseHeader: fixed16
OutputFormat: python
ColumnHeaders: on

livestatus is indeed loaded and working, I can verify it with the following:

echo 'GET hosts' | unixcat /var/spool/nagios/cmd/livestatus

and also via the following in the logs:

livestatus: Livestatus 1.2.8p26 by Mathias Kettner. Socket: '/var/spool/nagios/cmd/livestatus'
livestatus: Please visit us at http://mathias-kettner.de/
livestatus: Hint: please try out OMD - the Open Monitoring Distribution
livestatus: Please visit OMD at http://omdistro.org
livestatus: Finished initialization. Further log messages go to /var/log/nagios/livestatus.log
Event broker module '/usr/lib64/check_mk/livestatus.o' initialized successfully.
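
The unixcat check above only exercises the hosts table with default headers. To see exactly what Adagios receives for the failing views, it may help to send the same headers shown in the traceback directly to the socket. The following is a minimal sketch, not pynag's actual code: the function names are hypothetical, the socket path is the one from the log lines in this thread, and the header layout (3-digit status, a space, an 11-character length, a newline) follows the documented fixed16 format.

```python
import socket

# Socket path taken from the log output quoted in this thread.
SOCKET_PATH = "/var/spool/nagios/cmd/livestatus"


def build_query(table):
    """Build a query with the same headers that appear in the traceback."""
    return (
        "GET %s\n"
        "ResponseHeader: fixed16\n"
        "OutputFormat: python\n"
        "ColumnHeaders: on\n"
        "\n" % table
    )


def parse_fixed16_header(header):
    """Split a 16-byte fixed16 header into (status_code, body_length)."""
    if len(header) != 16:
        raise ValueError("expected 16 header bytes, got %d" % len(header))
    text = header.decode("ascii")
    # Bytes 0-2: status code, byte 3: space, bytes 4-14: padded length.
    return int(text[0:3]), int(text[4:15])


def read_exact(sock, count):
    """Read exactly `count` bytes or raise if the peer closes early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise IOError("socket closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def raw_query(table, path=SOCKET_PATH):
    """Return (status_code, raw_body_bytes) for 'GET <table>'."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
        sock.sendall(build_query(table).encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        status, length = parse_fixed16_header(read_exact(sock, 16))
        return status, read_exact(sock, length)
    finally:
        sock.close()
```

Run on the Nagios host, `raw_query("downtimes")` and `raw_query("comments")` should return the unparsed body, which can then be inspected by hand without going through Adagios.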

gardart commented 6 years ago

Thank you, @tjyang and @Mjolinir. What version of pynag are you using?

Mjolinir commented 6 years ago

Hello gardart! I hope this is something that can be fixed relatively easily. It has been broken for some time.

For me it looks to be: pynag-0.9.1-1

Please let me know if there is anything else I can do to help.

gardart commented 6 years ago

@Mjolinir and @tjyang, could you try updating to the latest pynag and adagios (released last week) using:

yum --enablerepo=ok-testing update pynag adagios

Let me know if this solves the issue.

tjyang commented 6 years ago

@Mjolinir, I installed the new RPMs on my test Nagios instance, but it didn't help. Can you confirm?

After "yum --enablerepo=ok-testing update pynag adagios"
[me@nagios03 ~]$ rpm -qa |egrep 'adagio|pynag'
pynag-0.9.1-1.git.187.9bcf9ed.el7.noarch
adagios-1.6.3-2.git.0.4290a53.el7.noarch
[me@ilclnagios03 ~]$

Oh no, something went wrong ☹
InvalidResponseFromLivestatus: Could not parse response from livestatus. Query:GET downtimes ResponseHeader: fixed16 OutputFormat: python ColumnHeaders: on Response: 
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/adagios/views.py", line 43, in wrapper
    result = view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/adagios/status/views.py", line 971, in downtime_list
    c['downtimes'] = l.query('GET downtimes', *args)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/multisite.py", line 80, in query
    query_result = backend_instance.query(query, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/livestatus.py", line 996, in query
    raise InvalidResponseFromLivestatus(query=livestatus_query, response=response_data)
InvalidResponseFromLivestatus: Could not parse response from livestatus.
Query:GET downtimes
ResponseHeader: fixed16
OutputFormat: python
ColumnHeaders: on
gardart commented 6 years ago

@tjyang, could you try adding these options to your livestatus broker module in /etc/nagios/nagios.cfg?

debug=1 query_timeout=0

tjyang commented 6 years ago
* /var/log/nagios/livestatus.log

[root@inagios03 nagios]# tail -40 /var/log/nagios/livestatus.log
2018-05-23 10:20:55 Query: ResponseHeader: fixed16
2018-05-23 10:20:55 Time to process request: 12 us. Size of answer: 36 bytes
2018-05-23 10:20:56 Query: GET hosts
2018-05-23 10:20:56 Query: Stats: state >= 0
2018-05-23 10:20:56 Query: Stats: state > 0
2018-05-23 10:20:56 Query: Stats: scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: hard_state >= 1
2018-05-23 10:20:56 Query: StatsAnd: 3
2018-05-23 10:20:56 Query: Stats: state > 0
2018-05-23 10:20:56 Query: Stats: scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: acknowledged = 0
2018-05-23 10:20:56 Query: Stats: hard_state >= 1
2018-05-23 10:20:56 Query: StatsAnd: 4
2018-05-23 10:20:56 Query: Filter: custom_variable_names < _REALNAME
2018-05-23 10:20:56 Query: Localtime: 1527085256
2018-05-23 10:20:56 Query: OutputFormat: python
2018-05-23 10:20:56 Query: KeepAlive: on
2018-05-23 10:20:56 Query: ResponseHeader: fixed16
2018-05-23 10:20:56 Time to process request: 856 us. Size of answer: 13 bytes
2018-05-23 10:20:56 Query: GET services
2018-05-23 10:20:56 Query: Stats: state >= 0
2018-05-23 10:20:56 Query: Stats: state > 0
2018-05-23 10:20:56 Query: Stats: scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: host_scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: host_state = 0
2018-05-23 10:20:56 Query: Stats: last_hard_state >= 1
2018-05-23 10:20:56 Query: StatsAnd: 5
2018-05-23 10:20:56 Query: Stats: state > 0
2018-05-23 10:20:56 Query: Stats: scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: host_scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: acknowledged = 0
2018-05-23 10:20:56 Query: Stats: host_state = 0
2018-05-23 10:20:56 Query: Stats: last_hard_state >= 1
2018-05-23 10:20:56 Query: StatsAnd: 6
2018-05-23 10:20:56 Query: Filter: host_custom_variable_names < _REALNAME
2018-05-23 10:20:56 Query: Localtime: 1527085256
2018-05-23 10:20:56 Query: OutputFormat: python
2018-05-23 10:20:56 Query: KeepAlive: on
2018-05-23 10:20:56 Query: ResponseHeader: fixed16
2018-05-23 10:20:56 Time to process request: 7114 us. Size of answer: 18 bytes
[root@nagios03 nagios]#

Mjolinir commented 6 years ago

Looks very similar for me:

Updated to the new packages from ok-testing. Problem still exists, unfortunately.

Debug:

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/adagios/views.py", line 43, in wrapper
    result = view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/adagios/status/views.py", line 959, in comment_list
    c['comments'] = l.query('GET comments', *args)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/multisite.py", line 80, in query
    query_result = backend_instance.query(query, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/livestatus.py", line 996, in query
    raise InvalidResponseFromLivestatus(query=livestatus_query, response=response_data)
InvalidResponseFromLivestatus: Could not parse response from livestatus.
Query:GET comments
ResponseHeader: fixed16
OutputFormat: python
ColumnHeaders: on

Error msg:

`InvalidResponseFromLivestatus: Could not parse response from livestatus. Query:GET downtimes ResponseHeader: fixed16 OutputFormat: python ColumnHeaders: on Response: [[u"author",u"comment",u"duration",u"end_time",u"entry_time",u"fixed",u"host_accept_passive_checks",u"host_acknowledged",u"host_acknowledgement_type",u"host_action_url",u"host_action_url_expanded",u"host_active_checks_enabled",u"host_address",u"host_alias",u"host_check_command",u"host_check_command_expanded",u"host_check_flapping_recovery_notification",u"host_check_freshness",u"host_check_interval",u"host_check_options",u"host_check_period",u"host_check_type",u"host_checks_enabled",u"host_childs",u"host_comments",u"host_comments_with_extra_info",u"host_comments_with_info",u"host_contact_groups",u"host_contacts",u"host_current_attempt",u"host_current_notification_number",u"host_custom_variable_names",u"host_custom_variable_values",u"host_custom_variables",u"host_display_name",u"host_downtimes",u"host_downtimes_with_info",u"host_event_handler",u"host_event_handler_enabled",u"host_execution_time",u"host_filename",u"host_first_notification_delay",u"host_flap_detection_enabled",u"host_groups",u"host_hard_state",u"host_has_been_checked",u"host_high_flap_threshold",u"host_icon_image",u"host_icon_image_alt",u"host_icon_image_expanded",u"host_in_check_period",u"host_in_notification_period",u"host_in_service_period",u"host_initial_state",u"host_is_executing",u"host_is_flapping",u"host_last_check",u"host_last_hard_state",u"host_last_hard_state_change",u"host_last_notification",u"host_last_state",u"host_last_state_change",u"host_last_time_down",u"host_last_time_unreachable",u"host_last_time_up",u"host_latency",u"host_long_plugin_output",u"host_low_flap_threshold",u"host_max_check_attempts",u"host_metrics",u"host_mk_inventory",u"host_mk_inventory_gz",u"host_mk_inventory_last",u"host_modified_attributes",u"host_modified_attributes_list",u"host_name",u"host_next_check",u"host_next_notification",u"host_no_more_notific
ations",u"host_notes",u"host_notes_expanded",u"host_notes_url",u"host_notes_url_expanded",u"host_notification_interval",u"host_notification_period",u"host_notifications_enabled",u"host_num_services",u"host_num_services_crit",u"host_num_services_hard_crit",u"host_num_services_hard_ok",u"host_num_services_hard_unknown",u"host_num_services_hard_warn",u"host_num_services_ok",u"host_num_services_pending",u"host_num_services_unknown",u"host_num_services_warn",u"host_obsess_over_host",u"host_parents",u"host_pending_flex_downtime",u"host_percent_state_change",u"host_perf_data",u"host_plugin_output",u"host_pnpgraph_present",u"host_process_performance_data",u"host_retry_interval",u"host_scheduled_downtime_depth",u"host_service_period",u"host_services",u"host_services_with_fullstate",u"host_services_with_info",u"host_services_with_state",u"host_staleness",u"host_state",u"host_state_type",u"host_statusmap_image",u"host_total_services",u"host_worst_service_hard_state",u"host_worst_service_state",u"host_x_3d",u"host_y_3d",u"host_z_3d",u"id",u"is_service",u"service_accept_passive_checks",u"service_acknowledged",u"service_acknowledgement_type",u"service_action_url",u"service_action_url_expanded",u"service_active_checks_enabled",u"service_cache_interval",u"service_cached_at",u"service_check_command",u"service_check_command_expanded",u"service_check_freshness",u"service_check_interval",u"service_check_options",u"service_check_period",u"service_check_type",u"service_checks_enabled",u"service_comments",u"service_comments_with_extra_info",u"service_comments_with_info",u"service_contact_groups",u"service_contacts",u"service_current_attempt",u"service_current_notification_number",u"service_custom_variable_names",u"service_custom_variable_values",u"service_custom_variables",u"service_description",u"service_display_name",u"service_downtimes",u"service_downtimes_with_info",u"service_event_handler",u"service_event_handler_enabled",u"service_execution_time",u"service_first_notification_delay",
u"service_flap_detection_enabled",u"service_groups",u"service_has_been_checked",u"service_high_flap_threshold",u"service_icon_image",u"service_icon_image_alt",u"service_icon_image_expanded",u"service_in_check_period",u"service_in_notification_period",u"service_in_service_period",u"service_initial_state",u"service_is_executing",u"service_is_flapping",u"service_last_check",u"service_last_hard_state",u"service_last_hard_state_change",u"service_last_notification",u"service_last_state",u"service_last_state_change",u"service_last_time_critical",u"service_last_time_ok",u"service_last_time_unknown",u"service_last_time_warning",u"service_latency",u"service_long_plugin_output",u"service_low_flap_threshold",u"service_max_check_attempts",u"service_metrics",u"service_modified_attributes",u"service_modified_attributes_list",u"service_next_check",u"service_next_notification",u"service_no_more_notifications",u"service_notes",u"service_notes_expanded",u"service_notes_url",u"service_notes_url_expanded",u"service_notification_interval",u"service_notification_period",u"service_notifications_enabled",u"service_obsess_over_service",u"service_percent_state_change",u"service_perf_data",u"service_plugin_output",u"service_pnpgraph_present",u"service_process_performance_data",u"service_retry_interval",u"service_scheduled_downtime_depth",u"service_service_period",u"service_staleness",u"service_state",u"service_state_type",u"start_time",u"triggered_by",u"type"]

....

1527163154,0,0,u"",u"",u"",u"",6.0000000000e+01,u"24x7_except_maintenance",1,0,0,0,0,0,0,0,0,0,0,1,[],0,0.0000000000e+00,u"",u"(Host check timed out after 30.10 seconds)",-1,1,1.0000000000e+00,1,u"",[],[],[],[],1.1466666667e+00,1,1,u"",0,0,0,0.0000000000e+00,0.0000000000e+00,0.0000000000e+00,177,0,0,0,0,u"",u"",0,0,0,u"",u"",0,0.0000000000e+00,0,u"",0,0,[],[],[],[],[],0,0,[],[],{},u"",u"",[],[],u"",0,0.0000000000e+00,0.0000000000e+00,0,[],0,0.0000000000e+00,u"",u"",u"",0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0000000000e+00,u"",0.0000000000e+00,0,,0,[],0,0,0,u"",u"",u"",u"",0.0000000000e+00,u"",0,0,0.0000000000e+00,u"",u"",0,0,0.0000000000e+00,0,u"",0.0000000000e+00,0,0,1516214153,0,2]] `
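
The data row quoted above appears to contain an empty field: the sequence `0,,0` has nothing between the two commas. Such a row is not a valid Python literal, which would be consistent with the "Could not parse response" error if the response is evaluated as Python (an assumption here; the pynag parsing code is not shown in this thread). The sketch below uses short hand-made fragments, not real livestatus output, and hypothetical helper names, to show the effect and to locate such gaps in a raw response.

```python
import ast


def try_parse(response_text):
    """Return (parsed, None) on success or (None, error_message) on failure."""
    try:
        return ast.literal_eval(response_text), None
    except (SyntaxError, ValueError) as exc:
        return None, "%s: %s" % (type(exc).__name__, exc)


def find_empty_fields(response_text):
    """Return offsets of adjacent commas, i.e. fields with no value.

    This is a naive text scan: it would also match ',,' inside a quoted
    string, so treat hits as hints to inspect, not as proof.
    """
    offsets = []
    start = response_text.find(",,")
    while start != -1:
        offsets.append(start)
        start = response_text.find(",,", start + 1)
    return offsets


# Shortened, hand-made fragments in the style of the quoted response.
good = '[[u"author",u"id"],[u"",0.0000000000e+00,0,0,[]]]'
bad = '[[u"author",u"id"],[u"",0.0000000000e+00,0,,0,[]]]'

parsed, error = try_parse(good)
print("good fragment parsed:", error is None)

parsed, error = try_parse(bad)
print("bad fragment error:", error)
print("empty fields at offsets:", find_empty_fields(bad))
```

Counting the commas before a reported offset against the column header list would identify which column is being emitted without a value.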

tail -50 /var/log/nagios/livestatus.log
2018-05-24 07:59:25 Query: GET hosts
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: on
2018-05-24 07:59:25 Time to process request: 6587 us. Size of answer: 164445 bytes
2018-05-24 07:59:25 Time to process request: 5982 us. Size of answer: 164445 bytes
2018-05-24 07:59:25 Query: GET services
2018-05-24 07:59:25 Query: Filter: state != 0
2018-05-24 07:59:25 Query: Filter: acknowledged = 0
2018-05-24 07:59:25 Query: Filter: host_acknowledged = 0
2018-05-24 07:59:25 Query: Filter: scheduled_downtime_depth = 0
2018-05-24 07:59:25 Query: Filter: host_scheduled_downtime_depth = 0
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: host_state != 0
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: off
2018-05-24 07:59:25 Time to process request: 37 us. Size of answer: 8 bytes
2018-05-24 07:59:25 Query: GET services
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: acknowledged = 0
2018-05-24 07:59:25 Query: Stats: scheduled_downtime_depth = 0
2018-05-24 07:59:25 Query: Stats: host_state = 0
2018-05-24 07:59:25 Query: StatsAnd: 4
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: off
2018-05-24 07:59:25 Time to process request: 57 us. Size of answer: 8 bytes
2018-05-24 07:59:25 Query: GET hosts
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: acknowledged = 0
2018-05-24 07:59:25 Query: Stats: scheduled_downtime_depth = 0
2018-05-24 07:59:25 Query: Stats: host_state = 1
2018-05-24 07:59:25 Query: StatsAnd: 4
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: off
2018-05-24 07:59:25 Time to process request: 56 us. Size of answer: 8 bytes
2018-05-24 07:59:25 Query: GET hosts
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: on
2018-05-24 07:59:25 Time to process request: 6385 us. Size of answer: 164445 bytes
2018-05-24 07:59:25 Query: GET hosts
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: on
2018-05-24 07:59:25 Time to process request: 6306 us. Size of answer: 164445 bytes

Mjolinir commented 6 years ago

One thing I noticed, though I'm not sure if it is relevant:

I'm using check-mk-livestatus-1.2.8p26-1.el7 from EPEL. If I use mk-livestatus-1.2.2-3.git.2.27fc0fd.el7.centos.x86_64 from ok-testing, then livestatus does not work at all.

tjyang commented 6 years ago
gardart commented 6 years ago

check-mk-livestatus-1.2.8p26-1.el7 from EPEL is the correct one...

tjyang commented 6 years ago

@gardart I am using check-mk-livestatus-1.2.8p26-1.el7 from EPEL, and Comments and Downtime still have the issue. It looks like the parser code on the Adagios side needs to be adjusted.

gardart commented 6 years ago

Does your Nagios server crash when this happens? Do you need to restart the nagios service every time?

tjyang commented 6 years ago

No, neither Nagios nor the livestatus daemon crashed when this issue happened.

[root@nagios03 nagios]# tail -20f  /var/log/nagios/livestatus.log
2018-07-11 16:32:19 Idle timeout of 12000 ms exceeded. Going to close connection.
2018-07-11 16:32:19 error: Client connection terminated while request still incomplete
2018-07-11 16:32:21 Idle timeout of 12000 ms exceeded. Going to close connection.
2018-07-11 16:32:21 error: Client connection terminated while request still incomplete
2018-07-11 16:32:41 Idle timeout of 12000 ms exceeded. Going to close connection.
2018-07-11 16:32:41 error: Client connection terminated while request still incomplete
2018-07-11 16:32:48 Idle timeout of 12000 ms exceeded. Going to close connection.
2018-07-11 16:32:48 error: Client connection terminated while request still incomplete
2018-07-11 20:01:04 deinitializing
2018-07-11 20:01:04 Waiting for main to terminate...
2018-07-11 20:01:04 Waiting for client threads to terminate...
2018-07-11 20:01:04 Logfile cache: flushing complete cache.
2018-07-12 00:01:03 deinitializing
2018-07-12 00:01:03 Waiting for main to terminate...
2018-07-12 00:01:05 Waiting for client threads to terminate...
2018-07-12 00:01:05 Logfile cache: flushing complete cache.
2018-07-12 04:01:04 deinitializing
2018-07-12 04:01:04 Waiting for main to terminate...
2018-07-12 04:01:06 Waiting for client threads to terminate...
2018-07-12 04:01:06 Logfile cache: flushing complete cache.
^C
[root@nagios03 nagios]# date
Thu Jul 12 07:01:13 EDT 2018
[root@nagios03 nagios]#
Mjolinir commented 6 years ago

Same applies to me. No crashes.

Mjolinir commented 6 years ago

I noticed today that both Comments and Downtime are working! Unfortunately I am not sure which update fixed it. Here are the current versions of related packages:

check-mk-livestatus-1.4.0p31-2.el7.x86_64 (last updated June 21)
pynag-0.9.1-1.git.187.9bcf9ed.el7.noarch (last updated May 24)
adagios-1.6.3-2.git.0.4290a53.el7.noarch (last updated May 24)
nagios-4.3.4-5.el7.x86_64 (last updated Apr 16)

It seems likely it was the check-mk-livestatus update in June and I just didn't notice, since the updates are automated with Ansible.

@tjyang can you confirm on your end?

tjyang commented 6 years ago

Thanks to @Mjolinir's pointer and @gardart's help.

tjyang commented 6 years ago
gardart commented 6 years ago

I tried two different versions of mk-livestatus, 1.2.6 and 1.2.8. Version 1.2.6 still works with Nagios4, but 1.2.8 gives parse errors in the downtime and comments views. mk-livestatus works best when using Naemon as the Nagios server. You can install Adagios on top of Naemon as well.

Here is the current workaround for Nagios4. You can build 1.2.6 with Nagios4 support like this:

yum remove check-mk
wget http://www.mathias-kettner.de/download/mk-livestatus-1.2.6.tar.gz
yum install -y make gcc-c++
tar -zxvf mk-livestatus-1.2.6.tar.gz
cd mk-livestatus-1.2.6
./configure --with-nagios4
make
make install

Then use this in your broker_module settings:

broker_module=/usr/local/lib/mk-livestatus/livestatus.o /var/spool/nagios/cmd/livestatus

tjyang commented 5 years ago
[root@nagios03 ~]# cat /etc/redhat-release; rpm -qa |egrep 'check-mk-livestatus-1|pynag-0|adagios-1|nagios-4';date
CentOS Linux release 7.6.1810 (Core)
pynag-0.9.1-1.git.187.9bcf9ed.el7.noarch
adagios-1.6.3-2.git.0.4290a53.el7.noarch
check-mk-livestatus-1.4.0p31-2.el7.x86_64
nagios-4.4.3-1.el7.x86_64
Wed Sep  4 15:17:30 EDT 2019
[root@nagios03 ~]#
tjyang commented 5 years ago

@gardart check-mk-livestatus-1.4.0p31-2.el7.x86_64 fixed my comment/downtime display issue, but it crashes my Nagios server: livestatus aborts when running the LQL 'GET hosts' command. I tried compiling versions from 1.2.8 up to the latest 1.6, and they all crashed the Nagios server on 'GET hosts'. So I followed your tip above and used version 1.2.6, and now both 'GET hosts' and the comment/downtime views work. Thanks again for the pointer.