Closed alexharpin closed 4 years ago
I got a similar issue: "View History For This Host" in Thruk with a larger time frame crashes Naemon. 1 week was enough for a server with many changes in the logfile. The largest json result without crash was 14855 bytes in size.
When the "Sort:" line is removed from the query it works.
Naemon generates a segmentation fault here: https://github.com/naemon/naemon-livestatus/blob/16d932d47acc3d832c7c639b1daea85f0a507b0b/src/Query.cc#L1232
The pointer address "0x6d6f5a006e616c2e" is a part of the query answer: ".lan\x00Zom"
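For anyone who wants to check that reading of the pointer, here is a minimal sketch. It assumes a little-endian host (e.g. x86-64), which is not stated in the thread; on such a machine the 8 bytes of the pointer value sit in memory in reverse order of the hex digits.

```python
import struct

# Faulting pointer value quoted in the comment above.
pointer = 0x6d6f5a006e616c2e

# Assumption: little-endian host. Packing the value as a little-endian
# unsigned 64-bit integer gives the bytes as they would sit in memory.
raw = struct.pack("<Q", pointer)

print(raw)  # b'.lan\x00Zom'
```

The result is the tail of one string (".lan"), a NUL terminator, and the start of another ("Zom"), which fits the idea that response data was written over a pointer.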
Versions: naemon-core 1.0.8, naemon-livestatus 1.0.8, thruk 2.22
Can anyone tell me how this is progressing? I am having to tell my users not to do searches, and it is looking kinda rubbish for the platform...
well, i could remove the sort from the query in thruk till this issue is resolved.
Many thanks for your reply. I don't really know how removing the sort from the query would impact anything else in the system, but if it's an acceptable workaround for all then I am happy with it. If you can also tell me how to run traces/debugs for Naemon on Ubuntu 16.04, that would be great, as I can send that information in as well.
It's just that Naemon has built up such a good reputation in the company that I've kinda got protective about keeping it up there. As always, thanks to everyone that makes it such a good product.
Should be better soon, i removed the sort header from log livestatus queries in Thruk: https://github.com/sni/Thruk/commit/d9d98ca1919b3178d4d292b06e389134817708f6
i will keep this issue open, since the real cause has not been solved yet.
Could this be related to https://github.com/naemon/naemon-livestatus/pull/73 ?
I just started getting this crash today, while migrating to CentOS 8. This query:
GET hosts
Columns: host_name
Stats: host_state = 1
Stats: childs !=
StatsAnd: 2
OutputFormat: json
ResponseHeader: fixed16
ColumnHeaders: on
generates this crash in gdb:
(gdb) where
from /usr/lib64/naemon/naemon-livestatus/livestatus.so
from /usr/lib64/naemon/naemon-livestatus/livestatus.so
The same versions running on CentOS 6 do not crash with the same queries. It crashed both with my own compiled naemon daemon and livestatus and with the CentOS 8 RPMs from naemon.org.
Patch #73 fixed the problem above; I just tested it on CentOS 8.
great, so we can close this one as well
After upgrading from naemon 1.0.7 and Thruk 2.19 (long overdue upgrade on my part), the naemon process would terminate with a SIGABRT when accessing status information from Thruk. Some sections would display the information as expected while others would report "connection refused" for the livestatus socket (obviously due to the crash above). I tracked the issue down to livestatus being sent a query containing a sort option. I captured the query sent just before the crash: removing the sort options and sending it again (unixcat to the livestatus socket) returned data, while adding the sort options back in crashed the backend process. Not sure if this is a livestatus or naemon-core issue.
Current versions
Naemon = 1.0.8 Thruk = 2.22
Query that causes crash (not just limited to this one, any with sort seem to be an issue)
GET services
Columns: accept_passive_checks acknowledged action_url action_url_expanded active_checks_enabled check_command check_interval check_options check_period check_type checks_enabled comments current_attempt current_notification_number description event_handler event_handler_enabled custom_variable_names custom_variable_values execution_time first_notification_delay flap_detection_enabled groups has_been_checked high_flap_threshold host_acknowledged host_action_url_expanded host_active_checks_enabled host_address host_alias host_checks_enabled host_check_type host_latency host_plugin_output host_perf_data host_current_attempt host_check_command host_comments host_groups host_has_been_checked host_icon_image_expanded host_icon_image_alt host_is_executing host_is_flapping host_name host_notes_url_expanded host_notifications_enabled host_scheduled_downtime_depth host_state host_accept_passive_checks host_last_state_change icon_image icon_image_alt icon_image_expanded is_executing is_flapping last_check last_notification last_state_change latency low_flap_threshold max_check_attempts next_check notes notes_expanded notes_url notes_url_expanded notification_interval notification_period notifications_enabled obsess_over_service percent_state_change perf_data plugin_output process_performance_data retry_interval scheduled_downtime_depth state state_type modified_attributes_list last_time_critical last_time_ok last_time_unknown last_time_warning display_name host_display_name host_custom_variable_names host_custom_variable_values in_check_period in_notification_period host_parents long_plugin_output
Filter: host_name = .**.*
Sort: host_name asc
Sort: description asc**
OutputFormat: wrapped_json
ResponseHeader: fixed16
The same query without the sort(s) works fine.
GET services
Columns: accept_passive_checks acknowledged action_url action_url_expanded active_checks_enabled check_command check_interval check_options check_period check_type checks_enabled comments current_attempt current_notification_number description event_handler event_handler_enabled custom_variable_names custom_variable_values execution_time first_notification_delay flap_detection_enabled groups has_been_checked high_flap_threshold host_acknowledged host_action_url_expanded host_active_checks_enabled host_address host_alias host_checks_enabled host_check_type host_latency host_plugin_output host_perf_data host_current_attempt host_check_command host_comments host_groups host_has_been_checked host_icon_image_expanded host_icon_image_alt host_is_executing host_is_flapping host_name host_notes_url_expanded host_notifications_enabled host_scheduled_downtime_depth host_state host_accept_passive_checks host_last_state_change icon_image icon_image_alt icon_image_expanded is_executing is_flapping last_check last_notification last_state_change latency low_flap_threshold max_check_attempts next_check notes notes_expanded notes_url notes_url_expanded notification_interval notification_period notifications_enabled obsess_over_service percent_state_change perf_data plugin_output process_performance_data retry_interval scheduled_downtime_depth state state_type modified_attributes_list last_time_critical last_time_ok last_time_unknown last_time_warning display_name host_display_name host_custom_variable_names host_custom_variable_values in_check_period in_notification_period host_parents long_plugin_output
Filter: host_name = .**.***
OutputFormat: wrapped_json
ResponseHeader: fixed16
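If unixcat is not at hand, the with-sort / without-sort comparison can be reproduced with a short script. This is only a sketch: the socket path is a placeholder for whatever your livestatus module is configured with, the query is a shortened stand-in for the full one above, and `strip_sort` and `send_query` are helper names made up for this example. On an affected version, sending the sorted query may crash naemon, so try it on a test instance.

```python
import socket

# Hypothetical path; use the socket path from your own livestatus config.
SOCKET_PATH = "/var/cache/naemon/live"

# Shortened stand-in for the full query shown above.
QUERY = "\n".join([
    "GET services",
    "Columns: host_name description state",
    "Sort: host_name asc",
    "Sort: description asc",
    "OutputFormat: wrapped_json",
    "ResponseHeader: fixed16",
])


def strip_sort(query: str) -> str:
    """Return the query with every 'Sort:' header line removed."""
    kept = [line for line in query.splitlines() if not line.startswith("Sort:")]
    return "\n".join(kept)


def send_query(query: str, socket_path: str = SOCKET_PATH) -> bytes:
    """Send one query to the livestatus Unix socket and return the raw reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        # A blank line ends the query.
        sock.sendall(query.rstrip("\n").encode() + b"\n\n")
        # Signal that nothing more will be sent, then read until EOF.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Unsorted first: this is the variant reported to work.
    print(len(send_query(strip_sort(QUERY))), "bytes without Sort")
    # Sorted second: this is the variant reported to crash the backend.
    print(len(send_query(QUERY)), "bytes with Sort")
```

Running the unsorted query first means you still get one good response even if the second request takes the backend down.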