influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Zookeeper plugin doesn't read mntr's NaNs #11520

Closed dmytronasyrov closed 2 years ago

dmytronasyrov commented 2 years ago

Relevant telegraf.conf

[[inputs.zookeeper]]
  servers = ["zookeeper-0.zookeeper.zookeeper-ns.svc.cluster.local:2181"]
  timeout = "5s"
  insecure_skip_verify = true

Logs from Telegraf

E! [inputs.zookeeper] Error in plugin: unexpected line in mntr response: "zk_close_session_prep_time{quantile=\"0.5\"}\tNaN"

System info

Telegraf 1.23.1, Zookeeper 3.8.0

Docker

No response

Steps to reproduce

  1. Just run it

Expected behavior

Plugin should parse the line zk_close_session_prep_time{quantile="0.5"} NaN

Actual behavior

Plugin is unable to parse this line from the mntr output: zk_close_session_prep_time{quantile="0.5"} NaN

Additional info

No response

powersj commented 2 years ago

Hi,

"Just run it"

Happy to help, but this is not helpful at all. At the very least, it would be nice to confirm how you set up and ran zookeeper. Default config? Do you have anything special set up? Is this hosted in the cloud?

"unexpected line in mntr response"

This error comes from a check in the plugin code that verifies whether the regular expression came back with 3 parts. The regular expression does not appear to match anything in the line you provided.

I do not think NaN is the issue here. Instead, I think it has to do with the regex not matching on the quantile expression.
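To make the "3 parts" check concrete, here is a minimal Go sketch. The pattern `mntrLineRE` and the helper `splitLine` are assumptions for illustration only and are not copied from the Telegraf source; they just mimic a "name, whitespace, value" match whose submatch slice is expected to have length 3.

```go
package main

import (
	"fmt"
	"regexp"
)

// mntrLineRE is an assumed pattern, not the one Telegraf uses: a metric name
// made of word characters, dots and dashes, then whitespace, then a value.
var mntrLineRE = regexp.MustCompile(`^zk_(\w[\w.\-]*)\s+([\w.\-]+)`)

// splitLine returns the metric name and raw value. ok is false when the
// match does not yield the full match plus two capture groups (3 parts).
func splitLine(line string) (name, value string, ok bool) {
	parts := mntrLineRE.FindStringSubmatch(line)
	if len(parts) != 3 {
		return "", "", false
	}
	return parts[1], parts[2], true
}

func main() {
	lines := []string{
		"zk_avg_latency\t0.0",
		"zk_uptime\tNaN",
		"zk_close_session_prep_time{quantile=\"0.5\"}\tNaN",
	}
	for _, line := range lines {
		name, value, ok := splitLine(line)
		fmt.Printf("%q -> name=%q value=%q ok=%v\n", line, name, value, ok)
	}
}
```

With this assumed pattern, a bare NaN value still matches, while the `{quantile="0.5"}` suffix after the metric name prevents any match, which is consistent with the suspicion that the label and not NaN triggers the error.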

Can you get the output of $ echo mntr | nc localhost 2181, replacing localhost with your server, so I can see all the metrics you are getting?

Thanks!

dmytronasyrov commented 2 years ago

The problem is in the first line with NaN. As far as I can see from the code, the reason is that the line can't be parsed by the regex. Maybe NaN is the issue.

[Screenshot 2022-07-23 at 11 15 43]

Zookeeper is 3.8.0, downloaded from apache.org and deployed into slim-bullseye. But I don't think this is useful information, as the cause is the row that can't be parsed.

powersj commented 2 years ago

@dmytronasyrov can you share what other configuration you are applying to zookeeper? I would like to understand what is causing those additional metrics to show up.

I launched a container (e.g. docker run --net=host --env ZOO_4LW_COMMANDS_WHITELIST="mntr" --rm zookeeper) with that same version, and I am not seeing those additional metrics:

$ echo mntr | nc localhost 2181
zk_version  3.8.0-5a02a05eddb59aee6ac762f7ea82e92a68eb9c0f, built on 2022-02-25 08:49 UTC
zk_server_state standalone
zk_ephemerals_count 0
zk_num_alive_connections    1
zk_avg_latency  0.0
zk_outstanding_requests 0
zk_znode_count  5
zk_global_sessions  0
zk_non_mtls_remote_conn_count   0
zk_last_client_response_size    -1
zk_packets_sent 0
zk_packets_received 1
zk_max_client_response_size -1
zk_connection_drop_probability  0.0
zk_watch_count  0
zk_auth_failed_count    0
zk_min_latency  0
zk_max_file_descriptor_count    1048576
zk_approximate_data_size    44
zk_open_file_descriptor_count   101
zk_local_sessions   0
zk_uptime   15733
zk_max_latency  0
zk_outstanding_tls_handshake    0
zk_min_client_response_size -1
zk_non_mtls_local_conn_count    0
zk_proposal_count   0
zk_watch_bytes  0
zk_outstanding_changes_removed  0
zk_throttled_ops    0
zk_stale_requests_dropped   0
zk_large_requests_rejected  0
zk_insecure_admin_count 0
zk_connection_rejected  0
zk_cnxn_closed_without_zk_server_running    0
zk_sessionless_connections_expired  0
zk_looking_count    0
zk_dead_watchers_queued 0
zk_stale_requests   0
zk_connection_drop_count    0
zk_learner_proposal_received_count  0
zk_digest_mismatches_count  0
zk_dead_watchers_cleared    0
zk_response_packet_cache_hits   0
zk_bytes_received_count 4
zk_add_dead_watcher_stall_time  0
zk_request_throttle_wait_count  0
zk_requests_not_forwarded_to_commit_processor   0
zk_response_packet_cache_misses 0
zk_ensemble_auth_success    0
zk_prep_processor_request_queued    0
zk_learner_commit_received_count    0
zk_stale_replies    0
zk_connection_request_count 0
zk_response_bytes   0
zk_ensemble_auth_fail   0
zk_diff_count   0
zk_response_packet_get_children_cache_misses    0
zk_connection_revalidate_count  0
zk_quit_leading_due_to_disloyal_voter   0
zk_snap_count   0
zk_unrecoverable_error_count    0
zk_unsuccessful_handshake   0
zk_commit_count 0
zk_stale_sessions_expired   0
zk_response_packet_get_children_cache_hits  0
zk_sync_processor_request_queued    0
zk_outstanding_changes_queued   0
zk_request_commit_queued    0
zk_ensemble_auth_skip   0
zk_skip_learner_request_to_next_processor_count 0
zk_tls_handshake_exceeded   0
zk_revalidate_count 0
zk_avg_socket_closing_time  0.0
zk_min_socket_closing_time  0
zk_max_socket_closing_time  0
zk_cnt_socket_closing_time  0
zk_sum_socket_closing_time  0
zk_avg_proposal_process_time    0.0
zk_min_proposal_process_time    0
zk_max_proposal_process_time    0
zk_cnt_proposal_process_time    0
zk_sum_proposal_process_time    0
zk_avg_leader_unavailable_time  0.0
zk_min_leader_unavailable_time  0
zk_max_leader_unavailable_time  0
zk_cnt_leader_unavailable_time  0
zk_sum_leader_unavailable_time  0
zk_avg_node_created_watch_count 0.0
zk_min_node_created_watch_count 0
zk_max_node_created_watch_count 0
zk_cnt_node_created_watch_count 0
zk_sum_node_created_watch_count 0
zk_avg_session_queues_drained   0.0
zk_min_session_queues_drained   0
zk_max_session_queues_drained   0
zk_cnt_session_queues_drained   0
zk_sum_session_queues_drained   0
zk_avg_write_commit_proc_req_queued 0.0
zk_min_write_commit_proc_req_queued 0
zk_max_write_commit_proc_req_queued 0
zk_cnt_write_commit_proc_req_queued 0
zk_sum_write_commit_proc_req_queued 0
zk_avg_connection_token_deficit 0.0
zk_min_connection_token_deficit 0
zk_max_connection_token_deficit 0
zk_cnt_connection_token_deficit 0
zk_sum_connection_token_deficit 0
zk_avg_read_commit_proc_req_queued  0.0
zk_min_read_commit_proc_req_queued  0
zk_max_read_commit_proc_req_queued  0
zk_cnt_read_commit_proc_req_queued  0
zk_sum_read_commit_proc_req_queued  0
zk_avg_node_deleted_watch_count 0.0
zk_min_node_deleted_watch_count 0
zk_max_node_deleted_watch_count 0
zk_cnt_node_deleted_watch_count 0
zk_sum_node_deleted_watch_count 0
zk_avg_startup_txns_load_time   0.0
zk_min_startup_txns_load_time   0
zk_max_startup_txns_load_time   0
zk_cnt_startup_txns_load_time   0
zk_sum_startup_txns_load_time   0
zk_avg_sync_processor_queue_size    0.0
zk_min_sync_processor_queue_size    0
zk_max_sync_processor_queue_size    0
zk_cnt_sync_processor_queue_size    1
zk_sum_sync_processor_queue_size    0
zk_avg_follower_sync_time   0.0
zk_min_follower_sync_time   0
zk_max_follower_sync_time   0
zk_cnt_follower_sync_time   0
zk_sum_follower_sync_time   0
zk_avg_prep_processor_queue_size    0.0
zk_min_prep_processor_queue_size    0
zk_max_prep_processor_queue_size    0
zk_cnt_prep_processor_queue_size    1
zk_sum_prep_processor_queue_size    0
zk_avg_fsynctime    0.0
zk_min_fsynctime    0
zk_max_fsynctime    0
zk_cnt_fsynctime    0
zk_sum_fsynctime    0
zk_avg_inflight_snap_count  0.0
zk_min_inflight_snap_count  0
zk_max_inflight_snap_count  0
zk_cnt_inflight_snap_count  0
zk_sum_inflight_snap_count  0
zk_avg_reads_issued_from_session_queue  0.0
zk_min_reads_issued_from_session_queue  0
zk_max_reads_issued_from_session_queue  0
zk_cnt_reads_issued_from_session_queue  0
zk_sum_reads_issued_from_session_queue  0
zk_avg_learner_request_processor_queue_size 0.0
zk_min_learner_request_processor_queue_size 0
zk_max_learner_request_processor_queue_size 0
zk_cnt_learner_request_processor_queue_size 0
zk_sum_learner_request_processor_queue_size 0
zk_avg_snapshottime 0.0
zk_min_snapshottime 0
zk_max_snapshottime 0
zk_cnt_snapshottime 1
zk_sum_snapshottime 0
zk_avg_unavailable_time 0.0
zk_min_unavailable_time 0
zk_max_unavailable_time 0
zk_cnt_unavailable_time 0
zk_sum_unavailable_time 0
zk_avg_startup_txns_loaded  0.0
zk_min_startup_txns_loaded  0
zk_max_startup_txns_loaded  0
zk_cnt_startup_txns_loaded  0
zk_sum_startup_txns_loaded  0
zk_avg_reads_after_write_in_session_queue   0.0
zk_min_reads_after_write_in_session_queue   0
zk_max_reads_after_write_in_session_queue   0
zk_cnt_reads_after_write_in_session_queue   0
zk_sum_reads_after_write_in_session_queue   0
zk_avg_requests_in_session_queue    0.0
zk_min_requests_in_session_queue    0
zk_max_requests_in_session_queue    0
zk_cnt_requests_in_session_queue    0
zk_sum_requests_in_session_queue    0
zk_avg_write_commit_proc_issued 0.0
zk_min_write_commit_proc_issued 0
zk_max_write_commit_proc_issued 0
zk_cnt_write_commit_proc_issued 0
zk_sum_write_commit_proc_issued 0
zk_avg_prep_process_time    0.0
zk_min_prep_process_time    0
zk_max_prep_process_time    0
zk_cnt_prep_process_time    0
zk_sum_prep_process_time    0
zk_avg_pending_session_queue_size   0.0
zk_min_pending_session_queue_size   0
zk_max_pending_session_queue_size   0
zk_cnt_pending_session_queue_size   0
zk_sum_pending_session_queue_size   0
zk_avg_time_waiting_empty_pool_in_commit_processor_read_ms  0.0
zk_min_time_waiting_empty_pool_in_commit_processor_read_ms  0
zk_max_time_waiting_empty_pool_in_commit_processor_read_ms  0
zk_cnt_time_waiting_empty_pool_in_commit_processor_read_ms  0
zk_sum_time_waiting_empty_pool_in_commit_processor_read_ms  0
zk_avg_commit_process_time  0.0
zk_min_commit_process_time  0
zk_max_commit_process_time  0
zk_cnt_commit_process_time  0
zk_sum_commit_process_time  0
zk_avg_dbinittime   3.0
zk_min_dbinittime   3
zk_max_dbinittime   3
zk_cnt_dbinittime   1
zk_sum_dbinittime   3
zk_avg_inflight_diff_count  0.0
zk_min_inflight_diff_count  0
zk_max_inflight_diff_count  0
zk_cnt_inflight_diff_count  0
zk_sum_inflight_diff_count  0
zk_avg_netty_queued_buffer_capacity 0.0
zk_min_netty_queued_buffer_capacity 0
zk_max_netty_queued_buffer_capacity 0
zk_cnt_netty_queued_buffer_capacity 0
zk_sum_netty_queued_buffer_capacity 0
zk_avg_election_time    0.0
zk_min_election_time    0
zk_max_election_time    0
zk_cnt_election_time    0
zk_sum_election_time    0
zk_avg_commit_commit_proc_req_queued    0.0
zk_min_commit_commit_proc_req_queued    0
zk_max_commit_commit_proc_req_queued    0
zk_cnt_commit_commit_proc_req_queued    0
zk_sum_commit_commit_proc_req_queued    0
zk_avg_sync_processor_batch_size    0.0
zk_min_sync_processor_batch_size    0
zk_max_sync_processor_batch_size    0
zk_cnt_sync_processor_batch_size    0
zk_sum_sync_processor_batch_size    0
zk_avg_node_children_watch_count    0.0
zk_min_node_children_watch_count    0
zk_max_node_children_watch_count    0
zk_cnt_node_children_watch_count    0
zk_sum_node_children_watch_count    0
zk_avg_write_batch_time_in_commit_processor 0.0
zk_min_write_batch_time_in_commit_processor 0
zk_max_write_batch_time_in_commit_processor 0
zk_cnt_write_batch_time_in_commit_processor 0
zk_sum_write_batch_time_in_commit_processor 0
zk_avg_read_commit_proc_issued  0.0
zk_min_read_commit_proc_issued  0
zk_max_read_commit_proc_issued  0
zk_cnt_read_commit_proc_issued  0
zk_sum_read_commit_proc_issued  0
zk_avg_concurrent_request_processing_in_commit_processor    0.0
zk_min_concurrent_request_processing_in_commit_processor    0
zk_max_concurrent_request_processing_in_commit_processor    0
zk_cnt_concurrent_request_processing_in_commit_processor    0
zk_sum_concurrent_request_processing_in_commit_processor    0
zk_avg_observer_sync_time   0.0
zk_min_observer_sync_time   0
zk_max_observer_sync_time   0
zk_cnt_observer_sync_time   0
zk_sum_observer_sync_time   0
zk_avg_node_changed_watch_count 0.0
zk_min_node_changed_watch_count 0
zk_max_node_changed_watch_count 0
zk_cnt_node_changed_watch_count 0
zk_sum_node_changed_watch_count 0
zk_avg_sync_process_time    0.0
zk_min_sync_process_time    0
zk_max_sync_process_time    0
zk_cnt_sync_process_time    0
zk_sum_sync_process_time    0
zk_avg_startup_snap_load_time   0.0
zk_min_startup_snap_load_time   0
zk_max_startup_snap_load_time   0
zk_cnt_startup_snap_load_time   1
zk_sum_startup_snap_load_time   0
zk_avg_prep_processor_queue_time_ms 0.0
zk_min_prep_processor_queue_time_ms 0
zk_max_prep_processor_queue_time_ms 0
zk_cnt_prep_processor_queue_time_ms 0
zk_sum_prep_processor_queue_time_ms 0
zk_p50_prep_processor_queue_time_ms 0
zk_p95_prep_processor_queue_time_ms 0
zk_p99_prep_processor_queue_time_ms 0
zk_p999_prep_processor_queue_time_ms    0
zk_avg_jvm_pause_time_ms    0.0
zk_min_jvm_pause_time_ms    0
zk_max_jvm_pause_time_ms    0
zk_cnt_jvm_pause_time_ms    0
zk_sum_jvm_pause_time_ms    0
zk_p50_jvm_pause_time_ms    0
zk_p95_jvm_pause_time_ms    0
zk_p99_jvm_pause_time_ms    0
zk_p999_jvm_pause_time_ms   0
zk_avg_close_session_prep_time  0.0
zk_min_close_session_prep_time  0
zk_max_close_session_prep_time  0
zk_cnt_close_session_prep_time  0
zk_sum_close_session_prep_time  0
zk_p50_close_session_prep_time  0
zk_p95_close_session_prep_time  0
zk_p99_close_session_prep_time  0
zk_p999_close_session_prep_time 0
zk_avg_read_commitproc_time_ms  0.0
zk_min_read_commitproc_time_ms  0
zk_max_read_commitproc_time_ms  0
zk_cnt_read_commitproc_time_ms  0
zk_sum_read_commitproc_time_ms  0
zk_p50_read_commitproc_time_ms  0
zk_p95_read_commitproc_time_ms  0
zk_p99_read_commitproc_time_ms  0
zk_p999_read_commitproc_time_ms 0
zk_avg_updatelatency    0.0
zk_min_updatelatency    0
zk_max_updatelatency    0
zk_cnt_updatelatency    0
zk_sum_updatelatency    0
zk_p50_updatelatency    0
zk_p95_updatelatency    0
zk_p99_updatelatency    0
zk_p999_updatelatency   0
zk_avg_local_write_committed_time_ms    0.0
zk_min_local_write_committed_time_ms    0
zk_max_local_write_committed_time_ms    0
zk_cnt_local_write_committed_time_ms    0
zk_sum_local_write_committed_time_ms    0
zk_p50_local_write_committed_time_ms    0
zk_p95_local_write_committed_time_ms    0
zk_p99_local_write_committed_time_ms    0
zk_p999_local_write_committed_time_ms   0
zk_avg_request_throttle_queue_time_ms   0.0
zk_min_request_throttle_queue_time_ms   0
zk_max_request_throttle_queue_time_ms   0
zk_cnt_request_throttle_queue_time_ms   0
zk_sum_request_throttle_queue_time_ms   0
zk_p50_request_throttle_queue_time_ms   0
zk_p95_request_throttle_queue_time_ms   0
zk_p99_request_throttle_queue_time_ms   0
zk_p999_request_throttle_queue_time_ms  0
zk_avg_readlatency  0.0
zk_min_readlatency  0
zk_max_readlatency  0
zk_cnt_readlatency  0
zk_sum_readlatency  0
zk_p50_readlatency  0
zk_p95_readlatency  0
zk_p99_readlatency  0
zk_p999_readlatency 0
zk_avg_quorum_ack_latency   0.0
zk_min_quorum_ack_latency   0
zk_max_quorum_ack_latency   0
zk_cnt_quorum_ack_latency   0
zk_sum_quorum_ack_latency   0
zk_p50_quorum_ack_latency   0
zk_p95_quorum_ack_latency   0
zk_p99_quorum_ack_latency   0
zk_p999_quorum_ack_latency  0
zk_avg_om_commit_process_time_ms    0.0
zk_min_om_commit_process_time_ms    0
zk_max_om_commit_process_time_ms    0
zk_cnt_om_commit_process_time_ms    0
zk_sum_om_commit_process_time_ms    0
zk_p50_om_commit_process_time_ms    0
zk_p95_om_commit_process_time_ms    0
zk_p99_om_commit_process_time_ms    0
zk_p999_om_commit_process_time_ms   0
zk_avg_read_final_proc_time_ms  0.0
zk_min_read_final_proc_time_ms  0
zk_max_read_final_proc_time_ms  0
zk_cnt_read_final_proc_time_ms  0
zk_sum_read_final_proc_time_ms  0
zk_p50_read_final_proc_time_ms  0
zk_p95_read_final_proc_time_ms  0
zk_p99_read_final_proc_time_ms  0
zk_p999_read_final_proc_time_ms 0
zk_avg_commit_propagation_latency   0.0
zk_min_commit_propagation_latency   0
zk_max_commit_propagation_latency   0
zk_cnt_commit_propagation_latency   0
zk_sum_commit_propagation_latency   0
zk_p50_commit_propagation_latency   0
zk_p95_commit_propagation_latency   0
zk_p99_commit_propagation_latency   0
zk_p999_commit_propagation_latency  0
zk_avg_dead_watchers_cleaner_latency    0.0
zk_min_dead_watchers_cleaner_latency    0
zk_max_dead_watchers_cleaner_latency    0
zk_cnt_dead_watchers_cleaner_latency    0
zk_sum_dead_watchers_cleaner_latency    0
zk_p50_dead_watchers_cleaner_latency    0
zk_p95_dead_watchers_cleaner_latency    0
zk_p99_dead_watchers_cleaner_latency    0
zk_p999_dead_watchers_cleaner_latency   0
zk_avg_write_final_proc_time_ms 0.0
zk_min_write_final_proc_time_ms 0
zk_max_write_final_proc_time_ms 0
zk_cnt_write_final_proc_time_ms 0
zk_sum_write_final_proc_time_ms 0
zk_p50_write_final_proc_time_ms 0
zk_p95_write_final_proc_time_ms 0
zk_p99_write_final_proc_time_ms 0
zk_p999_write_final_proc_time_ms    0
zk_avg_proposal_ack_creation_latency    0.0
zk_min_proposal_ack_creation_latency    0
zk_max_proposal_ack_creation_latency    0
zk_cnt_proposal_ack_creation_latency    0
zk_sum_proposal_ack_creation_latency    0
zk_p50_proposal_ack_creation_latency    0
zk_p95_proposal_ack_creation_latency    0
zk_p99_proposal_ack_creation_latency    0
zk_p999_proposal_ack_creation_latency   0
zk_avg_proposal_latency 0.0
zk_min_proposal_latency 0
zk_max_proposal_latency 0
zk_cnt_proposal_latency 0
zk_sum_proposal_latency 0
zk_p50_proposal_latency 0
zk_p95_proposal_latency 0
zk_p99_proposal_latency 0
zk_p999_proposal_latency    0
zk_avg_om_proposal_process_time_ms  0.0
zk_min_om_proposal_process_time_ms  0
zk_max_om_proposal_process_time_ms  0
zk_cnt_om_proposal_process_time_ms  0
zk_sum_om_proposal_process_time_ms  0
zk_p50_om_proposal_process_time_ms  0
zk_p95_om_proposal_process_time_ms  0
zk_p99_om_proposal_process_time_ms  0
zk_p999_om_proposal_process_time_ms 0
zk_avg_sync_processor_queue_and_flush_time_ms   0.0
zk_min_sync_processor_queue_and_flush_time_ms   0
zk_max_sync_processor_queue_and_flush_time_ms   0
zk_cnt_sync_processor_queue_and_flush_time_ms   0
zk_sum_sync_processor_queue_and_flush_time_ms   0
zk_p50_sync_processor_queue_and_flush_time_ms   0
zk_p95_sync_processor_queue_and_flush_time_ms   0
zk_p99_sync_processor_queue_and_flush_time_ms   0
zk_p999_sync_processor_queue_and_flush_time_ms  0
zk_avg_propagation_latency  0.0
zk_min_propagation_latency  0
zk_max_propagation_latency  0
zk_cnt_propagation_latency  0
zk_sum_propagation_latency  0
zk_p50_propagation_latency  0
zk_p95_propagation_latency  0
zk_p99_propagation_latency  0
zk_p999_propagation_latency 0
zk_avg_server_write_committed_time_ms   0.0
zk_min_server_write_committed_time_ms   0
zk_max_server_write_committed_time_ms   0
zk_cnt_server_write_committed_time_ms   0
zk_sum_server_write_committed_time_ms   0
zk_p50_server_write_committed_time_ms   0
zk_p95_server_write_committed_time_ms   0
zk_p99_server_write_committed_time_ms   0
zk_p999_server_write_committed_time_ms  0
zk_avg_sync_processor_queue_time_ms 0.0
zk_min_sync_processor_queue_time_ms 0
zk_max_sync_processor_queue_time_ms 0
zk_cnt_sync_processor_queue_time_ms 0
zk_sum_sync_processor_queue_time_ms 0
zk_p50_sync_processor_queue_time_ms 0
zk_p95_sync_processor_queue_time_ms 0
zk_p99_sync_processor_queue_time_ms 0
zk_p999_sync_processor_queue_time_ms    0
zk_avg_sync_processor_queue_flush_time_ms   0.0
zk_min_sync_processor_queue_flush_time_ms   0
zk_max_sync_processor_queue_flush_time_ms   0
zk_cnt_sync_processor_queue_flush_time_ms   0
zk_sum_sync_processor_queue_flush_time_ms   0
zk_p50_sync_processor_queue_flush_time_ms   0
zk_p95_sync_processor_queue_flush_time_ms   0
zk_p99_sync_processor_queue_flush_time_ms   0
zk_p999_sync_processor_queue_flush_time_ms  0
zk_avg_write_commitproc_time_ms 0.0
zk_min_write_commitproc_time_ms 0
zk_max_write_commitproc_time_ms 0
zk_cnt_write_commitproc_time_ms 0
zk_sum_write_commitproc_time_ms 0
zk_p50_write_commitproc_time_ms 0
zk_p95_write_commitproc_time_ms 0
zk_p99_write_commitproc_time_ms 0
zk_p999_write_commitproc_time_ms    0

dmytronasyrov commented 2 years ago

@powersj JVM FLAGS: -Xmx$HEAP -Xms$HEAP -XX:+AlwaysPreTouch -verbose:gc -Xlog:gc:$LOG_DIR/zookeeper_gc.log -Djute.maxbuffer=8388608 -XX:MaxGCPauseMillis=50

Other settings:

SERVERS=3
CLIENT_PORT=2181
ELECTION_PORT=3888
SERVER_PORT=2888
TICK_TIME=2000
INIT_LIMIT=5
SYNC_LIMIT=2
HEAP=384m
MAX_CLIENT_CNXNS=4096
SNAP_RETAIN_COUNT=3
PURGE_INTERVAL=1
MAX_SESSION_TIMEOUT=40000
MIN_SESSION_TIMEOUT=4000
ZOOKEEPER_SERVERS=3
LOG_LEVEL=INFO

powersj commented 2 years ago

It appears that you are running zookeeper with Prometheus metrics enabled. To do this, a user has to set the following in the zookeeper config:

metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider 

Once I added that to my docker command:

docker run --rm --net=host \
    --env ZOO_4LW_COMMANDS_WHITELIST="mntr" \
    --env ZOO_CFG_EXTRA="metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider" \
    zookeeper

Zookeeper produces the additional metrics:

$ echo mntr | nc localhost 2181
zk_version  3.8.0-5a02a05eddb59aee6ac762f7ea82e92a68eb9c0f, built on 2022-02-25 08:49 UTC
zk_server_state standalone
zk_uptime   11438.0
zk_om_proposal_process_time_ms{quantile="0.5"}  NaN
zk_om_proposal_process_time_ms{quantile="0.9"}  NaN
zk_om_proposal_process_time_ms{quantile="0.99"} NaN
zk_om_proposal_process_time_ms_count    0.0
zk_om_proposal_process_time_ms_sum  0.0
zk_node_children_watch_count{quantile="0.5"}    NaN
zk_node_children_watch_count_count  0.0
zk_node_children_watch_count_sum    0.0
zk_pending_session_queue_size{quantile="0.5"}   NaN
zk_pending_session_queue_size_count 0.0
zk_pending_session_queue_size_sum   0.0
zk_om_commit_process_time_ms{quantile="0.5"}    NaN
zk_om_commit_process_time_ms{quantile="0.9"}    NaN
zk_om_commit_process_time_ms{quantile="0.99"}   NaN
zk_om_commit_process_time_ms_count  0.0
zk_om_commit_process_time_ms_sum    0.0
zk_commit_process_time{quantile="0.5"}  NaN
zk_commit_process_time_count    0.0
zk_commit_process_time_sum  0.0
zk_jvm_memory_bytes_used{area="heap"}   3.4135272E7
zk_jvm_memory_bytes_used{area="nonheap"}    2.1014352E7
zk_jvm_memory_bytes_committed{area="heap"}  5.28482304E8
zk_jvm_memory_bytes_committed{area="nonheap"}   2.588672E7
zk_jvm_memory_bytes_max{area="heap"}    1.048576E9
zk_jvm_memory_bytes_max{area="nonheap"} -1.0
zk_jvm_memory_bytes_init{area="heap"}   5.26385152E8
zk_jvm_memory_bytes_init{area="nonheap"}    7667712.0
zk_jvm_memory_pool_bytes_used{pool="CodeHeap 'non-nmethods'"}   1834112.0
zk_jvm_memory_pool_bytes_used{pool="Metaspace"} 1.3287928E7
zk_jvm_memory_pool_bytes_used{pool="CodeHeap 'profiled nmethods'"}  3673344.0
zk_jvm_memory_pool_bytes_used{pool="Compressed Class Space"}    1637336.0
zk_jvm_memory_pool_bytes_used{pool="G1 Eden Space"} 2.9360128E7
zk_jvm_memory_pool_bytes_used{pool="G1 Old Gen"}    580840.0
zk_jvm_memory_pool_bytes_used{pool="G1 Survivor Space"} 4194304.0
zk_jvm_memory_pool_bytes_used{pool="CodeHeap 'non-profiled nmethods'"}  581632.0
zk_jvm_memory_pool_bytes_committed{pool="CodeHeap 'non-nmethods'"}  3604480.0
zk_jvm_memory_pool_bytes_committed{pool="Metaspace"}    1.4024704E7
zk_jvm_memory_pool_bytes_committed{pool="CodeHeap 'profiled nmethods'"} 3735552.0
zk_jvm_memory_pool_bytes_committed{pool="Compressed Class Space"}   1966080.0
zk_jvm_memory_pool_bytes_committed{pool="G1 Eden Space"}    3.28204288E8
zk_jvm_memory_pool_bytes_committed{pool="G1 Old Gen"}   1.96083712E8
zk_jvm_memory_pool_bytes_committed{pool="G1 Survivor Space"}    4194304.0
zk_jvm_memory_pool_bytes_committed{pool="CodeHeap 'non-profiled nmethods'"} 2555904.0
zk_jvm_memory_pool_bytes_max{pool="CodeHeap 'non-nmethods'"}    8183808.0
zk_jvm_memory_pool_bytes_max{pool="Metaspace"}  -1.0
zk_jvm_memory_pool_bytes_max{pool="CodeHeap 'profiled nmethods'"}   1.21737216E8
zk_jvm_memory_pool_bytes_max{pool="Compressed Class Space"} 1.073741824E9
zk_jvm_memory_pool_bytes_max{pool="G1 Eden Space"}  -1.0
zk_jvm_memory_pool_bytes_max{pool="G1 Old Gen"} 1.048576E9
zk_jvm_memory_pool_bytes_max{pool="G1 Survivor Space"}  -1.0
zk_jvm_memory_pool_bytes_max{pool="CodeHeap 'non-profiled nmethods'"}   1.21737216E8
zk_jvm_memory_pool_bytes_init{pool="CodeHeap 'non-nmethods'"}   2555904.0
zk_jvm_memory_pool_bytes_init{pool="Metaspace"} 0.0
zk_jvm_memory_pool_bytes_init{pool="CodeHeap 'profiled nmethods'"}  2555904.0
zk_jvm_memory_pool_bytes_init{pool="Compressed Class Space"}    0.0
zk_jvm_memory_pool_bytes_init{pool="G1 Eden Space"} 2.8311552E7
zk_jvm_memory_pool_bytes_init{pool="G1 Old Gen"}    4.980736E8
zk_jvm_memory_pool_bytes_init{pool="G1 Survivor Space"} 0.0
zk_jvm_memory_pool_bytes_init{pool="CodeHeap 'non-profiled nmethods'"}  2555904.0
zk_connection_request_count 0.0
zk_connection_drop_count    0.0
zk_process_cpu_seconds_total    1.29
zk_process_start_time_seconds   1.659727338175E9
zk_process_open_fds 139.0
zk_process_max_fds  1048576.0
zk_process_virtual_memory_bytes 8.327602176E9
zk_process_resident_memory_bytes    1.14618368E8
zk_response_packet_get_children_cache_hits  0.0
zk_unsuccessful_handshake   0.0
zk_propagation_latency{quantile="0.5"}  NaN
zk_propagation_latency{quantile="0.9"}  NaN
zk_propagation_latency{quantile="0.99"} NaN
zk_propagation_latency_count    0.0
zk_propagation_latency_sum  0.0
zk_stale_sessions_expired   0.0
zk_connection_token_deficit{quantile="0.5"} NaN
zk_connection_token_deficit_count   0.0
zk_connection_token_deficit_sum 0.0
zk_bytes_received_count 8.0
zk_response_packet_cache_misses 0.0
zk_looking_count    0.0
zk_prep_process_time{quantile="0.5"}    NaN
zk_prep_process_time_count  0.0
zk_prep_process_time_sum    0.0
zk_commit_propagation_latency{quantile="0.5"}   NaN
zk_commit_propagation_latency{quantile="0.9"}   NaN
zk_commit_propagation_latency{quantile="0.99"}  NaN
zk_commit_propagation_latency_count 0.0
zk_commit_propagation_latency_sum   0.0
zk_znode_count  5.0
zk_dbinittime{quantile="0.5"}   4.0
zk_dbinittime_count 1.0
zk_dbinittime_sum   4.0
zk_jvm_gc_collection_seconds_count{gc="G1 Young Generation"}    1.0
zk_jvm_gc_collection_seconds_sum{gc="G1 Young Generation"}  0.005
zk_jvm_gc_collection_seconds_count{gc="G1 Old Generation"}  0.0
zk_jvm_gc_collection_seconds_sum{gc="G1 Old Generation"}    0.0
zk_concurrent_request_processing_in_commit_processor{quantile="0.5"}    NaN
zk_concurrent_request_processing_in_commit_processor_count  0.0
zk_concurrent_request_processing_in_commit_processor_sum    0.0
zk_snap_count   0.0
zk_dead_watchers_cleared    0.0
zk_connection_rejected  0.0
zk_prep_processor_queue_time_ms{quantile="0.5"} NaN
zk_prep_processor_queue_time_ms{quantile="0.9"} NaN
zk_prep_processor_queue_time_ms{quantile="0.99"}    NaN
zk_prep_processor_queue_time_ms_count   0.0
zk_prep_processor_queue_time_ms_sum 0.0
zk_node_created_watch_count{quantile="0.5"} NaN
zk_node_created_watch_count_count   0.0
zk_node_created_watch_count_sum 0.0
zk_sync_processor_queue_size{quantile="0.5"}    0.0
zk_sync_processor_queue_size_count  1.0
zk_sync_processor_queue_size_sum    0.0
zk_follower_sync_time{quantile="0.5"}   NaN
zk_follower_sync_time_count 0.0
zk_follower_sync_time_sum   0.0
zk_revalidate_count 0.0
zk_sync_processor_queue_and_flush_time_ms{quantile="0.5"}   NaN
zk_sync_processor_queue_and_flush_time_ms{quantile="0.9"}   NaN
zk_sync_processor_queue_and_flush_time_ms{quantile="0.99"}  NaN
zk_sync_processor_queue_and_flush_time_ms_count 0.0
zk_sync_processor_queue_and_flush_time_ms_sum   0.0
zk_outstanding_changes_queued   0.0
zk_dead_watchers_queued 0.0
zk_node_changed_watch_count{quantile="0.5"} NaN
zk_node_changed_watch_count_count   0.0
zk_node_changed_watch_count_sum 0.0
zk_requests_not_forwarded_to_commit_processor   0.0
zk_watch_count  0.0
zk_last_client_response_size    -1.0
zk_read_commit_proc_issued{quantile="0.5"}  NaN
zk_read_commit_proc_issued_count    0.0
zk_read_commit_proc_issued_sum  0.0
zk_proposal_latency{quantile="0.5"} NaN
zk_proposal_latency{quantile="0.9"} NaN
zk_proposal_latency{quantile="0.99"}    NaN
zk_proposal_latency_count   0.0
zk_proposal_latency_sum 0.0
zk_readlatency{quantile="0.5"}  NaN
zk_readlatency{quantile="0.9"}  NaN
zk_readlatency{quantile="0.99"} NaN
zk_readlatency_count    0.0
zk_readlatency_sum  0.0
zk_unavailable_time{quantile="0.5"} NaN
zk_unavailable_time_count   0.0
zk_unavailable_time_sum 0.0
zk_sync_processor_batch_size{quantile="0.5"}    NaN
zk_sync_processor_batch_size_count  0.0
zk_sync_processor_batch_size_sum    0.0
zk_packets_sent 3.0
zk_max_latency  0.0
zk_stale_requests_dropped   0.0
zk_request_throttle_wait_count  0.0
zk_learner_proposal_received_count  0.0
zk_write_commit_proc_issued{quantile="0.5"} NaN
zk_write_commit_proc_issued_count   0.0
zk_write_commit_proc_issued_sum 0.0
zk_ensemble_auth_skip   0.0
zk_stale_requests   0.0
zk_startup_txns_loaded{quantile="0.5"}  NaN
zk_startup_txns_loaded_count    0.0
zk_startup_txns_loaded_sum  0.0
zk_server_write_committed_time_ms{quantile="0.5"}   NaN
zk_server_write_committed_time_ms{quantile="0.9"}   NaN
zk_server_write_committed_time_ms{quantile="0.99"}  NaN
zk_server_write_committed_time_ms_count 0.0
zk_server_write_committed_time_ms_sum   0.0
zk_proposal_count   0.0
zk_proposal_ack_creation_latency{quantile="0.5"}    NaN
zk_proposal_ack_creation_latency{quantile="0.9"}    NaN
zk_proposal_ack_creation_latency{quantile="0.99"}   NaN
zk_proposal_ack_creation_latency_count  0.0
zk_proposal_ack_creation_latency_sum    0.0
zk_startup_snap_load_time{quantile="0.5"}   1.0
zk_startup_snap_load_time_count 1.0
zk_startup_snap_load_time_sum   1.0
zk_write_batch_time_in_commit_processor{quantile="0.5"} NaN
zk_write_batch_time_in_commit_processor_count   0.0
zk_write_batch_time_in_commit_processor_sum 0.0
zk_ephemerals_count 0.0
zk_jvm_threads_current  53.0
zk_jvm_threads_daemon   14.0
zk_jvm_threads_peak 53.0
zk_jvm_threads_started_total    53.0
zk_jvm_threads_deadlocked   0.0
zk_jvm_threads_deadlocked_monitor   0.0
zk_jvm_threads_state{state="TERMINATED"}    0.0
zk_jvm_threads_state{state="NEW"}   0.0
zk_jvm_threads_state{state="WAITING"}   13.0
zk_jvm_threads_state{state="BLOCKED"}   0.0
zk_jvm_threads_state{state="TIMED_WAITING"} 5.0
zk_jvm_threads_state{state="RUNNABLE"}  35.0
zk_sync_processor_request_queued    0.0
zk_prep_processor_queue_size{quantile="0.5"}    0.0
zk_prep_processor_queue_size_count  1.0
zk_prep_processor_queue_size_sum    0.0
zk_time_waiting_empty_pool_in_commit_processor_read_ms{quantile="0.5"}  NaN
zk_time_waiting_empty_pool_in_commit_processor_read_ms_count    0.0
zk_time_waiting_empty_pool_in_commit_processor_read_ms_sum  0.0
zk_session_queues_drained{quantile="0.5"}   NaN
zk_session_queues_drained_count 0.0
zk_session_queues_drained_sum   0.0
zk_sync_process_time{quantile="0.5"}    NaN
zk_sync_process_time_count  0.0
zk_sync_process_time_sum    0.0
zk_large_requests_rejected  0.0
zk_outstanding_requests 0.0
zk_max_file_descriptor_count    1048576.0
zk_request_commit_queued    0.0
zk_connection_revalidate_count  0.0
zk_local_write_committed_time_ms{quantile="0.5"}    NaN
zk_local_write_committed_time_ms{quantile="0.9"}    NaN
zk_local_write_committed_time_ms{quantile="0.99"}   NaN
zk_local_write_committed_time_ms_count  0.0
zk_local_write_committed_time_ms_sum    0.0
zk_outstanding_changes_removed  0.0
zk_requests_in_session_queue{quantile="0.5"}    NaN
zk_requests_in_session_queue_count  0.0
zk_requests_in_session_queue_sum    0.0
zk_fsynctime{quantile="0.5"}    NaN
zk_fsynctime_count  0.0
zk_fsynctime_sum    0.0
zk_sync_processor_queue_flush_time_ms{quantile="0.5"}   NaN
zk_sync_processor_queue_flush_time_ms{quantile="0.9"}   NaN
zk_sync_processor_queue_flush_time_ms{quantile="0.99"}  NaN
zk_sync_processor_queue_flush_time_ms_count 0.0
zk_sync_processor_queue_flush_time_ms_sum   0.0
zk_response_bytes   0.0
zk_stale_replies    0.0
zk_insecure_admin_count 0.0
zk_write_final_proc_time_ms{quantile="0.5"} NaN
zk_write_final_proc_time_ms{quantile="0.9"} NaN
zk_write_final_proc_time_ms{quantile="0.99"}    NaN
zk_write_final_proc_time_ms_count   0.0
zk_write_final_proc_time_ms_sum 0.0
zk_cnxn_closed_without_zk_server_running    0.0
zk_reads_after_write_in_session_queue{quantile="0.5"}   NaN
zk_reads_after_write_in_session_queue_count 0.0
zk_reads_after_write_in_session_queue_sum   0.0
zk_commit_commit_proc_req_queued{quantile="0.5"}    NaN
zk_commit_commit_proc_req_queued_count  0.0
zk_commit_commit_proc_req_queued_sum    0.0
zk_min_client_response_size -1.0
zk_max_client_response_size -1.0
zk_connection_drop_probability  0.0
zk_sessionless_connections_expired  0.0
zk_non_mtls_local_conn_count    0.0
zk_close_session_prep_time{quantile="0.5"}  NaN
zk_close_session_prep_time{quantile="0.9"}  NaN
zk_close_session_prep_time{quantile="0.99"} NaN
zk_close_session_prep_time_count    0.0
zk_close_session_prep_time_sum  0.0
zk_leader_unavailable_time{quantile="0.5"}  NaN
zk_leader_unavailable_time_count    0.0
zk_leader_unavailable_time_sum  0.0
zk_watch_bytes  0.0
zk_jvm_memory_pool_allocated_bytes_total{pool="CodeHeap 'profiled nmethods'"}   2883584.0
zk_jvm_memory_pool_allocated_bytes_total{pool="G1 Old Gen"} 580840.0
zk_jvm_memory_pool_allocated_bytes_total{pool="G1 Eden Space"}  2.62144E7
zk_jvm_memory_pool_allocated_bytes_total{pool="CodeHeap 'non-profiled nmethods'"}   451712.0
zk_jvm_memory_pool_allocated_bytes_total{pool="G1 Survivor Space"}  4194304.0
zk_jvm_memory_pool_allocated_bytes_total{pool="Compressed Class Space"} 1242144.0
zk_jvm_memory_pool_allocated_bytes_total{pool="Metaspace"}  1.00932E7
zk_jvm_memory_pool_allocated_bytes_total{pool="CodeHeap 'non-nmethods'"}    3538816.0
zk_write_commit_proc_req_queued{quantile="0.5"} NaN
zk_write_commit_proc_req_queued_count   0.0
zk_write_commit_proc_req_queued_sum 0.0
zk_election_time{quantile="0.5"}    NaN
zk_election_time_count  0.0
zk_election_time_sum    0.0
zk_unrecoverable_error_count    0.0
zk_commit_count 0.0
zk_node_deleted_watch_count{quantile="0.5"} NaN
zk_node_deleted_watch_count_count   0.0
zk_node_deleted_watch_count_sum 0.0
zk_outstanding_tls_handshake    0.0
zk_global_sessions  0.0
zk_response_packet_cache_hits   0.0
zk_quit_leading_due_to_disloyal_voter   0.0
zk_non_mtls_remote_conn_count   0.0
zk_digest_mismatches_count  0.0
zk_learner_request_processor_queue_size{quantile="0.5"} NaN
zk_learner_request_processor_queue_size_count   0.0
zk_learner_request_processor_queue_size_sum 0.0
zk_jvm_pause_time_ms{quantile="0.5"}    NaN
zk_jvm_pause_time_ms{quantile="0.9"}    NaN
zk_jvm_pause_time_ms{quantile="0.99"}   NaN
zk_jvm_pause_time_ms_count  0.0
zk_jvm_pause_time_ms_sum    0.0
zk_reads_issued_from_session_queue{quantile="0.5"}  NaN
zk_reads_issued_from_session_queue_count    0.0
zk_reads_issued_from_session_queue_sum  0.0
zk_min_latency  0.0
zk_jvm_info{version="11.0.15+10",vendor="Oracle Corporation",runtime="OpenJDK Runtime Environment"} 1.0
zk_throttled_ops    0.0
zk_inflight_snap_count{quantile="0.5"}  NaN
zk_inflight_snap_count_count    0.0
zk_inflight_snap_count_sum  0.0
zk_observer_sync_time{quantile="0.5"}   NaN
zk_observer_sync_time_count 0.0
zk_observer_sync_time_sum   0.0
zk_learner_commit_received_count    0.0
zk_approximate_data_size    44.0
zk_write_commitproc_time_ms{quantile="0.5"} NaN
zk_write_commitproc_time_ms{quantile="0.9"} NaN
zk_write_commitproc_time_ms{quantile="0.99"}    NaN
zk_write_commitproc_time_ms_count   0.0
zk_write_commitproc_time_ms_sum 0.0
zk_netty_queued_buffer_capacity{quantile="0.5"} NaN
zk_netty_queued_buffer_capacity_count   0.0
zk_netty_queued_buffer_capacity_sum 0.0
zk_packets_received 2.0
zk_tls_handshake_exceeded   0.0
zk_num_alive_connections    1.0
zk_updatelatency{quantile="0.5"}    NaN
zk_updatelatency{quantile="0.9"}    NaN
zk_updatelatency{quantile="0.99"}   NaN
zk_updatelatency_count  0.0
zk_updatelatency_sum    0.0
zk_add_dead_watcher_stall_time  0.0
zk_ensemble_auth_success    0.0
zk_avg_latency  0.0
zk_startup_txns_load_time{quantile="0.5"}   NaN
zk_startup_txns_load_time_count 0.0
zk_startup_txns_load_time_sum   0.0
zk_request_throttle_queue_time_ms{quantile="0.5"}   NaN
zk_request_throttle_queue_time_ms{quantile="0.9"}   NaN
zk_request_throttle_queue_time_ms{quantile="0.99"}  NaN
zk_request_throttle_queue_time_ms_count 0.0
zk_request_throttle_queue_time_ms_sum   0.0
zk_dead_watchers_cleaner_latency{quantile="0.5"}    NaN
zk_dead_watchers_cleaner_latency{quantile="0.9"}    NaN
zk_dead_watchers_cleaner_latency{quantile="0.99"}   NaN
zk_dead_watchers_cleaner_latency_count  0.0
zk_dead_watchers_cleaner_latency_sum    0.0
zk_response_packet_get_children_cache_misses    0.0
zk_auth_failed_count    0.0
zk_open_file_descriptor_count   139.0
zk_read_final_proc_time_ms{quantile="0.5"}  NaN
zk_read_final_proc_time_ms{quantile="0.9"}  NaN
zk_read_final_proc_time_ms{quantile="0.99"} NaN
zk_read_final_proc_time_ms_count    0.0
zk_read_final_proc_time_ms_sum  0.0
zk_diff_count   0.0
zk_jvm_classes_loaded   3408.0
zk_jvm_classes_loaded_total 3408.0
zk_jvm_classes_unloaded_total   0.0
zk_prep_processor_request_queued    0.0
zk_inflight_diff_count{quantile="0.5"}  NaN
zk_inflight_diff_count_count    0.0
zk_inflight_diff_count_sum  0.0
zk_snapshottime{quantile="0.5"} 0.0
zk_snapshottime_count   1.0
zk_snapshottime_sum 0.0
zk_quorum_ack_latency{quantile="0.5"}   NaN
zk_quorum_ack_latency{quantile="0.9"}   NaN
zk_quorum_ack_latency{quantile="0.99"}  NaN
zk_quorum_ack_latency_count 0.0
zk_quorum_ack_latency_sum   0.0
zk_proposal_process_time{quantile="0.5"}    NaN
zk_proposal_process_time_count  0.0
zk_proposal_process_time_sum    0.0
zk_socket_closing_time{quantile="0.5"}  NaN
zk_socket_closing_time_count    0.0
zk_socket_closing_time_sum  0.0
zk_local_sessions   0.0
zk_read_commit_proc_req_queued{quantile="0.5"}  NaN
zk_read_commit_proc_req_queued_count    0.0
zk_read_commit_proc_req_queued_sum  0.0
zk_read_commitproc_time_ms{quantile="0.5"}  NaN
zk_read_commitproc_time_ms{quantile="0.9"}  NaN
zk_read_commitproc_time_ms{quantile="0.99"} NaN
zk_read_commitproc_time_ms_count    0.0
zk_read_commitproc_time_ms_sum  0.0
zk_sync_processor_queue_time_ms{quantile="0.5"} NaN
zk_sync_processor_queue_time_ms{quantile="0.9"} NaN
zk_sync_processor_queue_time_ms{quantile="0.99"}    NaN
zk_sync_processor_queue_time_ms_count   0.0
zk_sync_processor_queue_time_ms_sum 0.0
zk_jvm_buffer_pool_used_bytes{pool="mapped"}    0.0
zk_jvm_buffer_pool_used_bytes{pool="direct"}    24577.0
zk_jvm_buffer_pool_capacity_bytes{pool="mapped"}    0.0
zk_jvm_buffer_pool_capacity_bytes{pool="direct"}    24577.0
zk_jvm_buffer_pool_used_buffers{pool="mapped"}  0.0
zk_jvm_buffer_pool_used_buffers{pool="direct"}  4.0
zk_ensemble_auth_fail   0.0
zk_skip_learner_request_to_next_processor_count 0.0

I'll need to confirm with the team about the format, but using the Prometheus parser, one of the first metrics does not parse very well since it is a string value.

The rest look like this:

zookeeper om_commit_process_time_ms_count=0 1659730360000000000
zookeeper om_commit_process_time_ms_sum=0 1659730360000000000
zookeeper add_dead_watcher_stall_time=0 1659730360000000000
zookeeper reads_after_write_in_session_queue_count=0 1659730360000000000
zookeeper reads_after_write_in_session_queue_sum=0 1659730360000000000
zookeeper read_final_proc_time_ms_count=0 1659730360000000000
zookeeper read_final_proc_time_ms_sum=0 1659730360000000000
zookeeper om_proposal_process_time_ms_count=0 1659730360000000000
zookeeper om_proposal_process_time_ms_sum=0 1659730360000000000
zookeeper,area=heap jvm_memory_bytes_used=48791112 1659730360000000000
zookeeper,area=nonheap jvm_memory_bytes_used=21065976 1659730360000000000
zookeeper,area=heap jvm_memory_bytes_committed=528482304 1659730360000000000
zookeeper,area=nonheap jvm_memory_bytes_committed=26476544 1659730360000000000
zookeeper,area=heap jvm_memory_bytes_max=1048576000 1659730360000000000
zookeeper,area=nonheap jvm_memory_bytes_max=-1 1659730360000000000
zookeeper,area=heap jvm_memory_bytes_init=526385152 1659730360000000000
zookeeper,area=nonheap jvm_memory_bytes_init=7667712 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-nmethods' jvm_memory_pool_bytes_used=1258496 1659730360000000000
zookeeper,pool=Metaspace jvm_memory_pool_bytes_used=13441184 1659730360000000000
zookeeper,pool=CodeHeap\ 'profiled\ nmethods' jvm_memory_pool_bytes_used=4060416 1659730360000000000
zookeeper,pool=Compressed\ Class\ Space jvm_memory_pool_bytes_used=1642200 1659730360000000000
zookeeper,pool=G1\ Eden\ Space jvm_memory_pool_bytes_used=44040192 1659730360000000000
zookeeper,pool=G1\ Old\ Gen jvm_memory_pool_bytes_used=556616 1659730360000000000
zookeeper,pool=G1\ Survivor\ Space jvm_memory_pool_bytes_used=4194304 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-profiled\ nmethods' jvm_memory_pool_bytes_used=663680 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-nmethods' jvm_memory_pool_bytes_committed=3604480 1659730360000000000
zookeeper,pool=Metaspace jvm_memory_pool_bytes_committed=14286848 1659730360000000000
zookeeper,pool=CodeHeap\ 'profiled\ nmethods' jvm_memory_pool_bytes_committed=4063232 1659730360000000000
zookeeper,pool=Compressed\ Class\ Space jvm_memory_pool_bytes_committed=1966080 1659730360000000000
zookeeper,pool=G1\ Eden\ Space jvm_memory_pool_bytes_committed=328204288 1659730360000000000
zookeeper,pool=G1\ Old\ Gen jvm_memory_pool_bytes_committed=196083712 1659730360000000000
zookeeper,pool=G1\ Survivor\ Space jvm_memory_pool_bytes_committed=4194304 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-profiled\ nmethods' jvm_memory_pool_bytes_committed=2555904 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-nmethods' jvm_memory_pool_bytes_max=8183808 1659730360000000000
zookeeper,pool=Metaspace jvm_memory_pool_bytes_max=-1 1659730360000000000
zookeeper,pool=CodeHeap\ 'profiled\ nmethods' jvm_memory_pool_bytes_max=121737216 1659730360000000000
zookeeper,pool=Compressed\ Class\ Space jvm_memory_pool_bytes_max=1073741824 1659730360000000000
zookeeper,pool=G1\ Eden\ Space jvm_memory_pool_bytes_max=-1 1659730360000000000
zookeeper,pool=G1\ Old\ Gen jvm_memory_pool_bytes_max=1048576000 1659730360000000000
zookeeper,pool=G1\ Survivor\ Space jvm_memory_pool_bytes_max=-1 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-profiled\ nmethods' jvm_memory_pool_bytes_max=121737216 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-nmethods' jvm_memory_pool_bytes_init=2555904 1659730360000000000
zookeeper,pool=Metaspace jvm_memory_pool_bytes_init=0 1659730360000000000
zookeeper,pool=CodeHeap\ 'profiled\ nmethods' jvm_memory_pool_bytes_init=2555904 1659730360000000000
zookeeper,pool=Compressed\ Class\ Space jvm_memory_pool_bytes_init=0 1659730360000000000
zookeeper,pool=G1\ Eden\ Space jvm_memory_pool_bytes_init=28311552 1659730360000000000
zookeeper,pool=G1\ Old\ Gen jvm_memory_pool_bytes_init=498073600 1659730360000000000
zookeeper,pool=G1\ Survivor\ Space jvm_memory_pool_bytes_init=0 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-profiled\ nmethods' jvm_memory_pool_bytes_init=2555904 1659730360000000000
zookeeper connection_token_deficit_count=0 1659730360000000000
zookeeper connection_token_deficit_sum=0 1659730360000000000
zookeeper connection_revalidate_count=0 1659730360000000000
zookeeper startup_txns_loaded_count=0 1659730360000000000
zookeeper startup_txns_loaded_sum=0 1659730360000000000
zookeeper process_cpu_seconds_total=4.42 1659730360000000000
zookeeper process_start_time_seconds=1659727615.542 1659730360000000000
zookeeper process_open_fds=141 1659730360000000000
zookeeper process_max_fds=1048576 1659730360000000000
zookeeper process_virtual_memory_bytes=9009217536 1659730360000000000
zookeeper process_resident_memory_bytes=136749056 1659730360000000000
zookeeper non_mtls_local_conn_count=0 1659730360000000000
zookeeper response_packet_get_children_cache_misses=0 1659730360000000000
zookeeper insecure_admin_count=0 1659730360000000000
zookeeper follower_sync_time_count=0 1659730360000000000
zookeeper follower_sync_time_sum=0 1659730360000000000
zookeeper unavailable_time_count=0 1659730360000000000
zookeeper unavailable_time_sum=0 1659730360000000000
zookeeper connection_rejected=0 1659730360000000000
zookeeper unrecoverable_error_count=0 1659730360000000000
zookeeper local_sessions=0 1659730360000000000
zookeeper response_packet_get_children_cache_hits=0 1659730360000000000
zookeeper diff_count=0 1659730360000000000
zookeeper proposal_process_time_count=0 1659730360000000000
zookeeper proposal_process_time_sum=0 1659730360000000000
zookeeper learner_proposal_received_count=0 1659730360000000000
zookeeper,gc=G1\ Young\ Generation jvm_gc_collection_seconds_count=1 1659730360000000000
zookeeper,gc=G1\ Young\ Generation jvm_gc_collection_seconds_sum=0.006 1659730360000000000
zookeeper,gc=G1\ Old\ Generation jvm_gc_collection_seconds_count=0 1659730360000000000
zookeeper,gc=G1\ Old\ Generation jvm_gc_collection_seconds_sum=0 1659730360000000000
zookeeper read_commit_proc_req_queued_count=0 1659730360000000000
zookeeper read_commit_proc_req_queued_sum=0 1659730360000000000
zookeeper outstanding_requests=0 1659730360000000000
zookeeper num_alive_connections=1 1659730360000000000
zookeeper commit_count=0 1659730360000000000
zookeeper dead_watchers_cleaner_latency_count=0 1659730360000000000
zookeeper dead_watchers_cleaner_latency_sum=0 1659730360000000000
zookeeper inflight_snap_count_count=0 1659730360000000000
zookeeper inflight_snap_count_sum=0 1659730360000000000
zookeeper prep_processor_queue_size_count=1 1659730360000000000
zookeeper prep_processor_queue_size_sum=0 1659730360000000000
zookeeper znode_count=5 1659730360000000000
zookeeper node_deleted_watch_count_count=0 1659730360000000000
zookeeper node_deleted_watch_count_sum=0 1659730360000000000
zookeeper sync_processor_request_queued=0 1659730360000000000
zookeeper observer_sync_time_count=0 1659730360000000000
zookeeper observer_sync_time_sum=0 1659730360000000000
zookeeper connection_drop_count=0 1659730360000000000
zookeeper avg_latency=0 1659730360000000000
zookeeper sync_processor_queue_size_count=1 1659730360000000000
zookeeper sync_processor_queue_size_sum=0 1659730360000000000
zookeeper outstanding_changes_removed=0 1659730360000000000
zookeeper dead_watchers_cleared=0 1659730360000000000
zookeeper node_children_watch_count_count=0 1659730360000000000
zookeeper node_children_watch_count_sum=0 1659730360000000000
zookeeper response_bytes=0 1659730360000000000
zookeeper readlatency_count=0 1659730360000000000
zookeeper readlatency_sum=0 1659730360000000000
zookeeper write_commit_proc_issued_count=0 1659730360000000000
zookeeper write_commit_proc_issued_sum=0 1659730360000000000
zookeeper watch_count=0 1659730360000000000
zookeeper proposal_ack_creation_latency_count=0 1659730360000000000
zookeeper proposal_ack_creation_latency_sum=0 1659730360000000000
zookeeper outstanding_tls_handshake=0 1659730360000000000
zookeeper updatelatency_count=0 1659730360000000000
zookeeper updatelatency_sum=0 1659730360000000000
zookeeper fsynctime_count=0 1659730360000000000
zookeeper fsynctime_sum=0 1659730360000000000
zookeeper leader_unavailable_time_count=0 1659730360000000000
zookeeper leader_unavailable_time_sum=0 1659730360000000000
zookeeper quorum_ack_latency_count=0 1659730360000000000
zookeeper quorum_ack_latency_sum=0 1659730360000000000
zookeeper stale_replies=0 1659730360000000000
zookeeper large_requests_rejected=0 1659730360000000000
zookeeper learner_commit_received_count=0 1659730360000000000
zookeeper throttled_ops=0 1659730360000000000
zookeeper non_mtls_remote_conn_count=0 1659730360000000000
zookeeper prep_processor_queue_time_ms_count=0 1659730360000000000
zookeeper prep_processor_queue_time_ms_sum=0 1659730360000000000
zookeeper stale_requests_dropped=0 1659730360000000000
zookeeper startup_txns_load_time_count=0 1659730360000000000
zookeeper startup_txns_load_time_sum=0 1659730360000000000
zookeeper commit_process_time_count=0 1659730360000000000
zookeeper commit_process_time_sum=0 1659730360000000000
zookeeper quit_leading_due_to_disloyal_voter=0 1659730360000000000
zookeeper commit_propagation_latency_count=0 1659730360000000000
zookeeper commit_propagation_latency_sum=0 1659730360000000000
zookeeper sync_processor_queue_and_flush_time_ms_count=0 1659730360000000000
zookeeper sync_processor_queue_and_flush_time_ms_sum=0 1659730360000000000
zookeeper proposal_count=0 1659730360000000000
zookeeper concurrent_request_processing_in_commit_processor_count=0 1659730360000000000
zookeeper concurrent_request_processing_in_commit_processor_sum=0 1659730360000000000
zookeeper jvm_threads_current=65 1659730360000000000
zookeeper jvm_threads_daemon=26 1659730360000000000
zookeeper jvm_threads_peak=65 1659730360000000000
zookeeper jvm_threads_started_total=65 1659730360000000000
zookeeper jvm_threads_deadlocked=0 1659730360000000000
zookeeper jvm_threads_deadlocked_monitor=0 1659730360000000000
zookeeper,state=TERMINATED jvm_threads_state=0 1659730360000000000
zookeeper,state=NEW jvm_threads_state=0 1659730360000000000
zookeeper,state=WAITING jvm_threads_state=25 1659730360000000000
zookeeper,state=BLOCKED jvm_threads_state=0 1659730360000000000
zookeeper,state=TIMED_WAITING jvm_threads_state=5 1659730360000000000
zookeeper,state=RUNNABLE jvm_threads_state=35 1659730360000000000
zookeeper sync_processor_queue_time_ms_count=0 1659730360000000000
zookeeper sync_processor_queue_time_ms_sum=0 1659730360000000000
zookeeper prep_processor_request_queued=0 1659730360000000000
zookeeper write_batch_time_in_commit_processor_count=0 1659730360000000000
zookeeper write_batch_time_in_commit_processor_sum=0 1659730360000000000
zookeeper time_waiting_empty_pool_in_commit_processor_read_ms_count=0 1659730360000000000
zookeeper time_waiting_empty_pool_in_commit_processor_read_ms_sum=0 1659730360000000000
zookeeper sync_processor_batch_size_count=0 1659730360000000000
zookeeper sync_processor_batch_size_sum=0 1659730360000000000
zookeeper bytes_received_count=56 1659730360000000000
zookeeper netty_queued_buffer_capacity_count=0 1659730360000000000
zookeeper netty_queued_buffer_capacity_sum=0 1659730360000000000
zookeeper read_commit_proc_issued_count=0 1659730360000000000
zookeeper read_commit_proc_issued_sum=0 1659730360000000000
zookeeper max_file_descriptor_count=1048576 1659730360000000000
zookeeper sessionless_connections_expired=0 1659730360000000000
zookeeper server_write_committed_time_ms_count=0 1659730360000000000
zookeeper server_write_committed_time_ms_sum=0 1659730360000000000
zookeeper prep_process_time_count=0 1659730360000000000
zookeeper prep_process_time_sum=0 1659730360000000000
zookeeper pending_session_queue_size_count=0 1659730360000000000
zookeeper pending_session_queue_size_sum=0 1659730360000000000
zookeeper snapshottime_count=1 1659730360000000000
zookeeper snapshottime_sum=0 1659730360000000000
zookeeper sync_process_time_count=0 1659730360000000000
zookeeper sync_process_time_sum=0 1659730360000000000
zookeeper watch_bytes=0 1659730360000000000
zookeeper request_throttle_queue_time_ms_count=0 1659730360000000000
zookeeper request_throttle_queue_time_ms_sum=0 1659730360000000000
zookeeper tls_handshake_exceeded=0 1659730360000000000
zookeeper auth_failed_count=0 1659730360000000000
zookeeper proposal_latency_count=0 1659730360000000000
zookeeper proposal_latency_sum=0 1659730360000000000
zookeeper skip_learner_request_to_next_processor_count=0 1659730360000000000
zookeeper reads_issued_from_session_queue_count=0 1659730360000000000
zookeeper reads_issued_from_session_queue_sum=0 1659730360000000000
zookeeper request_commit_queued=0 1659730360000000000
zookeeper min_client_response_size=-1 1659730360000000000
zookeeper stale_sessions_expired=0 1659730360000000000
zookeeper revalidate_count=0 1659730360000000000
zookeeper requests_in_session_queue_count=0 1659730360000000000
zookeeper requests_in_session_queue_sum=0 1659730360000000000
zookeeper jvm_pause_time_ms_count=0 1659730360000000000
zookeeper jvm_pause_time_ms_sum=0 1659730360000000000
zookeeper,pool=CodeHeap\ 'profiled\ nmethods' jvm_memory_pool_allocated_bytes_total=2896640 1659730360000000000
zookeeper,pool=G1\ Old\ Gen jvm_memory_pool_allocated_bytes_total=556616 1659730360000000000
zookeeper,pool=G1\ Eden\ Space jvm_memory_pool_allocated_bytes_total=26214400 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-profiled\ nmethods' jvm_memory_pool_allocated_bytes_total=466432 1659730360000000000
zookeeper,pool=G1\ Survivor\ Space jvm_memory_pool_allocated_bytes_total=4194304 1659730360000000000
zookeeper,pool=Compressed\ Class\ Space jvm_memory_pool_allocated_bytes_total=1243360 1659730360000000000
zookeeper,pool=Metaspace jvm_memory_pool_allocated_bytes_total=10110176 1659730360000000000
zookeeper,pool=CodeHeap\ 'non-nmethods' jvm_memory_pool_allocated_bytes_total=3540864 1659730360000000000
zookeeper commit_commit_proc_req_queued_count=0 1659730360000000000
zookeeper commit_commit_proc_req_queued_sum=0 1659730360000000000
zookeeper node_created_watch_count_count=0 1659730360000000000
zookeeper node_created_watch_count_sum=0 1659730360000000000
zookeeper connection_request_count=0 1659730360000000000
zookeeper node_changed_watch_count_count=0 1659730360000000000
zookeeper node_changed_watch_count_sum=0 1659730360000000000
zookeeper response_packet_cache_misses=0 1659730360000000000
zookeeper min_latency=0 1659730360000000000
zookeeper stale_requests=0 1659730360000000000
zookeeper learner_request_processor_queue_size_count=0 1659730360000000000
zookeeper learner_request_processor_queue_size_sum=0 1659730360000000000
zookeeper unsuccessful_handshake=0 1659730360000000000
zookeeper max_latency=0 1659730360000000000
zookeeper connection_drop_probability=0 1659730360000000000
zookeeper session_queues_drained_count=0 1659730360000000000
zookeeper session_queues_drained_sum=0 1659730360000000000
zookeeper,runtime=OpenJDK\ Runtime\ Environment,vendor=Oracle\ Corporation,version=11.0.15+10 jvm_info=1 1659730360000000000
zookeeper read_commitproc_time_ms_count=0 1659730360000000000
zookeeper read_commitproc_time_ms_sum=0 1659730360000000000
zookeeper inflight_diff_count_count=0 1659730360000000000
zookeeper inflight_diff_count_sum=0 1659730360000000000
zookeeper election_time_count=0 1659730360000000000
zookeeper election_time_sum=0 1659730360000000000
zookeeper looking_count=0 1659730360000000000
zookeeper global_sessions=0 1659730360000000000
zookeeper approximate_data_size=44 1659730360000000000
zookeeper packets_sent=27 1659730360000000000
zookeeper local_write_committed_time_ms_count=0 1659730360000000000
zookeeper local_write_committed_time_ms_sum=0 1659730360000000000
zookeeper digest_mismatches_count=0 1659730360000000000
zookeeper cnxn_closed_without_server_running=0 1659730360000000000
zookeeper propagation_latency_count=0 1659730360000000000
zookeeper propagation_latency_sum=0 1659730360000000000
zookeeper dead_watchers_queued=0 1659730360000000000
zookeeper ensemble_auth_fail=0 1659730360000000000
zookeeper last_client_response_size=-1 1659730360000000000
zookeeper startup_snap_load_time_count=1 1659730360000000000
zookeeper startup_snap_load_time_sum=0 1659730360000000000
zookeeper request_throttle_wait_count=0 1659730360000000000
zookeeper open_file_descriptor_count=141 1659730360000000000
zookeeper uptime=2734837 1659730360000000000
zookeeper response_packet_cache_hits=0 1659730360000000000
zookeeper ensemble_auth_success=0 1659730360000000000
zookeeper write_final_proc_time_ms_count=0 1659730360000000000
zookeeper write_final_proc_time_ms_sum=0 1659730360000000000
zookeeper snap_count=0 1659730360000000000
zookeeper jvm_classes_loaded=3412 1659730360000000000
zookeeper jvm_classes_loaded_total=3412 1659730360000000000
zookeeper jvm_classes_unloaded_total=0 1659730360000000000
zookeeper outstanding_changes_queued=0 1659730360000000000
zookeeper ephemerals_count=0 1659730360000000000
zookeeper dbinittime_count=1 1659730360000000000
zookeeper dbinittime_sum=4 1659730360000000000
zookeeper close_session_prep_time_count=0 1659730360000000000
zookeeper close_session_prep_time_sum=0 1659730360000000000
zookeeper requests_not_forwarded_to_commit_processor=0 1659730360000000000
zookeeper write_commit_proc_req_queued_count=0 1659730360000000000
zookeeper write_commit_proc_req_queued_sum=0 1659730360000000000
zookeeper write_commitproc_time_ms_count=0 1659730360000000000
zookeeper write_commitproc_time_ms_sum=0 1659730360000000000
zookeeper sync_processor_queue_flush_time_ms_count=0 1659730360000000000
zookeeper sync_processor_queue_flush_time_ms_sum=0 1659730360000000000
zookeeper max_client_response_size=-1 1659730360000000000
zookeeper,pool=mapped jvm_buffer_pool_used_bytes=0 1659730360000000000
zookeeper,pool=direct jvm_buffer_pool_used_bytes=122881 1659730360000000000
zookeeper,pool=mapped jvm_buffer_pool_capacity_bytes=0 1659730360000000000
zookeeper,pool=direct jvm_buffer_pool_capacity_bytes=122881 1659730360000000000
zookeeper,pool=mapped jvm_buffer_pool_used_buffers=0 1659730360000000000
zookeeper,pool=direct jvm_buffer_pool_used_buffers=16 1659730360000000000
zookeeper ensemble_auth_skip=0 1659730360000000000
zookeeper socket_closing_time_count=0 1659730360000000000
zookeeper socket_closing_time_sum=0 1659730360000000000
zookeeper packets_received=14 1659730360000000000
dmytronasyrov commented 2 years ago

Ah, right. Prometheus metrics are also enabled.

powersj commented 2 years ago

@dmytronasyrov I have a draft PR up now: https://github.com/influxdata/telegraf/pull/11642. It adds a new setting, metrics_provider, that, when set to Prometheus, connects to the exposed Prometheus endpoint on port 7000 (by default) and collects those metrics. So a config would now look like this:

[[inputs.zookeeper]]
      servers = ["zookeeper-0.zookeeper.zookeeper-ns.svc.cluster.local:7000"]
      timeout = "5s"

Note that with Prometheus metrics enabled, the output of mntr is not entirely valid Prometheus format, so I went with reading from the exposed Prometheus endpoint instead.

I would appreciate it if you could give the artifacts from that PR a try. They should be added as a comment in 15-20 minutes.

FWIW, a similar output could easily be obtained with the following as well:

[[inputs.http]]
  urls = ["http://localhost:7000/metrics"]
  data_format = "prometheus"

Thanks!

powersj commented 2 years ago

When I first started looking at this, I was looking at the metrics that mntr returns when the Prometheus provider is enabled. However, per the docs, that is not the right place to look for these metrics. The docs specifically say to use http://localhost:7000/metrics as the default location, and that location provides valid Prometheus metrics. In that case, either the http input plugin, as I have above, or our own prometheus input plugin can collect and process them.

Rather than creating yet another place to parse Prometheus metrics, I think a better option is to document that when using the Prometheus metrics provider, users should not use the zookeeper input, but rather the Prometheus input.
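
For reference, a minimal sketch of that alternative, assuming ZooKeeper's Prometheus metrics provider is listening on its default port 7000 on the same host as Telegraf (the host and port are assumptions, so adjust the URL to your server):

```toml
# Sketch only: scrape ZooKeeper's Prometheus endpoint with the prometheus
# input plugin instead of inputs.zookeeper.
# The URL assumes the default location mentioned above.
[[inputs.prometheus]]
  urls = ["http://localhost:7000/metrics"]
```

Metric and tag names collected this way will likely differ from those produced by the zookeeper input, so existing dashboards or queries may need adjusting.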

I'll put up a different PR tomorrow with this instead.