apache / dubbo

The java implementation of Apache Dubbo. An RPC and microservice framework.
https://dubbo.apache.org/
Apache License 2.0

Verification and test of accuracy and practicability of indicators #11678

Open songxiaosheng opened 1 year ago

songxiaosheng commented 1 year ago

Describe the proposal

Dubbo already exposes a number of monitoring metrics, but their accuracy and practical usefulness still need to be verified and tested. Verification should cover normal, abnormal, and extreme scenarios. The metrics in question are listed below, and the list will be extended over time.

| Metrics Name | Description |
| --- | --- |
| dubbo_thread_pool_max_size | Thread Pool Max Size |
| dubbo_thread_pool_largest_size | Thread Pool Largest Size |
| dubbo_thread_pool_thread_count | Thread Pool Thread Count |
| dubbo_thread_pool_queue_size | Thread Pool Queue Size |
| dubbo_thread_pool_active_size | Thread Pool Active Size |
| dubbo_thread_pool_core_size | Thread Pool Core Size |

| Metrics Name | Description |
| --- | --- |
| dubbo_consumer_rt_milliseconds_sum | Sum Response Time |
| dubbo_consumer_rt_milliseconds_p99 | Response Time P99 |
| dubbo_consumer_rt_milliseconds_avg | Average Response Time |
| dubbo_consumer_rt_milliseconds_last | Last Response Time |
| dubbo_consumer_requests_processing | Processing Requests |
| dubbo_consumer_rt_milliseconds_max | Max Response Time |
| dubbo_consumer_rt_milliseconds_p95 | Response Time P95 |
| dubbo_consumer_requests_succeed_aggregate | Aggregated Succeed Requests |
| dubbo_consumer_requests_succeed_total | Succeed Requests |
| dubbo_consumer_requests_total | Total Requests |
| dubbo_consumer_rt_milliseconds_min | Min Response Time |
| dubbo_consumer_requests_total_aggregate | Aggregated Total Requests |
| dubbo_consumer_qps_total | Query Per Seconds |
| dubbo_consumer_requests_failed_total | Total Failed Requests |
| dubbo_consumer_requests_timeout_total | Total Timeout Failed Requests |
| dubbo_consumer_requests_failed_total_aggregate | Aggregated failed total Requests |
| dubbo_consumer_requests_timeout_failed_aggregate | Aggregated timeout Failed Requests |

| Metrics Name | Description |
| --- | --- |
| dubbo_provider_requests_total | Total Requests |
| dubbo_provider_rt_milliseconds_avg | Average Response Time |
| dubbo_provider_rt_milliseconds_sum | Sum Response Time |
| dubbo_provider_requests_total_aggregate | Aggregated Total Requests |
| dubbo_provider_requests_succeed_total | Succeed Requests |
| dubbo_provider_qps_total | Query Per Seconds |
| dubbo_provider_requests_processing | Processing Requests |
| dubbo_provider_rt_milliseconds_p99 | Response Time P99 |
| dubbo_provider_requests_succeed_aggregate | Aggregated Succeed Requests |
| dubbo_provider_rt_milliseconds_p95 | Response Time P95 |
| dubbo_provider_rt_milliseconds_max | Max Response Time |
| dubbo_provider_rt_milliseconds_min | Min Response Time |
| dubbo_provider_rt_milliseconds_last | Last Response Time |

| Metrics Name | Description |
| --- | --- |
| dubbo_register_rt_milliseconds_max | Max Response Time |
| dubbo_register_rt_milliseconds_avg | Average Response Time |
| dubbo_register_rt_milliseconds_sum | Sum Response Time |
| dubbo_register_rt_milliseconds_min | Min Response Time |
| dubbo_registry_register_requests_succeed_total | Succeed Register Requests |
| dubbo_registry_register_requests_total | Total Register Requests |
| dubbo_register_rt_milliseconds_last | Last Response Time |

| Metrics Name | Description |
| --- | --- |
| dubbo_metadata_push_num_succeed_total | Succeed Push Num |
| dubbo_push_rt_milliseconds_sum | Sum Response Time |
| dubbo_push_rt_milliseconds_last | Last Response Time |
| dubbo_metadata_push_num_total | Total Push Num |
| dubbo_push_rt_milliseconds_min | Min Response Time |
| dubbo_push_rt_milliseconds_max | Max Response Time |
| dubbo_push_rt_milliseconds_avg | Average Response Time |

| Metrics Name | Description |
| --- | --- |
| dubbo_application_info_total | Total Application Info |
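One way to sanity-check the response-time gauges listed above is to scrape the metrics output and assert simple invariants between them, for example min ≤ avg ≤ max, with the last value inside [min, max]. The following is only a minimal sketch under stated assumptions: the metrics are available as Prometheus text exposition lines (`name{labels} value`), there is one sample per metric name, and all four gauges cover the same set of requests. The class `RtInvariantCheck` and its methods are hypothetical and not part of Dubbo.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Dubbo.
final class RtInvariantCheck {

    // Parses Prometheus text lines ("name{labels} value" or "name value") into name -> value.
    // Assumes one sample per metric name, plain numeric values, and no trailing timestamps.
    static Map<String, Double> parse(String exposition) {
        Map<String, Double> values = new HashMap<>();
        for (String raw : exposition.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and HELP/TYPE comments
            }
            int lastSpace = line.lastIndexOf(' ');
            if (lastSpace < 0) {
                continue; // no value on this line
            }
            String series = line.substring(0, lastSpace).trim();
            int brace = series.indexOf('{');
            String name = brace >= 0 ? series.substring(0, brace) : series;
            values.put(name, Double.parseDouble(line.substring(lastSpace + 1)));
        }
        return values;
    }

    // Returns a list of violated invariants for the gauges prefix_min/_max/_avg/_last.
    static List<String> violations(Map<String, Double> values, String prefix) {
        List<String> problems = new ArrayList<>();
        Double min = values.get(prefix + "_min");
        Double max = values.get(prefix + "_max");
        Double avg = values.get(prefix + "_avg");
        Double last = values.get(prefix + "_last");
        if (min == null || max == null || avg == null || last == null) {
            problems.add("missing one of min/max/avg/last for " + prefix);
            return problems;
        }
        if (min > avg) problems.add("min > avg");
        if (avg > max) problems.add("avg > max");
        if (last < min || last > max) problems.add("last outside [min, max]");
        return problems;
    }
}
```

Calling `RtInvariantCheck.violations(RtInvariantCheck.parse(text), "dubbo_consumer_rt_milliseconds")` returns an empty list when the four gauges are mutually consistent. The same check can be pointed at the provider, register, or push prefixes.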

The test cases are described as follows.

Issue 1: the dubbo_consumer_rt_milliseconds_sum metric

Timeout scenario: response time is currently recorded for normal responses. When a call fails, for example on a timeout, the response time should be recorded as well.

Normal scenario: the metric value does not increase. A possible cause is that metric names are overwritten when there are multiple consumers (the metrics for metadata requests and for the demoService sample requests overwrite each other).
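The two expectations in Issue 1 can be illustrated with a small sketch: the elapsed time is added in a `finally` block so that failed or timed-out calls are counted too, and the sum is kept per service key so that different consumers do not overwrite each other. This is not Dubbo's implementation. The class `RtSumRecorder`, its methods, the service-key strings, and the injectable clock are all hypothetical names introduced for illustration.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

// Hypothetical recorder for illustration; not part of Dubbo.
final class RtSumRecorder {
    // One running sum per service key, so metrics of different services stay separate.
    private final Map<String, LongAdder> rtSumMillis = new ConcurrentHashMap<>();
    // Clock in nanoseconds; injectable so the behaviour can be tested deterministically.
    private final LongSupplier nanoClock;

    RtSumRecorder(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    // Runs the call and records its duration whether it returns or throws.
    <T> T invoke(String serviceKey, Callable<T> call) throws Exception {
        long start = nanoClock.getAsLong();
        try {
            return call.call();
        } finally {
            long elapsedMillis = (nanoClock.getAsLong() - start) / 1_000_000L;
            rtSumMillis.computeIfAbsent(serviceKey, k -> new LongAdder()).add(elapsedMillis);
        }
    }

    // Returns the accumulated response time in milliseconds for one service key.
    long sumFor(String serviceKey) {
        LongAdder adder = rtSumMillis.get(serviceKey);
        return adder == null ? 0L : adder.sum();
    }
}
```

In production code the recorder would be created with `System::nanoTime`. A test for the timeout scenario would invoke a call that throws and then assert that `sumFor` still grew for that service key.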

lijay7674 commented 1 year ago

Please assign this to me, thx.

dengWuuu commented 1 year ago

Please assign me, thx.

yeeeee7 commented 1 year ago

@songxiaosheng How do I configure Dubbo to collect these metrics together with spring-boot-starter-actuator? [image]

songxiaosheng commented 1 year ago

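> @songxiaosheng How do I configure Dubbo to collect these metrics together with spring-boot-starter-actuator? [image]

A quick-start guide will be published on the official website later. The feature will be officially usable once version 3.2 is released; for now you can try it with a beta version. In the meantime, you can refer to the dubbo metrics prometheus sample in dubbo samples.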

yeeeee7 commented 1 year ago

dubbo.metrics.protocol=prometheus
dubbo.metrics.prometheus.exporter.enabled=true
dubbo.metrics.aggregation.enabled=true

I set these three properties on version 3.2 beta6, but I only get metrics such as dubbo_thread_pool_active_size and dubbo_thread_pool_thread_count, and none of the metrics listed at https://cn.dubbo.apache.org/zh-cn/overview/reference/metrics/standard_metrics/. What else needs to be configured? @songxiaosheng

songxiaosheng commented 1 year ago

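In version 3.2 beta6, the provider, consumer, and thread pool metrics are enabled by default; other metrics have to be switched on through configuration. In newer versions, all of these common metrics will be enabled by default. The following two settings will also be deprecated in a future version:

dubbo.metrics.protocol=prometheus
dubbo.metrics.prometheus.exporter.enabled=true

If the provider response metrics are missing on your side, note that they only appear after you have issued an RPC request.

Other related configuration options are as follows (not final):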

dubbo:
  metrics:
    protocol: prometheus
    enable-jvm-metrics: true
    enable-metadata-metrics: true
    enable-registry-metrics: true
    aggregation:
      enabled: true
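To check which metrics a configuration like the one above actually exposes, one option is to fetch the metrics output and list the distinct `dubbo_` metric names. The following is only a rough sketch: the default URL `http://localhost:22222/metrics` is an assumption and should be replaced with wherever your application serves its Prometheus output, and the class `DubboMetricNames` is a hypothetical helper, not part of Dubbo. It uses the JDK's `java.net.http.HttpClient` (Java 11+).

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Dubbo.
final class DubboMetricNames {

    // Returns the distinct metric names starting with "dubbo_" in a Prometheus text body.
    static SortedSet<String> dubboMetricNames(String body) {
        return body.lines()
                .map(String::trim)
                .filter(line -> line.startsWith("dubbo_")) // also drops "# HELP"/"# TYPE" lines
                .map(line -> line.split("[{\\s]", 2)[0])   // cut at the first '{' or whitespace
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // Fetches the given URL and returns the response body as text.
    static String fetch(String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) throws Exception {
        // Assumed endpoint; pass the real metrics URL as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:22222/metrics";
        dubboMetricNames(fetch(url)).forEach(System.out::println);
    }
}
```

Comparing the printed names against the tables in the proposal shows which groups (for example registry or metadata metrics) are still missing. As noted in the reply above, provider and consumer metrics only show up after at least one RPC request has been made.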