apache / hertzbeat

Apache HertzBeat(incubating) is a real-time monitoring system with agentless, performance cluster, prometheus-compatible, custom monitoring and status page building capabilities.
https://hertzbeat.apache.org/
Apache License 2.0

[Task] Metric name display problem under http protocol prometheus parsing #1992

Closed zhangshenghang closed 3 months ago

zhangshenghang commented 3 months ago

Description

When I try to retrieve data via the HTTP protocol and parse it using Prometheus rules, I can't display the names of the metrics in the system.

Configuration:

protocol: http
parseType: prometheus

Data obtained from the interface:

# TYPE log4j2_appender counter
log4j2_appender_total{cluster="standalone", level="debug"} 0.0
log4j2_appender_created{cluster="standalone", level="debug"} 1.715739939787E9
log4j2_appender_total{cluster="standalone", level="warn"} 7.0
log4j2_appender_created{cluster="standalone", level="warn"} 1.715739939787E9
log4j2_appender_total{cluster="standalone", level="trace"} 0.0
log4j2_appender_created{cluster="standalone", level="trace"} 1.715739939787E9
log4j2_appender_total{cluster="standalone", level="error"} 1.0
log4j2_appender_created{cluster="standalone", level="error"} 1.715739939787E9
log4j2_appender_total{cluster="standalone", level="fatal"} 0.0
log4j2_appender_created{cluster="standalone", level="fatal"} 1.715739939787E9
log4j2_appender_total{cluster="standalone", level="info"} 7515.0
log4j2_appender_created{cluster="standalone", level="info"} 1.715739939787E9

Actual display result:

image

Expected display result:

name                     cluster     level  value
log4j2_appender_total    standalone  debug  0
log4j2_appender_created  standalone  debug  1715739939.787

When the name is not displayed, it's impossible to understand the specific meaning of the monitoring metrics. Therefore, I hope to add a name column.

By examining the HttpCollectImpl.parseResponseByPrometheusExporter method, it seems that the expected result display may not be supported currently.
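
As a rough illustration of the expected result, the sketch below reads Prometheus text lines and keeps the metric name next to the labels and the value. It is not the code in HttpCollectImpl.parseResponseByPrometheusExporter; the class PromNameColumnSketch, the Row type, and the simplified regular expressions are all hypothetical and only cover the line shapes shown in this issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HertzBeat: keeps the metric name as its own column.
final class PromNameColumnSketch {

    /** One output row: metric name, its labels, and the sample value. */
    static final class Row {
        final String name;
        final Map<String, String> labels;
        final double value;

        Row(String name, Map<String, String> labels, double value) {
            this.name = name;
            this.labels = labels;
            this.value = value;
        }
    }

    // name, optional {labels}, value, optional integer timestamp.
    // Whitespace before the braces is tolerated (see "metric_sum {} 0.0" in Doris output).
    private static final Pattern SAMPLE = Pattern.compile(
            "^([a-zA-Z_:][a-zA-Z0-9_:]*)\\s*(?:\\{(.*)\\})?\\s+(\\S+)(?:\\s+-?\\d+)?$");

    // A single key="value" pair; escaped quotes inside values are not handled here.
    private static final Pattern LABEL =
            Pattern.compile("([a-zA-Z_][a-zA-Z0-9_]*)\\s*=\\s*\"([^\"]*)\"");

    static List<Row> parse(String body) {
        List<Row> rows = new ArrayList<>();
        for (String raw : body.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and HELP/TYPE comments
            }
            Matcher m = SAMPLE.matcher(line);
            if (!m.matches()) {
                continue; // ignore lines this simplified pattern cannot read
            }
            Map<String, String> labels = new LinkedHashMap<>();
            if (m.group(2) != null) {
                Matcher lm = LABEL.matcher(m.group(2));
                while (lm.find()) {
                    labels.put(lm.group(1), lm.group(2));
                }
            }
            double value;
            try {
                value = Double.parseDouble(m.group(3));
            } catch (NumberFormatException e) {
                continue; // values such as "+Inf" are out of scope for this sketch
            }
            rows.add(new Row(m.group(1), labels, value));
        }
        return rows;
    }
}
```

Each Row then carries everything needed for one line of the expected table: the name column comes from Row.name, and cluster and level come from Row.labels.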

Would the community want this feature? If so, I can implement it.

Or, if there is already a supported way to do this that I am not aware of, I would appreciate some advice.

Task List

No response

zhangshenghang commented 3 months ago

The same problem also occurs in Doris monitoring, because the metric keys under doris_fe_query_latency_ms include doris_fe_query_latency_ms, doris_fe_query_latency_ms_count, and doris_fe_query_latency_ms_sum.

Doris monitoring information:

# HELP doris_fe_query_latency_ms 
# TYPE doris_fe_query_latency_ms summary
doris_fe_query_latency_ms{quantile="0.75"} 0.0
doris_fe_query_latency_ms{quantile="0.95"} 0.0
doris_fe_query_latency_ms{quantile="0.98"} 0.0
doris_fe_query_latency_ms{quantile="0.99"} 0.0
doris_fe_query_latency_ms{quantile="0.999"} 0.0
doris_fe_query_latency_ms_sum {} 0.0
doris_fe_query_latency_ms_count {} 11
doris_fe_query_latency_ms{quantile="0.75",user="root"} 0.0
doris_fe_query_latency_ms{quantile="0.95",user="root"} 0.0
doris_fe_query_latency_ms{quantile="0.98",user="root"} 0.0
doris_fe_query_latency_ms{quantile="0.99",user="root"} 0.0
doris_fe_query_latency_ms{quantile="0.999",user="root"} 0.0
doris_fe_query_latency_ms_sum {user="root"} 0.0
doris_fe_query_latency_ms_count {user="root"} 11
# HELP doris_fe_txn_exec_latency_ms 
# TYPE doris_fe_txn_exec_latency_ms summary
doris_fe_txn_exec_latency_ms{quantile="0.75"} 0.0
doris_fe_txn_exec_latency_ms{quantile="0.95"} 0.0
doris_fe_txn_exec_latency_ms{quantile="0.98"} 0.0
doris_fe_txn_exec_latency_ms{quantile="0.99"} 0.0
doris_fe_txn_exec_latency_ms{quantile="0.999"} 0.0
doris_fe_txn_exec_latency_ms_sum {} 0.0
doris_fe_txn_exec_latency_ms_count {} 0
# HELP doris_fe_txn_publish_latency_ms 
# TYPE doris_fe_txn_publish_latency_ms summary
doris_fe_txn_publish_latency_ms{quantile="0.75"} 0.0
doris_fe_txn_publish_latency_ms{quantile="0.95"} 0.0
doris_fe_txn_publish_latency_ms{quantile="0.98"} 0.0
doris_fe_txn_publish_latency_ms{quantile="0.99"} 0.0
doris_fe_txn_publish_latency_ms{quantile="0.999"} 0.0
doris_fe_txn_publish_latency_ms_sum {} 0.0
doris_fe_txn_publish_latency_ms_count {} 0
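
To show why a name column matters for output like this, here is a small sketch that groups sample names under the metric family declared by each # TYPE line. It is only an assumption about how the grouping could be done, not existing HertzBeat behavior; PromFamilySketch, its method names, and the suffix list are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of HertzBeat.
final class PromFamilySketch {

    // Suffixes that commonly hang off a family name in Prometheus/OpenMetrics text.
    private static final List<String> SUFFIXES =
            Arrays.asList("_sum", "_count", "_bucket", "_total", "_created");

    /** Returns family name -> distinct sample names seen under that family, in input order. */
    static Map<String, Set<String>> sampleNamesByFamily(String body) {
        Map<String, Set<String>> families = new LinkedHashMap<>();
        for (String raw : body.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("# TYPE ")) {
                String[] parts = line.split("\\s+"); // "#", "TYPE", name, type
                if (parts.length >= 3) {
                    families.putIfAbsent(parts[2], new LinkedHashSet<>());
                }
                continue;
            }
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank lines and HELP comments
            }
            // The sample name ends at the first '{' or whitespace character.
            int end = 0;
            while (end < line.length()
                    && line.charAt(end) != '{'
                    && !Character.isWhitespace(line.charAt(end))) {
                end++;
            }
            String sample = line.substring(0, end);
            String family = familyOf(sample, families);
            if (family != null) {
                families.get(family).add(sample);
            }
        }
        return families;
    }

    /** Finds the declared family for a sample name, or null if no TYPE line covers it. */
    static String familyOf(String sample, Map<String, Set<String>> families) {
        if (families.containsKey(sample)) {
            return sample;
        }
        for (String suffix : SUFFIXES) {
            if (sample.endsWith(suffix)) {
                String base = sample.substring(0, sample.length() - suffix.length());
                if (families.containsKey(base)) {
                    return base;
                }
            }
        }
        return null;
    }
}
```

For the Doris output above, the family doris_fe_query_latency_ms maps to three distinct sample names (the bare name, _sum, and _count), so rows from the same family cannot be told apart unless the name is shown.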

tomsun28 commented 3 months ago

Hi, it seems that adding a name column is not supported at the moment. You are welcome to implement your idea. We currently have three ways of parsing Prometheus data: the first is automatic parsing via a Prometheus task, the second is via PromQL, and the third is the way you describe. Personally, I feel the third way is not very conveniently designed; if you have a better way of defining it in your implementation, you are welcome to optimize it.