Open JackyYangPassion opened 1 month ago
To resolve the missing JRaft metrics during HStore Spring Actuator Metrics Sink initialization, make sure the metrics are registered in time during the initial setup. The provided code already includes a mechanism to register metrics, but it may run before all JRaft metrics are available, so some are never registered.
Here is a step-by-step approach to ensure metrics are registered properly:
1. Ensure init is called early: Make sure the JRaftMetrics.init(meterRegistry) method is called early in your application startup process, before any metrics are accessed.
2. Check for null or empty metrics map: The getRaftGroupMetrics method returns an empty map if the metrics are not available. Ensure that the metrics are available before proceeding with registration.
3. Force registration if metrics are not available initially: You can add a retry mechanism to attempt registration again if the metrics are not available on the first attempt.
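The steps above can be sketched as a small, idempotent registrar that skips an empty metrics map and registers each Raft group only once, so it is safe to call repeatedly. This is a minimal sketch using only the JDK, not HStore's actual code: GroupMetricRegistrar, registerAvailable, isRegistered, and the registerGroup callback are hypothetical names introduced for illustration, and the map parameter merely stands in for whatever getRaftGroupMetrics returns.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical helper: registers each Raft group's metrics at most once. */
public class GroupMetricRegistrar {
    // Names of the groups whose metrics have already been registered.
    private final Set<String> registered = ConcurrentHashMap.newKeySet();
    // Callback that performs the real registration (an assumption; in HStore
    // this would bind the group's metrics to the MeterRegistry).
    private final Consumer<String> registerGroup;

    public GroupMetricRegistrar(Consumer<String> registerGroup) {
        this.registerGroup = registerGroup;
    }

    /**
     * Registers every group in the map that has not been seen before.
     * Returns the number of newly registered groups; 0 means nothing new.
     */
    public int registerAvailable(Map<String, ?> groupMetrics) {
        if (groupMetrics == null || groupMetrics.isEmpty()) {
            return 0; // metrics not available yet; the caller may try again later
        }
        int added = 0;
        for (String group : groupMetrics.keySet()) {
            if (registered.add(group)) { // true only the first time a group is seen
                registerGroup.accept(group);
                added++;
            }
        }
        return added;
    }

    /** Reports whether the given group has already been registered. */
    public boolean isRegistered(String group) {
        return registered.contains(group);
    }
}
```

Because repeated calls are harmless, a helper like this could be invoked both at startup and again later, once more Raft groups have come up.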
Here is an updated version of the init method with a retry mechanism:
public static synchronized void init(MeterRegistry meterRegistry) {
    // Initialize only once; later calls are no-ops.
    if (registry == null) {
        registry = meterRegistry;
        registerMeters();
        // Retry mechanism to ensure metrics are registered
        int retries = 5;
        while (retries-- > 0 && groupSet.isEmpty()) {
            try {
                Thread.sleep(1000); // Wait for 1 second before retrying
                registerNodeMetrics();
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                break;
            }
        }
    }
}
This code calls the registerNodeMetrics method up to five times, one second apart, and stops as soon as groupSet is no longer empty, which gives the metrics time to become available and be registered. Adjust the number of retries and the delay as needed based on your application's startup time and metrics availability [1].
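Because the loop above sleeps inside a synchronized method, it can delay startup by up to five seconds. One possible alternative, sketched here with only the JDK, is to run the retries on a background thread. BackgroundRetry, retryUntil, and the attempt callback are hypothetical names, not part of HStore; in HStore the callback would presumably call registerNodeMetrics() and report whether groupSet is non-empty.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: retries a task on a daemon thread without blocking the caller. */
public final class BackgroundRetry {
    private BackgroundRetry() {}

    /**
     * Runs attempt every delayMillis until it returns true or maxAttempts is reached.
     * Returns the scheduler so the caller can await or cancel it.
     */
    public static ScheduledExecutorService retryUntil(BooleanSupplier attempt,
                                                      int maxAttempts,
                                                      long delayMillis) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "metrics-retry");
                t.setDaemon(true); // never keeps the JVM alive
                return t;
            });
        AtomicInteger remaining = new AtomicInteger(maxAttempts);
        scheduler.scheduleWithFixedDelay(() -> {
            boolean done;
            try {
                done = attempt.getAsBoolean();
            } catch (RuntimeException e) {
                done = false; // treat a failed attempt as "not ready yet"
            }
            if (done || remaining.decrementAndGet() <= 0) {
                scheduler.shutdown(); // stop scheduling further attempts
            }
        }, 0, delayMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

A call such as BackgroundRetry.retryUntil(() -> tryRegister(), 5, 1000) would mirror the five retries and one-second delay used above, where tryRegister is again a placeholder for the real registration step.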
Bug Type
logic (logic design issue)
Before submit
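Environment
Expected & Actual behavior
Expected result
Retrieve the correct JRaft monitoring metrics through the Spring Actuator endpoint: curl http://ip:8620/actuator/prometheus
Bug details
After an HStore node is initialized, Prometheus scrapes its metrics periodically, and some metrics are lost. The root cause is that initialization runs only once, at a point when some JRaft instrumentation points have not yet been registered, so those metrics are missing. The faulty logic is in the following code
Specific metrics
Vertex/Edge example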
No response
Schema [VertexLabel, EdgeLabel, IndexLabel]
No response