Closed Sparc0 closed 4 months ago
The first issue, gnmic_http_loader_number_of_loaded_targets not changing, seems to be a bug.
As for adding an up metric, I'm not sure I follow. Do you want a metric that shows the state of each subscription per target, 0 for down and 1 for up? If that's the case, I was hoping it could be derived from the existing metrics. Also, I'm not sure what you mean by "pushed to each output".
I found some time to test the HTTP loader metrics. I can see both the number of loaded targets and the number of deleted targets change in every interval where there is a change. Maybe the confusion comes from the fact that both of those metrics are Gauges and not Counters?
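To illustrate the Gauge versus Counter distinction mentioned above, here is a minimal sketch in plain Python. It assumes (this is not confirmed in the thread) that the loader gauge is set to the number of targets loaded in the most recent interval, so it falls back to 0 when nothing changes. The `Gauge` and `Counter` classes and the per-interval numbers are hypothetical stand-ins, not gnmic or Prometheus client code.

```python
class Counter:
    """Monotonic metric: the value only ever goes up."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        if n < 0:
            raise ValueError("a counter cannot decrease")
        self.value += n


class Gauge:
    """Point-in-time metric: the value is overwritten on every update."""

    def __init__(self):
        self.value = 0

    def set(self, n):
        self.value = n


# Hypothetical number of targets loaded in each loader interval.
loaded_per_interval = [2, 0, 0, 1, 0]

gauge, counter = Gauge(), Counter()
gauge_samples, counter_samples = [], []
for n in loaded_per_interval:
    gauge.set(n)    # assumption: the gauge reports only the latest interval
    counter.add(n)  # a counter would accumulate across intervals
    gauge_samples.append(gauge.value)
    counter_samples.append(counter.value)

print("gauge:  ", gauge_samples)
print("counter:", counter_samples)
```

Under that assumption, a scrape that lands in an interval with no change would read 0 from the gauge even though targets were loaded earlier, which may be what looks like "always 0".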
gnmic_http_loader_number_of_loaded_targets is always 0, even when I have a successful gRPC subscription ongoing for a target that I loaded using the HTTP loader.
If possible, I would like to get metrics for when I am unable to subscribe or connect. Maybe this could be pushed to each output as an up metric, in line with the metrics Prometheus generates by default, e.g. for the following events:
[gnmic] failed to initialize target "switch1": 10.10.10.50:50051: context deadline exceeded
[gnmic] retrying target "switch1" in 10s
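As a rough sketch of what a per-target up metric could look like, the snippet below derives a 0/1 value from the failure line quoted above. This is a log-parsing workaround, not an existing gnmic feature: the metric name `gnmic_target_up`, the helper functions, and the second target in the example are all hypothetical. Since the thread shows no log line for a successful connection, the sketch simply treats any target with a failure line in the given window as down.

```python
import re

# Matches the "failed to initialize target" line quoted in this thread.
FAILED = re.compile(r'\[gnmic\] failed to initialize target "([^"]+)"')


def up_from_logs(targets, log_lines):
    """Return {target: 0 or 1}; 0 if a failure line was seen for that target."""
    up = {t: 1 for t in targets}
    for line in log_lines:
        m = FAILED.search(line)
        if m and m.group(1) in up:
            up[m.group(1)] = 0
    return up


def render(up):
    """Format the values as Prometheus text exposition lines."""
    return [f'gnmic_target_up{{target="{t}"}} {v}' for t, v in sorted(up.items())]


log_lines = [
    '[gnmic] failed to initialize target "switch1": 10.10.10.50:50051: context deadline exceeded',
    '[gnmic] retrying target "switch1" in 10s',
]

samples = render(up_from_logs(["switch1", "switch2"], log_lines))
for s in samples:
    print(s)
```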
Could the grpc_client_handled_total metric be made more detailed, so that it also includes the target and the subscription?
grpc_client_handled_total{grpc_code="OutOfRange",grpc_method="Subscribe",grpc_service="gnmi.gNMI",grpc_type="bidi_stream"} 5776
[gnmic] target "switch2": subscription system_resources rcv error: rpc error: code = OutOfRange desc = Maximum number of Subscribe requests reached
[gnmic] target "switch2": subscription system_resources rcv error: retrying in 10s
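To make the request concrete, here is a minimal sketch of the kind of per-target, per-subscription breakdown being asked for, built by counting the receive-error log lines shown above. The metric name `gnmic_subscription_rcv_errors_total` and its label names are hypothetical and chosen only for illustration; gnmic does not expose this metric as far as this thread shows.

```python
import re
from collections import Counter

# Matches the "rcv error: rpc error: code = ..." line quoted in this thread.
# The follow-up "retrying in 10s" line has no gRPC code and is ignored.
RCV_ERROR = re.compile(
    r'\[gnmic\] target "([^"]+)": subscription (\S+) rcv error: '
    r'rpc error: code = (\w+)'
)


def count_rcv_errors(log_lines):
    """Count receive errors keyed by (target, subscription, grpc_code)."""
    counts = Counter()
    for line in log_lines:
        m = RCV_ERROR.search(line)
        if m:
            counts[m.groups()] += 1
    return counts


def render(counts):
    """Format the counts as Prometheus text exposition lines."""
    return [
        f'gnmic_subscription_rcv_errors_total{{target="{t}",'
        f'subscription="{s}",grpc_code="{c}"}} {n}'
        for (t, s, c), n in sorted(counts.items())
    ]


log_lines = [
    '[gnmic] target "switch2": subscription system_resources rcv error: '
    'rpc error: code = OutOfRange desc = Maximum number of Subscribe requests reached',
    '[gnmic] target "switch2": subscription system_resources rcv error: retrying in 10s',
]

counts = count_rcv_errors(log_lines)
for line in render(counts):
    print(line)
```

Compared with the single aggregated grpc_client_handled_total series, a breakdown like this would show which target and subscription account for the OutOfRange responses.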