openconfig / gnmic

gNMIc is a gNMI CLI client and collector
https://gnmic.openconfig.net
Apache License 2.0

gnmic api metrics #12

Closed: Sparc0 closed this issue 4 months ago

Sparc0 commented 1 year ago

gnmic_http_loader_number_of_loaded_targets is always 0, even though I have a successful gRPC subscription running against targets that I loaded over HTTP.

If possible, I would like metrics for when gnmic is unable to connect or subscribe to a target. Maybe this could be pushed to each output as an up metric, following the metrics Prometheus generates by default. For example, for the following events:

[gnmic] failed to initialize target "switch1": 10.10.10.50:50051: context deadline exceeded
[gnmic] retrying target "switch1" in 10s

Could the grpc_client_handled_total metric be made more detailed, so that it also includes the target and the subscription? Today it looks like this:

grpc_client_handled_total{grpc_code="OutOfRange",grpc_method="Subscribe",grpc_service="gnmi.gNMI",grpc_type="bidi_stream"} 5776

[gnmic] target "switch2": subscription system_resources rcv error: rpc error: code = OutOfRange desc = Maximum number of Subscribe requests reached
[gnmic] target "switch2": subscription system_resources rcv error: retrying in 10s
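
To make the request concrete, here is a minimal sketch of what a per-target, per-subscription counter could look like. It is plain Python with no Prometheus library and is not how gnmic implements its metrics. The target and subscription labels are hypothetical additions, and the existing grpc_service and grpc_type labels are left out for brevity.

```python
from collections import Counter

# Hypothetical label set: "target" and "subscription" are the labels asked
# for above; the real grpc_client_handled_total metric does not carry them.
handled_total = Counter()


def observe(target, subscription, grpc_code):
    """Count one finished Subscribe RPC for a target/subscription pair."""
    handled_total[(target, subscription, grpc_code)] += 1


def render():
    """Render the counters as Prometheus-style text exposition lines."""
    lines = []
    for (target, subscription, code), value in sorted(handled_total.items()):
        lines.append(
            'grpc_client_handled_total{'
            f'grpc_code="{code}",grpc_method="Subscribe",'
            f'subscription="{subscription}",target="{target}"'
            f'}} {value}'
        )
    return lines


# Events modelled on the log lines above, plus one successful call.
observe("switch2", "system_resources", "OutOfRange")
observe("switch2", "system_resources", "OutOfRange")
observe("switch1", "system_resources", "OK")

for line in render():
    print(line)
```

With labels like these, a query could isolate the failing subscription on switch2 instead of seeing one aggregate count for all targets.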

karimra commented 1 year ago

The first issue, gnmic_http_loader_number_of_loaded_targets not changing, seems to be a bug.

As for adding an up metric, I'm not sure I follow. Do you want a metric that shows the state of each subscription per target, 0 for down and 1 for up? If so, I was hoping this could be derived from the existing metrics. Also, I'm not sure what you mean by "pushed to each output".

karimra commented 1 year ago

I found some time to test the HTTP loader metrics. I can see both the number of loaded targets and the number of deleted targets change in every interval where there is a change. Maybe the confusion comes from the fact that both of those metrics are Gauges and not Counters?
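
The Gauge versus Counter difference can be illustrated with a small sketch. It uses plain Python stand-ins for the two metric types and is not gnmic code. The per-interval numbers are made up, and it assumes the gauge is set to the number of targets loaded in the latest interval, which the thread does not confirm.

```python
class Counter:
    """Monotonic metric: the value only ever goes up."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        if n < 0:
            raise ValueError("counters cannot decrease")
        self.value += n


class Gauge:
    """Point-in-time metric: the value is overwritten on every update."""

    def __init__(self):
        self.value = 0

    def set(self, n):
        self.value = n


# Hypothetical number of targets loaded in each polling interval.
loaded_per_interval = [2, 0, 0, 1, 0]

as_counter, as_gauge = Counter(), Gauge()
history = []
for loaded in loaded_per_interval:
    as_counter.add(loaded)   # accumulates across intervals
    as_gauge.set(loaded)     # reflects only the latest interval
    history.append((as_counter.value, as_gauge.value))

print(history)
# [(2, 2), (2, 0), (2, 0), (3, 1), (3, 0)]
```

Under that assumption, a scrape that lands in a quiet interval reads 0 from the gauge even though targets were loaded earlier, which would match the "always 0" observation.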