openconfig / gnmic

gNMIc is a gNMI CLI client and collector
https://gnmic.openconfig.net
Apache License 2.0

[gnmic] gRPC dialout receive error #519

Open mnorman111 opened 6 days ago

mnorman111 commented 6 days ago

Hi!

I'm getting this error:

    [gnmic] gRPC dialout receive error: rpc error: code = Internal desc = grpc: failed to unmarshal the received message: proto: cannot parse invalid wire-format data

It seems to cause gnmic to consume all available memory and become unresponsive. It happens about once a day.
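For context on the message itself: `cannot parse invalid wire-format data` is what the Go protobuf library reports when the received bytes do not follow protobuf framing. The sketch below is a stdlib-only illustration of that kind of framing check, not gnmic's or gRPC's actual code. The function names `read_varint` and `check_wire_format` are made up for this example, and it only walks top-level fields.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        if shift >= 70:
            # A 64-bit varint never needs more than 10 bytes.
            raise ValueError("varint too long")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def check_wire_format(buf):
    """Walk the top-level fields of a message and return how many were seen.

    Raises ValueError if the framing is invalid. Hypothetical helper for
    illustration only; it does not validate field contents against a schema.
    """
    pos = 0
    fields = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if field_number == 0:
            raise ValueError("field number 0 is not allowed")
        if wire_type == 0:      # varint
            _, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit
            pos += 8
        elif wire_type == 2:    # length-delimited (strings, bytes, sub-messages)
            length, pos = read_varint(buf, pos)
            pos += length
        elif wire_type == 5:    # fixed 32-bit
            pos += 4
        else:
            # Groups (3, 4) are not handled in this sketch; 6 and 7 are undefined.
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(buf):
            raise ValueError("field runs past end of message")
        fields += 1
    return fields


# Field 1 as a varint is well-formed; a length prefix of 5 with only
# 2 bytes of payload is not.
print(check_wire_format(b"\x08\x96\x01"))
try:
    check_wire_format(b"\x12\x05ab")
except ValueError as err:
    print("invalid:", err)
```

A truncated or corrupted message from a router would fail a check of this kind. Whether that is related to the memory growth is a separate question that this sketch does not answer.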

karimra commented 5 days ago

Can you share some details on your config, your setup, and what you are trying to achieve?

mnorman111 commented 3 days ago

gnmic receives telemetry data from about 500 Nokia routers and writes it to InfluxDB 2. Typical memory usage:

    Top 3 processes by memory consumption:
    PID     COMMAND  %MEM
    187366  gnmic    23.0
    936     influxd  22.4

Occasionally it looks like this:

    Top 3 processes by memory consumption:
    PID     COMMAND  %MEM
    187366  gnmic    92.3
    936     influxd  2.0

Then gnmic stops receiving any data. When that happens, this appears in the log:

    2024/09/16 12:21:43.872233 [gnmic] gRPC dialout receive error: rpc error: code = Internal desc = grpc: failed to unmarshal the received message: proto: cannot parse invalid wire-format data
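To check whether each memory spike lines up with one of these log entries, the error timestamps can be pulled out of the log. This is a minimal sketch that assumes every entry has the same timestamp layout as the line quoted above. `find_dialout_errors` is a hypothetical helper name, not part of gnmic.

```python
import re
from datetime import datetime

# Matches log lines shaped like the one quoted in this comment:
# "YYYY/MM/DD HH:MM:SS.ffffff [gnmic] gRPC dialout receive error: <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"\[gnmic\] gRPC dialout receive error: (?P<msg>.*)$"
)


def find_dialout_errors(lines):
    """Return (timestamp, message) pairs for dial-out receive errors."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y/%m/%d %H:%M:%S.%f")
            hits.append((ts, match.group("msg")))
    return hits


# Inline sample; in practice the lines would come from the gnmic log file.
sample = [
    "2024/09/16 12:21:43.872233 [gnmic] gRPC dialout receive error: "
    "rpc error: code = Internal desc = grpc: failed to unmarshal the "
    "received message: proto: cannot parse invalid wire-format data",
]
for ts, msg in find_dialout_errors(sample):
    print(ts.isoformat(), msg)
```

Passing an open file handle for the log instead of `sample` works the same way, since the function only iterates over lines. The resulting timestamps can then be compared against whatever records the memory figures.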

gnmic config:

    global:
      address: 0.0.0.0:57400
      log-file: /var/log/gnmic.log
    gnmi-server:
      # the address the gNMI server will listen to
      address: 0.0.0.0:57400
      insecure: true
      max-subscriptions: 512
      max-unary-rpc: 512
      min-sample-interval: 1ms
      default-sample-interval: 60s
      min-heartbeat-interval: 1s
      enable-metrics: true
      debug: false
      # cache:
      #   type: oc
      #   address:
      #   username:
      #   password:
      #   expiration: 60s
      #   debug: false
      #   max-bytes:
      #   max-msgs-per-subscription:
      #   fetch-batch-size:
      #   fetch-wait-time:
    outputs:
      influx-live:
        type: influxdb
        url: http://localhost:8086
        org: Telemetry
        bucket: MPLS-telem
        token: xxxxxx
        batch-size: 1000
        flush-timer: 10s
        use-gzip: false
        enable-tls: false
        tls:
          ca-file:
          cert-file:
          key-file:
          skip-verify: false
        override-timestamps: false
        timestamp-precision: ns
        health-check-period: 30s
        debug: false
        add-target: "overwrite"
        target-template: '{{ index . "subscription-name" }}'
        enable-metrics: false
        event-processors:

node config:

    destination-group "telem" create
        allow-unsecure-connection
        tcp-keepalive
            shutdown
        exit
        destination 10.255.10.11 port 57400 create
        exit
    exit
    sensor-groups
        sensor-group "stats" create
            path "/state/lag/port-scheduler-policy/statistics" create
            exit
            path "/state/lag/statistics" create
            exit
            path "/state/port[port-id=*]/egress/voq" create
            exit
            path "/state/port[port-id=*]/statistics" create
            exit
            path "/state/port[port-id=*]/transceiver" create
            exit
            path "/state/router/interface" create
            exit
            path "/state/service/epipe" create
            exit
            path "/state/service/vpls" create
            exit
            path "/state/service/vprn/interface" create
            exit
            path "/state/system/ptp/instance/current-ds/mean-path-delay" create
            exit
            path "/state/system/ptp/instance/current-ds/offset-from-master" create
            exit
        exit
    exit
    persistent-subscriptions
        subscription "aasa-ixreb1" create
            destination-group "telem"
            encoding proto
            mode sample
            sample-interval 60000
            sensor-group "stats"