sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

GNMI CLI Subscribes With No Namespace Results GNMI Crashes on Multi-Asic Chassis #20510

Open wumiaont opened 3 days ago

wumiaont commented 3 days ago

Description

We are testing telemetry on a multi-ASIC chassis (Nokia-7250). When the following command is run from the PTF server without namespace information in the target, the GNMI container crashes and restarts.

python /root/gnxi/gnmi_cli_py/py_gnmicli.py -g -t 10.250.6.233 -p 50052 -m subscribe -x NEIGH_STATE_TABLE -xt STATE_DB -o ndastreamingservertest --subscribe_mode 0 --submode 1 --interval 0 --update_count 2 --create_connections 1

Sending SubscribeRequest subscribe { prefix { target: "STATE_DB" } subscription { path { elem { name: "NEIGH_STATE_TABLE" } } mode: ON_CHANGE } }

Received an exception from server side and error message is: '<_MultiThreadedRendezvous of RPC that terminated with: status = StatusCode.UNAVAILABLE details = "Connection reset by peer" debug_error_string = "{"created":"@1729002509.589264167","description":"Error received from peer ipv4:10.250.6.233:50052","file":"src/core/lib/surface/call.cc","file_line":1070,"grpc_message":"Connection reset by peer","grpc_status":14}"

'. Client receives an exception 'Connection reset by peer' indicating gNMI server is shut down and Exiting ...

On the chassis, the GNMI container has exited and is restarted as a result of that command:

root@ixre-egl-board30:/# admin@ixre-egl-board30:~$ docker ps -a
CONTAINER ID   IMAGE                                COMMAND                  CREATED       STATUS                     PORTS   NAMES
c93e2246ff1f   docker-snmp:latest                   "/usr/local/bin/supe…"   2 weeks ago   Up 26 minutes                      snmp
c4c5e555d8d9   docker-sonic-mgmt-framework:latest   "/usr/local/bin/supe…"   2 weeks ago   Up 27 minutes                      mgmt-framework
f2784babf897   docker-lldp:latest                   "/usr/bin/docker-lld…"   2 weeks ago   Up 27 minutes                      lldp1
ec0b77ec803f   docker-lldp:latest                   "/usr/bin/docker-lld…"   2 weeks ago   Up 27 minutes                      lldp0
fbec90b49288   docker-sonic-gnmi:latest             "/usr/local/bin/supe…"   2 weeks ago   Exited (0) 4 seconds ago           gnmi

Steps to reproduce the issue:

  1. Use a multi-ASIC chassis. Configure GNMI with certs, the correct ports, and no client_auth.
  2. Restart the GNMI service.
  3. Run "python /root/gnxi/gnmi_cli_py/py_gnmicli.py -g -t 10.250.6.233 -p 50052 -m subscribe -x NEIGH_STATE_TABLE -xt STATE_DB -o ndastreamingservertest --subscribe_mode 0 --submode 1 --interval 0 --update_count 2 --create_connections 1" from the PTF container.
  4. Watch the GNMI container on the chassis crash and restart.
  5. If the target includes a namespace, the command works fine. For example: "python /root/gnxi/gnmi_cli_py/py_gnmicli.py -g -t 10.250.6.233 -p 50052 -m subscribe -x NEIGH_STATE_TABLE -xt STATE_DB/asic0 -o ndastreamingservertest --subscribe_mode 0 --submode 1 --interval 0 --update_count 2 --create_connections 1"
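
The only difference between the crashing command in step 3 and the working one in step 5 is the target: "STATE_DB" versus "STATE_DB/asic0". As a rough illustration of that distinction, here is a minimal Go sketch that splits such a target into a database name and an optional namespace. The helper name splitTarget is hypothetical and is not taken from sonic-gnmi; the real server may parse the target differently.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTarget is a hypothetical helper (not from sonic-gnmi). It splits a
// target such as "STATE_DB/asic0" into the database name and the namespace.
// The namespace is empty when the target carries no "/<namespace>" suffix.
func splitTarget(target string) (db, namespace string) {
	parts := strings.SplitN(target, "/", 2)
	db = parts[0]
	if len(parts) == 2 {
		namespace = parts[1]
	}
	return db, namespace
}

func main() {
	// The two targets used in steps 3 and 5 of this report.
	for _, t := range []string{"STATE_DB", "STATE_DB/asic0"} {
		db, ns := splitTarget(t)
		fmt.Printf("target=%q db=%q namespace=%q\n", t, db, ns)
	}
}
```

On a multi-ASIC system, the empty-namespace case is the one that needs explicit handling, for example by rejecting the request with an error.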

Describe the results you received:

GNMI crashes.

Describe the results you expected:

The GNMI server should not crash, no matter what invalid CLI input is provided.

Output of show version:

2405

wumiaont commented 2 days ago

Log from gnmi.log when the crash happens:

2024 Oct 16 15:52:03.976726 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native I1016 15:52:03.976540 21 db_client.go:675] Invalid db table Path STATE_DB NEIGH_STATE_TABLE
2024 Oct 16 15:52:03.976726 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native I1016 15:52:03.976566 21 connection_manager.go:81] Closing connection: 10.250.6.253:56678|STATE_DB|2024-10-16T15:52:03Z
2024 Oct 16 15:52:03.976938 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native I1016 15:52:03.976680 21 panic.go:890] Client 10.250.6.253:56678 shutdown
2024 Oct 16 15:52:03.979007 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native panic: runtime error: invalid memory address or nil pointer dereference
2024 Oct 16 15:52:03.979007 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native [signal SIGSEGV: segmentation violation code=0x1 addr=0x0
2024 Oct 16 15:52:03.979201 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native pc=0xe54b89]
2024 Oct 16 15:52:03.979222 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native
2024 Oct 16 15:52:03.979222 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native goroutine 117 [running]:
2024 Oct 16 15:52:03.979317 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native github.com/sonic-net/sonic-gnmi/gnmi_server.(Client).Run(0xc00019e870, {0x12f8958?, 0xc0004d4160})
2024 Oct 16 15:52:03.979317 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native #011/sonic/src/sonic-gnmi/gnmi_server/client_subscribe.go:182 +0xac9
2024 Oct 16 15:52:03.979317 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native github.com/sonic-net/sonic-gnmi/gnmi_server.(Server).Subscribe(0xc00094c550, {0x12f8958, 0xc0004d4160})
2024 Oct 16 15:52:03.979356 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native #011/sonic/src/sonic-gnmi/gnmi_server/server.go:320 +0x3ab
2024 Oct 16 15:52:03.979391 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native github.com/openconfig/gnmi/proto/gnmi._GNMI_Subscribe_Handler({0x10f0fe0
2024 Oct 16 15:52:03.979517 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native ?, 0xc00094c550}, {0x12f4c68?, 0xc000552600})
2024 Oct 16 15:52:03.979627 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native #011/sonic/src/sonic-gnmi/vendor/github.com/openconfig/gnmi/proto/gnmi/gnmi.pb.go:3412 +0x9f
2024 Oct 16 15:52:03.979669 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native google.golang.org/grpc.(Server).processStreamingRPC(0xc0004ee000, {0x12f9378, 0xc000758480}, 0xc000206000, 0xc0000fae40, 0x19a4200, 0x0)
2024 Oct 16 15:52:03.979726 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native #011/sonic/src/sonic-gnmi/vendor/google.golang.org/grpc/server.go:1457 +0xd4a
2024 Oct 16 15:52:03.979726 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native google.golang.org/grpc.(Server).handleStream(0xc0004ee000, {0x12f9378, 0xc000758480}, 0xc000206000, 0x0)
2024 Oct 16 15:52:03.979726 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native #011/sonic/src/sonic-gnmi/vendor/google.golang.org/grpc/server.go:1537 +0x9ea
2024 Oct 16 15:52:03.979821 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native google.golang.org/grpc.(Server).serveStreams.func1.2()
2024 Oct 16 15:52:03.979821 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native #011/sonic/src/sonic-gnmi/vendor/google.golang.org/grpc/server.go:871 +0x98
2024 Oct 16 15:52:03.979821 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native created by google.golang.org/grpc.(Server).serveStreams.func1
2024 Oct 16 15:52:03.979847 ixre-egl-board30 INFO gnmi#supervisord: gnmi-native #011/sonic/src/sonic-gnmi/vendor/google.golang.org/grpc/server.go:869 +0x28a
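
The log shows "Invalid db table Path" immediately followed by a nil pointer dereference inside Client.Run. The exact root cause is not established by the log alone. One common way to end up with this pattern in Go is to keep using a value after the constructor that should have produced it returned an error. The sketch below illustrates the defensive version of that flow: return the error to the caller instead of touching the nil value. All names here (dataClient, newDataClient, subscribe) are hypothetical and are not the sonic-gnmi implementation.

```go
package main

import "fmt"

// dataClient is a hypothetical stand-in for the object that serves a subscription.
type dataClient struct{ table string }

// newDataClient mimics a constructor that rejects an unknown table path
// by returning a nil client together with an error.
func newDataClient(known map[string]bool, table string) (*dataClient, error) {
	if !known[table] {
		return nil, fmt.Errorf("invalid db table path %q", table)
	}
	return &dataClient{table: table}, nil
}

// subscribe checks the constructor error before touching the client,
// so an invalid path becomes an error for the caller instead of a panic.
func subscribe(known map[string]bool, table string) error {
	dc, err := newDataClient(known, table)
	if err != nil {
		return fmt.Errorf("subscribe rejected: %w", err)
	}
	fmt.Println("streaming updates for", dc.table)
	return nil
}

func main() {
	known := map[string]bool{"NEIGH_STATE_TABLE": true}
	fmt.Println(subscribe(known, "NEIGH_STATE_TABLE"))
	fmt.Println(subscribe(known, "NO_SUCH_TABLE"))
}
```

In a gRPC handler, the returned error would typically be mapped to a status code such as InvalidArgument so the client sees a clean failure rather than "Connection reset by peer".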

wumiaont commented 1 day ago

https://github.com/sonic-net/sonic-gnmi/pull/309