The BGP docker container crashed during the test and could not be recovered for the rest of the run.
Steps to reproduce the issue:
The issue was reproduced while running the sonic-mgmt test platform_tests/test_advanced_reboot.py::test_warm_reboot_sad.
supervisord reported that bgpd exited with SIGSEGV:
2024 Nov 9 03:21:40.847546 arc-switch1004 INFO bgp#supervisord 2024-11-09 01:21:40,845 WARN exited: bgpd (terminated by SIGSEGV (core dumped); not expected)
Describe the results you received:
Logs:
2024 Nov 9 03:21:37.123288 sonic DEBUG bgp#bgpcfgd: execute command '['vtysh', '-f', '/tmp/tmpfcjtdazx']'.
2024 Nov 9 03:21:37.194414 sonic INFO sonic-ztp[4005]: ZTP is administratively disabled.
2024 Nov 9 03:21:37.445364 sonic CRIT bgp#BGP[60]: Received signal 11 at 1731115297 (si_addr 0x4, PC 0x7fdc3c18748c); aborting...
2024 Nov 9 03:21:37.449344 sonic CRIT bgp#BGP[60]: zlog_signal+0xf5 7fdc3c61f345 7ffcf5aed3b0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.449344 sonic CRIT bgp#BGP[60]: PBKDF2_SHA256+0x4b1 7fdc3c64cf81 7ffcf5aed4f0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.449344 sonic CRIT bgp#BGP[60]: __sigaction+0x40 7fdc3c2e2050 7ffcf5aed640 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7fdc3c2a6000)
2024 Nov 9 03:21:37.464006 sonic CRIT bgp#BGP[60]: ---- signal ----
2024 Nov 9 03:21:37.464006 sonic CRIT bgp#BGP[60]: ly_err_print+0xe1c 7fdc3c18748c 7ffcf5aedae0 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x3af0 7fdc3c21c5f0 7ffcf5aedb50 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x672d 7fdc3c21f22d 7ffcf5aedbb0 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x15a46 7fdc3c22e546 7ffcf5aedc20 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x1364b 7fdc3c22c14b 7ffcf5aedd00 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x12b62 7fdc3c22b662 7ffcf5aede90 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.464006 sonic CRIT bgp#BGP[60]: lys_ypr_ctx_get_level+0x17664 7fdc3c230164 7ffcf5aee020 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.469685 sonic CRIT bgp#BGP[60]: lyxp_get_expr+0x1a6 7fdc3c2307e6 7ffcf5aee090 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.469685 sonic CRIT bgp#BGP[60]: lyxp_get_expr+0x2a97 7fdc3c2330d7 7ffcf5aee1a0 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.469685 sonic CRIT bgp#BGP[60]: lyxp_get_expr+0x30d7 7fdc3c233717 7ffcf5aee240 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.469685 sonic CRIT bgp#BGP[60]: lyd_validate_all+0x42 7fdc3c2338e2 7ffcf5aee370 /lib/x86_64-linux-gnu/libyang.so.2 (mapped at 0x7fdc3c177000)
2024 Nov 9 03:21:37.469685 sonic CRIT bgp#BGP[60]: nb_candidate_commit_prepare+0x4e 7fdc3c62ea8e 7ffcf5aee3a0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.469685 sonic CRIT bgp#BGP[60]: nb_candidate_commit+0x47 7fdc3c62ed97 7ffcf5aee400 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.469685 sonic CRIT bgp#BGP[60]: nb_terminate+0x29f8 7fdc3c631c68 7ffcf5aee450 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.483873 sonic CRIT bgp#BGP[60]: nb_cli_pending_commit_check+0x28 7fdc3c631da8 7ffcf5af04b0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.483873 sonic CRIT bgp#BGP[60]: cmd_exit+0x28d 7fdc3c5f169d 7ffcf5af04d0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.483873 sonic CRIT bgp#BGP[60]: cmd_execute_command+0xd7 7fdc3c5f19f7 7ffcf5af0540 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.491285 sonic CRIT bgp#BGP[60]: cmd_execute+0xd0 7fdc3c5f1c10 7ffcf5af0590 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.491285 sonic CRIT bgp#BGP[60]: vty_set_include+0x197 7fdc3c664127 7ffcf5af05f0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.491285 sonic CRIT bgp#BGP[60]: vty_set_include+0x964 7fdc3c6648f4 7ffcf5af26a0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.491285 sonic CRIT bgp#BGP[60]: vty_close+0x1f08 7fdc3c667b48 7ffcf5af26e0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.491285 sonic CRIT bgp#BGP[60]: thread_call+0x7d 7fdc3c65ee2d 7ffcf5af2930 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.517219 sonic CRIT bgp#BGP[60]: frr_run+0xe8 7fdc3c617368 7ffcf5af29d0 /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 (mapped at 0x7fdc3c57c000)
2024 Nov 9 03:21:37.533015 sonic CRIT bgp#BGP[60]: main+0x36b 55e2769d238b 7ffcf5af2be0 /usr/lib/frr/bgpd (mapped at 0x55e2768e8000)
2024 Nov 9 03:21:37.533015 sonic CRIT bgp#BGP[60]: __libc_init_first+0x8a 7fdc3c2cd24a 7ffcf5af2c40 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7fdc3c2a6000)
2024 Nov 9 03:21:37.533015 sonic CRIT bgp#BGP[60]: __libc_start_main+0x85 7fdc3c2cd305 7ffcf5af2ce0 /lib/x86_64-linux-gnu/libc.so.6 (mapped at 0x7fdc3c2a6000)
2024 Nov 9 03:21:37.533015 sonic CRIT bgp#BGP[60]: _start+0x21 55e2769d4091 7ffcf5af2d30 /usr/lib/frr/bgpd (mapped at 0x55e2768e8000)
2024 Nov 9 03:21:37.533015 sonic CRIT bgp#BGP[60]: in thread vtysh_read scheduled from ../lib/vty.c:2740 vty_event()
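The frame addresses in the syslog trace match the gdb backtrace further down (for example, 7fdc3c18748c is reported as ly_err_print+0xe1c here but as lyht_insert_with_resize_cb in gdb), so the symbol names in the syslog trace are probably nearest-exported-symbol guesses rather than the real functions. One way to cross-check a frame without the core dump is to convert its PC into a module-relative offset and pass that to a symbolizer run against a libyang build with debug info. A minimal sketch in C, assuming the frame layout seen in the log above; frame_module_offset is a hypothetical helper and not part of FRR or libyang:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse one syslog frame of the form
 *   "<symbol>+0x<off> <pc> <sp> <module path> (mapped at 0x<base>)"
 * and compute pc - base, the offset of the PC inside the mapped module.
 * Returns 0 on success, -1 if the line does not match or pc < base.
 */
static int frame_module_offset(const char *frame, uint64_t *offset)
{
    uint64_t pc = 0, base = 0;

    /* Skip the symbol, read the PC, skip the SP and the path, read the base. */
    if (sscanf(frame, "%*s %" SCNx64 " %*s %*s (mapped at %" SCNx64,
               &pc, &base) != 2) {
        return -1;
    }
    if (pc < base) {
        return -1;
    }
    *offset = pc - base;
    return 0;
}
```

For the faulting frame this gives 0x7fdc3c18748c - 0x7fdc3c177000 = 0x1048c. That offset is only directly usable by a symbolizer if the library's first load segment starts at virtual address 0, which is typical for shared objects but is an assumption here.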
The gdb backtrace from the core dump shows that the crash originates in the libyang hash table implementation:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007fdc3c330e9f in __pthread_kill_internal (signo=11, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 0x00007fdc3c2e1fb2 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
#3 0x00007fdc3c64cfbc in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#4 <signal handler called>
#5 0x00007fdc3c18748c in lyht_insert_with_resize_cb (ht=0x55e2784fb6d0, val_p=0x7ffcf5aedb5c, hash=3760792813, resize_val_equal=resize_val_equal@entry=0x0, match_p=0x0) at ./src/hash_table.c:697
#6 0x00007fdc3c187b5a in lyht_insert (ht=<optimized out>, val_p=<optimized out>, hash=<optimized out>, match_p=<optimized out>) at ./src/hash_table.c:746
#7 0x00007fdc3c21c5f0 in set_insert_node_hash (set=0x55e2784fc970, node=0x55e2784f5160, type=<optimized out>) at ./src/xpath.c:647
#8 0x00007fdc3c21f22d in moveto_node (set=set@entry=0x55e2784fc970, moveto_mod=0x55e277e02270, ncname=ncname@entry=0x55e277dead90 "entry", options=options@entry=2) at ./src/xpath.c:5603
#9 0x00007fdc3c22e546 in eval_name_test_with_predicate (options=2, set=<optimized out>, all_desc=<optimized out>, attr_axis=<optimized out>, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:7350
#10 eval_relative_location_path (exp=0x55e277e3a990, tok_idx=0x7ffcf5aee046, all_desc=<optimized out>, set=<optimized out>, options=2) at ./src/xpath.c:7522
#11 0x00007fdc3c22ae8d in eval_path_expr (options=21986, set=<optimized out>, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:8072
#12 0x00007fdc3c22c14b in eval_function_call (options=2, set=0x7ffcf5aee0d0, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:7772
#13 eval_path_expr (options=2, set=0x7ffcf5aee0d0, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:8002
#14 eval_expr_select (exp=exp@entry=0x55e277e3a990, tok_idx=tok_idx@entry=0x7ffcf5aee046, etype=etype@entry=LYXP_EXPR_OR, set=set@entry=0x7ffcf5aee0d0, options=options@entry=2) at ./src/xpath.c:8666
#15 0x00007fdc3c22b662 in eval_or_expr (options=2, set=0x7ffcf5aee0d0, repeat=<optimized out>, tok_idx=0x7ffcf5aee046, exp=0x55e277e3a990) at ./src/xpath.c:8558
#16 eval_expr_select (exp=exp@entry=0x55e277e3a990, tok_idx=tok_idx@entry=0x7ffcf5aee046, etype=etype@entry=LYXP_EXPR_NONE, set=set@entry=0x7ffcf5aee0d0, options=options@entry=2) at ./src/xpath.c:8642
#17 0x00007fdc3c230164 in lyxp_eval (ctx=0x55e277db3a00, exp=0x55e277e3a990, cur_mod=0x55e277e0e390, format=format@entry=LY_VALUE_SCHEMA_RESOLVED, prefix_data=<optimized out>, ctx_node=0x55e2784fa6c0, tree=0x55e277e2d6b0,
vars=<optimized out>, set=<optimized out>, options=<optimized out>) at ./src/xpath.c:8758
#18 0x00007fdc3c2307e6 in lyd_validate_node_when (tree=0x55e277e2c610, node=node@entry=0x55e2784fbd10, schema=<optimized out>, disabled=disabled@entry=0x7ffcf5aee1f0) at ./src/validation.c:153
#19 0x00007fdc3c2330d7 in lyd_validate_unres_when (diff=0x0, node_types=0x7ffcf5aee2e0, node_when=<optimized out>, mod=0x55e277e02270, tree=0x7ffcf5aee2d0) at ./src/validation.c:206
#20 lyd_validate_unres (tree=0x7ffcf5aee2d0, mod=0x55e277e02270, node_when=<optimized out>, node_exts=0x7ffcf5aee310, node_types=0x7ffcf5aee2e0, meta_types=0x7ffcf5aee2f0, diff=0x0) at ./src/validation.c:322
#21 0x00007fdc3c233717 in lyd_validate (tree=0x55e277dff110, module=module@entry=0x0, ctx=0x55e277db3a00, val_opts=1, validate_subtree=validate_subtree@entry=1 '\001', node_when_p=0x7ffcf5aee300, node_when_p@entry=0x0,
node_exts_p=0x7ffcf5aee310, node_types_p=0x7ffcf5aee2e0, meta_types_p=0x7ffcf5aee2f0, diff=0x0) at ./src/validation.c:1577
#22 0x00007fdc3c2338e2 in lyd_validate_all (tree=<optimized out>, ctx=<optimized out>, val_opts=<optimized out>, diff=<optimized out>) at ./src/validation.c:1604
#23 0x00007fdc3c62ea8e in nb_candidate_commit_prepare () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#24 0x00007fdc3c62ed97 in nb_candidate_commit () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#25 0x00007fdc3c631c68 in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#26 0x00007fdc3c631da8 in nb_cli_pending_commit_check () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#27 0x00007fdc3c5f169d in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#28 0x00007fdc3c5f19f7 in cmd_execute_command () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#29 0x00007fdc3c5f1c10 in cmd_execute () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#30 0x00007fdc3c664127 in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#31 0x00007fdc3c6648f4 in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#32 0x00007fdc3c667b48 in ?? () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#33 0x00007fdc3c65ee2d in thread_call () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#34 0x00007fdc3c617368 in frr_run () from /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0
#35 0x000055e2769d238b in main ()
The crash happens at the dereference of rec->hits, which corresponds to this code in libyang (./src/hash_table.c:697):
LY_ERR
lyht_insert_with_resize_cb(struct hash_table *ht, void *val_p, uint32_t hash, lyht_value_equal_cb resize_val_equal,
        void **match_p)
...
    /* insert it into the returned record */
    assert(rec->hits < 1);
    if (rec->hits < 0) {    <========= line crashed
        --ht->invalid;
    }
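The syslog line above reports the fault address as si_addr 0x4. That would be consistent with rec being NULL when rec->hits is read, provided the hits field sits 4 bytes into the record. A minimal sketch of that arithmetic follows; the struct layout is an assumption modelled on libyang's hash table record and should be checked against the headers of the libyang version in this image, and both ht_rec_sketch and fault_address_for_rec are hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * ASSUMPTION: approximates libyang's hash table record layout
 * (a 32-bit hash followed by a 32-bit signed hit counter).
 * Verify against the actual libyang headers before relying on it.
 */
struct ht_rec_sketch {
    uint32_t hash;        /* hash of the stored value */
    int32_t hits;         /* hit counter read on the crashing line */
    unsigned char val[1]; /* start of the stored value */
};

/* Address that a read of rec->hits would touch for a given record pointer. */
static uintptr_t fault_address_for_rec(uintptr_t rec)
{
    return rec + offsetof(struct ht_rec_sketch, hits);
}
```

With rec equal to 0, fault_address_for_rec returns 0x4, matching the reported si_addr. This does not prove that rec was NULL, only that a NULL record is a plausible explanation. If the library was built with NDEBUG, the preceding assert is compiled out, which would explain why the if statement is the first line to touch rec.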
Describe the results you expected:
Output of show version:
SONiC Software Version: SONiC.202405_RC.45-28a64576c_Internal
SONiC OS Version: 12
Distribution: Debian 12.7
Kernel: 6.1.0-22-2-amd64
Build commit: 28a64576c
Build date: Thu Nov 7 06:41:16 UTC 2024
Output of show techsupport:
Additional information you deem important (e.g. issue happens only occasionally):
@StormLiangMS, can you check with @qiluo-msft whether this is an issue in libyang? According to @stepanblyschak, this cannot be reproduced easily. Please check the core dump.
Core dump: bgpd.1731115297.60.core.gz