Closed EasyNetDev closed 3 years ago
Can you provide a pcap of an example session where this occurs? And show running would be useful as well. I assume you have BGP dampening enabled.
Hi,
Sure. Last night I compiled 8.0-dev for one of the routers and kept 8.1-dev on the other one. Even 8.0-dev is crashing:
Jul 14 03:48:09 R02 BGP[22377]: in thread bgp_process_packet scheduled from bgpd/bgp_io.c:270 bgp_process_reads()
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(_start+0x2a) [0x56376c92bdaa]
Jul 14 03:48:09 R02 BGP[22377]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7f71584ecd0a]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(main+0x356) [0x56376c92a136]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(frr_run+0xe8) [0x7f715889e198]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(thread_call+0xf3) [0x7f71588df023]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(bgp_process_packet+0x466) [0x56376c97edc6]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(+0x12b058) [0x56376c97c058]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(bgp_nlri_parse_ip+0xb7) [0x56376c997207]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(bgp_update+0x1921) [0x56376c996221]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(bgp_damp_update+0x17c) [0x56376ca392ac]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(+0x1e7bcb) [0x56376ca38bcb]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(+0x1e78f6) [0x56376ca388f6]
Jul 14 03:48:09 R02 BGP[22377]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f715869f140]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0xc2b11) [0x7f71588ceb11]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(zlog_signal+0xf5) [0x7f71588a5225]
Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(zlog_backtrace_sigsafe+0x6d) [0x7f71588a502d]
Jul 14 03:48:09 R02 BGP[22377]: Received signal 11 at 1626223689 (si_addr 0x0, PC 0x56376ca388f6); aborting...
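For anyone triaging a log like the one above, a small script can pull the key facts out of the final line. This is a minimal sketch, not part of FRR: the regular expressions and the function name summarize_crash are assumptions based only on the log format shown here. It extracts the signal number, converts the epoch timestamp to UTC, and looks for the backtrace frame whose address equals the reported PC.

```python
import re
from datetime import datetime, timezone

# Patterns are assumptions derived from the syslog lines in this thread.
SIGNAL_RE = re.compile(
    r"Received signal (?P<sig>\d+) at (?P<ts>\d+) "
    r"\(si_addr (?P<si_addr>0x[0-9a-fA-F]+), PC (?P<pc>0x[0-9a-fA-F]+)\)"
)
FRAME_RE = re.compile(
    r"(?P<obj>/\S+)\((?P<sym>[^)]*)\) \[(?P<addr>0x[0-9a-fA-F]+)\]"
)


def summarize_crash(log_lines):
    """Return signal details plus the frame whose address equals the PC."""
    summary = None
    frames = []
    for line in log_lines:
        m = SIGNAL_RE.search(line)
        if m:
            summary = {
                "signal": int(m["sig"]),
                "utc": datetime.fromtimestamp(int(m["ts"]), tz=timezone.utc),
                "si_addr": int(m["si_addr"], 16),
                "pc": int(m["pc"], 16),
            }
            continue
        m = FRAME_RE.search(line)
        if m:
            frames.append((m["obj"], m["sym"], int(m["addr"], 16)))
    if summary is None:
        return None
    # The frame at the faulting PC is the one that was executing at the crash.
    summary["faulting_frame"] = next(
        (f for f in frames if f[2] == summary["pc"]), None
    )
    return summary


sample = [
    "Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(bgp_damp_update+0x17c) [0x56376ca392ac]",
    "Jul 14 03:48:09 R02 BGP[22377]: /usr/lib/frr/bgpd(+0x1e78f6) [0x56376ca388f6]",
    "Jul 14 03:48:09 R02 BGP[22377]: Received signal 11 at 1626223689 (si_addr 0x0, PC 0x56376ca388f6); aborting...",
]
info = summarize_crash(sample)
print(info["utc"].isoformat(), hex(info["pc"]), info["faulting_frame"])
```

For the R02 log, the PC matches the unnamed frame bgpd(+0x1e78f6), directly below bgp_damp_update in the call chain. The si_addr of 0x0 points to an access at address zero, which usually indicates a NULL pointer dereference.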
My configs that I'm running: R01: https://nextcloud.easynet.dev/index.php/s/KEQZ39stJ4i6ktb R02: https://nextcloud.easynet.dev/index.php/s/Ao8jJTxWCc6r4kZ
Now I will run tcpdump on all my BGP sessions and post the captures. Is that OK?
Could you disable bgp dampening just to make sure it doesn't crash without that?
By the way, would it be possible to get a full coredump?
@ton31337,
Sure, I've set my router to dump the core. Here are the PCAPs: R01: https://nextcloud.easynet.dev/index.php/s/Zcm6Xbnn9ZsY94Q R02: https://nextcloud.easynet.dev/index.php/s/qm5fLCt7QTwLCef
Crash happened at 09:46:40 for R01 and 09:46:40 for R02, EEST / Bucharest time.
Thanks, I'm waiting for coredump, that would be the best thing to figure out what's the problem here.
Yep. I'm waiting for the next crash. I'm also running tcpdump again to capture it. I set the "core" limit to "unlimited" for the BGP process. I hope it will dump the core.
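One way to confirm that the core limit actually applies to the running daemon, rather than only to the current shell, is to read the limits the kernel reports for that process. The sketch below assumes a Linux host with /proc mounted. The helper names and the sample text are illustrative and are not taken from this thread.

```python
def core_limits_from_text(limits_text):
    """Return (soft, hard) core file size limits from /proc/<pid>/limits text."""
    label = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(label):
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None


def read_core_limits(pid):
    """Read the limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return core_limits_from_text(fh.read())


# Illustrative sample of the /proc/<pid>/limits layout, not real output
# from the routers in this thread.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max core file size        unlimited            unlimited            bytes
Max open files            1024                 1048576              files
"""

print(core_limits_from_text(SAMPLE_LIMITS))
```

A soft limit of 0 would mean no core file is written for that process, whatever the shell's own setting is.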
I will try to disable BGP dampening after this crash.
Ok, got the crash and core dump:
Jul 14 13:15:29 R01 BGP[64622]: in thread bgp_process_packet scheduled from bgpd/bgp_packet.c:2676 bgp_process_packet()
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/frr/bgpd(_start+0x2a) [0x55fdef3f308a]
Jul 14 13:15:29 R01 BGP[64622]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7f1f43b2ad0a]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/frr/bgpd(main+0x38e) [0x55fdef3f134e]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(frr_run+0xe8) [0x7f1f43fd95a8]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(thread_call+0x7d) [0x7f1f4401b2ad]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/frr/bgpd(bgp_process_packet+0x466) [0x55fdef448916]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/frr/bgpd(+0x192ba8) [0x55fdef445ba8]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/frr/bgpd(bgp_nlri_parse_ip+0xb7) [0x55fdef461da7]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/frr/bgpd(bgp_update+0x1a15) [0x55fdef460b75]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/frr/bgpd(bgp_damp_update+0x17c) [0x55fdef5255dc]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/frr/bgpd(+0x271efb) [0x55fdef524efb]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/frr/bgpd(+0x271c26) [0x55fdef524c26]
Jul 14 13:15:29 R01 BGP[64622]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f1f43cdd140]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0xc5651) [0x7f1f4400a651]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(zlog_signal+0xf5) [0x7f1f43fe0635]
Jul 14 13:15:29 R01 BGP[64622]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(zlog_backtrace_sigsafe+0x6d) [0x7f1f43fe043d]
Jul 14 13:15:29 R01 BGP[64622]: Received signal 11 at 1626257729 (si_addr 0x0, PC 0x55fdef524c26); aborting...
Here is the core dump for R01: https://nextcloud.easynet.dev/index.php/s/PHCqmLXQccxtL78 Here is the pcap for this crash: https://nextcloud.easynet.dev/index.php/s/XxbeSHif5qyp7Gz And here is my frr-8.1-dev build: https://nextcloud.easynet.dev/index.php/s/6c2HcjHYC5tkFaC
I also have the debug symbols installed on my systems.
I disabled BGP dampening.
Could you run this on your machine and paste the output here?
gdb -batch -ex 'bt full' /usr/lib/frr/bgpd /<path_to_coredump>/core-bgpd-11-115-124-64622-1626257729
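The core file name itself carries useful cross-check data. The sketch below assumes the kernel's core_pattern was set to something like core-%e-%s-%u-%g-%p-%t (executable, signal, UID, GID, PID, epoch time). That pattern is an assumption, because the thread does not show it, but the signal (11), PID (64622) and timestamp (1626257729) fields do match the R01 log lines above. The function name parse_core_name is hypothetical.

```python
from datetime import datetime, timezone

# Assumed field order, matching a core_pattern of core-%e-%s-%u-%g-%p-%t.
FIELDS = ("executable", "signal", "uid", "gid", "pid", "timestamp")


def parse_core_name(name):
    """Split a core file name into its assumed core_pattern fields."""
    parts = name.split("-")
    # Note: an executable name containing '-' would break this simple split.
    if len(parts) != len(FIELDS) + 1 or parts[0] != "core":
        raise ValueError(f"unexpected core file name: {name!r}")
    info = dict(zip(FIELDS, parts[1:]))
    for key in FIELDS[1:]:
        info[key] = int(info[key])
    info["utc"] = datetime.fromtimestamp(info["timestamp"], tz=timezone.utc)
    return info


core = parse_core_name("core-bgpd-11-115-124-64622-1626257729")
print(core["executable"], core["signal"], core["pid"], core["utc"].isoformat())
```

The decoded time, 10:15:29 UTC, corresponds to 13:15:29 EEST, which is the time of the R01 crash log.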
Sure. Here it is:
# gdb -batch -ex 'bt full' /usr/lib/frr/bgpd /opt/coredump/core-bgpd-11-115-124-64622-1626257729
[New LWP 64622]
[New LWP 64623]
[New LWP 64630]
[New LWP 64624]
[New LWP 64625]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/bgpd -d -F traditional -A 127.0.0.1 -M rpki'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f1f43a1d580 (LWP 64622))]
#0 raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
set = {__val = {18446744067266829055, 140733737640664, 167, 0, 0, 0, 7018141387277233769, 8246195854090838116, 7021216768532505455, 139772261448503, 140733737640624, 139772261448442, 3611922223501156384, 3834033537019820080, 8097313801230169649, 8097317594494889842}}
pid = <optimized out>
tid = <optimized out>
ret = <optimized out>
#1 0x00007f1f4400a68c in core_handler (signo=11, siginfo=0x7fff2070a5f0, context=<optimized out>) at lib/sigevent.c:262
pc = 0x55fdef524c26 <bgp_reuselist_del+54>
sa_default = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x0}
sigset = {__val = {9216, 0 <repeats 15 times>}}
#2 <signal handler called>
No locals.
#3 0x000055fdef524c26 in bgp_reuselist_del (list=0x55fdf13e8e90, node=0x7fff2070aa70) at bgpd/bgp_damp.c:57
curelm = 0x55fe240eb2a0
__func__ = "bgp_reuselist_del"
#4 0x000055fdef524efb in bgp_reuse_list_delete (bdi=<optimized out>, bdc=<optimized out>, bdc=<optimized out>) at bgpd/bgp_damp.c:186
list = 0x55fdf13e8e90
rn = 0x55fe329ba7f0
#5 0x000055fdef5255dc in bgp_damp_update (path=path@entry=0x55fe22b4f020, dest=dest@entry=0x55fe22b4ef20, afi=afi@entry=AFI_IP, safi=safi@entry=SAFI_UNICAST) at bgpd/bgp_damp.c:423
t_now = <optimized out>
bdi = 0x55fe27c03bd0
status = <optimized out>
bdc = 0x55fdf11d5910
__func__ = {<optimized out> <repeats 16 times>}
#6 0x000055fdef460b75 in bgp_update (peer=peer@entry=0x7f1f40c92010, p=p@entry=0x7fff2070ae70, addpath_id=addpath_id@entry=0, attr=0x7fff2070af80, afi=afi@entry=AFI_IP, safi=safi@entry=SAFI_UNICAST, type=<optimized out>, sub_type=<optimized out>, prd=0x0, label=0x0, num_labels=<optimized out>, soft_reconfig=<optimized out>, evpn=0x0) at bgpd/bgp_route.c:4075
ret = <optimized out>
aspath_loop_count = <optimized out>
dest = 0x55fe22b4ef20
bgp = 0x55fdf11d3040
new_attr = {aspath = 0x55fe13e32ab0, community = 0x55fdfa9b75d0, refcnt = 0, flag = 151, nexthop = {s_addr = 4186198876}, med = 0, local_pref = 300, nh_ifindex = 0, origin = 0 '\000', pmsi_tnl_type = PMSI_TNLTYPE_NO_INFO, rmap_change_flags = 0, mp_nexthop_global = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, mp_nexthop_local = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, nh_lla_ifindex = 0, ecommunity = 0x0, ipv6_ecommunity = 0x0, lcommunity = 0x0, cluster1 = 0x0, transit = 0x0, mp_nexthop_global_in = {s_addr = 0}, aggregator_addr = {s_addr = 0}, originator_id = {s_addr = 0}, weight = 0, aggregator_as = 0, mp_nexthop_len = 0 '\000', mp_nexthop_prefer_global = 0 '\000', sticky = 0 '\000', default_gw = 0 '\000', router_flag = 0 '\000', es_flags = 0 '\000', tag = 0, label_index = 4294967295, label = 4294836223, srv6_vpn = 0x0, srv6_l3vpn = 0x0, encap_tunneltype = 0, encap_subtlvs = 0x0, vnc_subtlvs = 0x0, evpn_overlay = {type = OVERLAY_INDEX_TYPE_NONE, eth_s_id = {val = "\000\000\000\000\000\000\000\000\000"}, gw_ip = {ipv4 = {s_addr = 0}, ipv6 = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}, mm_seqnum = 0, mm_sync_seqnum = 0, rmac = {octet = "\000\000\000\000\000"}, distance = 0 '\000', rmap_table_id = 0, link_bw = 0, esi = {val = "\000\000\000\000\000\000\000\000\000"}, srte_color = 0, df_pref = 0, df_alg = 0 '\000'}
attr_new = 0x55fe395949a0
pi = <optimized out>
new = <optimized out>
extra = <optimized out>
reason = <optimized out>
pfx_buf = "@\255p \377\177\000\000X\255p \377\177\000\000@\255p \377\177\000\000\220o\233\372\375U\000\000\327\017\027K\376U\000\000\b\000\000\000\000\000\000\000\300\000\000\000\000\000\000\000\377\331?\357\375U\000\000\b\000\000\000\000\000\000\000\300\000\000\000\000\000\000\000p\255p \377\177\000\000<\335?\357\375U", '\000' <repeats 18 times>, "\020 \311@\037\177\000\000\020 \311@\037\177\000\000\260\255p \377\177\000\000\034\336?\357\375U\000\000\260\255p \377\177\000\000\000\246\366҅W\202\344\332\017\027K\376U\000\000\200\257p \377\177\000\000\020 \311@\037\177\000\000\327\017\027K"
connected = 0
do_loop_check = <optimized out>
has_valid_label = <optimized out>
nh_afi = <optimized out>
pi_type = <optimized out>
pi_sub_type = <optimized out>
vnc_implicit_withdraw = <optimized out>
same_attr = <optimized out>
__func__ = "bgp_update"
pfxprint = {<optimized out> <repeats 80 times>}
label_decoded = <optimized out>
#7 0x000055fdef461da7 in bgp_nlri_parse_ip (peer=peer@entry=0x7f1f40c92010, attr=attr@entry=0x7fff2070af80, packet=0x7fff2070af20) at bgpd/bgp_route.c:5508
pnt = 0x55fe4b17102f "O\216=\030O\216\062\030O\216\066\030.\023-\030O\216<\025O\216\060\030O\216\070\030O\216\063\030Y\333\r\030\271\022\376\030O\216:\030O\216\067\030O\216\071\030m\243\303\030.\023*\030.\023(\030.\023)\030.\023/\030.\023+\030O\216;\030Y\333\f蒷\267"
lim = 0x55fe4b171082 "蒷\267"
p = {family = 2 '\002', prefixlen = 24, u = {prefix = 79 'O', prefix4 = {s_addr = 4034127}, prefix6 = {__in6_u = {__u6_addr8 = "O\216=", '\000' <repeats 12 times>, __u6_addr16 = {36431, 61, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {4034127, 0, 0, 0}}}, lp = {id = {s_addr = 4034127}, adv_router = {s_addr = 0}}, prefix_eth = {octet = "O\216=\000\000"}, val = "O\216=", '\000' <repeats 12 times>, val32 = {4034127, 0, 0, 0}, ptr = 4034127, prefix_evpn = {route_type = 79 'O', u = {_ead_addr = {esi = {val = "\000\000\000\000\000\000\000\000\000"}, eth_tag = 0, ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}, _macip_addr = {eth_tag = 0, ip_prefix_length = 0 '\000', mac = {octet = "\000\000\000\000\000"}, ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}, _imet_addr = {eth_tag = 0, ip_prefix_length = 0 '\000', ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}, _es_addr = {esi = {val = "\000\000\000\000\000\000\000\000\000"}, ip_prefix_length = 0 '\000', ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}, _prefix_addr = {eth_tag = 0, ip_prefix_length = 0 '\000', ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}}}, prefix_flowspec = {family = 79 'O', prefixlen = 61, ptr = 0}}}
psize = <optimized out>
ret = <optimized out>
afi = AFI_IP
safi = SAFI_UNICAST
addpath_encoded = 0
addpath_id = 0
__func__ = {<optimized out> <repeats 18 times>}
#8 0x000055fdef4450d3 in bgp_nlri_parse (peer=peer@entry=0x7f1f40c92010, attr=attr@entry=0x7fff2070af80, packet=packet@entry=0x7fff2070af20, mp_withdraw=mp_withdraw@entry=0) at bgpd/bgp_packet.c:311
No locals.
#9 0x000055fdef445ba8 in bgp_update_receive (peer=peer@entry=0x7f1f40c92010, size=size@entry=207) at bgpd/bgp_packet.c:1720
i = 0
ret = <optimized out>
nlri_ret = <optimized out>
end = <optimized out>
s = <optimized out>
attr = {aspath = 0x55fe13e32ab0, community = 0x55fdfa9b6f90, refcnt = 0, flag = 135, nexthop = {s_addr = 4186198876}, med = 0, local_pref = 0, nh_ifindex = 0, origin = 0 '\000', pmsi_tnl_type = PMSI_TNLTYPE_NO_INFO, rmap_change_flags = 0, mp_nexthop_global = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, mp_nexthop_local = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, nh_lla_ifindex = 0, ecommunity = 0x0, ipv6_ecommunity = 0x0, lcommunity = 0x0, cluster1 = 0x0, transit = 0x0, mp_nexthop_global_in = {s_addr = 0}, aggregator_addr = {s_addr = 0}, originator_id = {s_addr = 0}, weight = 0, aggregator_as = 0, mp_nexthop_len = 0 '\000', mp_nexthop_prefer_global = 0 '\000', sticky = 0 '\000', default_gw = 0 '\000', router_flag = 0 '\000', es_flags = 0 '\000', tag = 0, label_index = 4294967295, label = 4294836223, srv6_vpn = 0x0, srv6_l3vpn = 0x0, encap_tunneltype = 0, encap_subtlvs = 0x0, vnc_subtlvs = 0x0, evpn_overlay = {type = OVERLAY_INDEX_TYPE_NONE, eth_s_id = {val = "\000\000\000\000\000\000\000\000\000"}, gw_ip = {ipv4 = {s_addr = 0}, ipv6 = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}, mm_seqnum = 0, mm_sync_seqnum = 0, rmac = {octet = "\000\000\000\000\000"}, distance = 0 '\000', rmap_table_id = 0, link_bw = 0, esi = {val = "\000\000\000\000\000\000\000\000\000"}, srte_color = 0, df_pref = 0, df_alg = 0 '\000'}
attribute_len = <optimized out>
update_len = 164
withdraw_len = 0
restart = false
NLRI_UPDATE = NLRI_UPDATE
NLRI_WITHDRAW = NLRI_WITHDRAW
NLRI_MP_UPDATE = NLRI_MP_UPDATE
NLRI_MP_WITHDRAW = NLRI_MP_WITHDRAW
NLRI_TYPE_MAX = NLRI_TYPE_MAX
nlris = {{afi = 1, safi = 1 '\001', nlri = 0x55fe4b170fde "\030m\243\306\030.\023.\030\271\022\374\030\271\371\254\030\271\371\255\030m\243\302\030m\243\307\030\271\022\377\030Y\333\n\030Y\333\017\030Y\333\021\030Y\333\b\030Y\333\t\030Y\333\016\030m\243\304\030\227\354\300\030\227\354\305\030O\216\065\030Y\333\v\030O\216>\030O\216=\030O\216\062\030O\216\066\030.\023-\030O\216<\025O\216\060\030O\216\070\030O\216\063\030Y\333\r\030\271\022\376\030O\216:\030O\216\067\030O\216\071\030m\243\303\030.\023*\030.\023(\030.\023)\030.\023/\030.\023+\030O\216;\030Y\333\f蒷\267", length = 164}, {afi = 0, safi = 0 '\000', nlri = 0x0, length = 0}, {afi = 0, safi = 0 '\000', nlri = 0x0, length = 0}, {afi = 0, safi = 0 '\000', nlri = 0x0, length = 0}}
__func__ = "bgp_update_receive"
attr_parse_ret = <optimized out>
#10 0x000055fdef448916 in bgp_process_packet (thread=<optimized out>) at bgpd/bgp_packet.c:2585
type = 2 '\002'
xref_p_100 = 0x55fdef663f20 <_xref.132>
size = 207
notify_data_length = {<optimized out>, <optimized out>}
_xrefdata = {xref = 0x55fdef663f20 <_xref.132>, uid = "TJQQE-0PPJT\000\000\000\000", hashstr = 0x55fdef577ce8 "%s: BGP NOTIFY receipt failed for peer: %s", hashu32 = {3, 33554456}}
_xref = {xref = {xrefdata = 0x55fdef6e8ac0 <_xrefdata.123>, type = XREFT_LOGMSG, line = 2598, file = 0x55fdef578159 "bgpd/bgp_packet.c", func = 0x55fdef5784f0 <__func__.134> "bgp_process_packet"}, fmtstring = 0x55fdef577ce8 "%s: BGP NOTIFY receipt failed for peer: %s", priority = 3, ec = 33554456, args = 0x55fdef55688b "__func__, peer->host"}
_xrefdata = {xref = 0x55fdef663ee0 <_xref.131>, uid = "YWXN7-Q2X5C\000\000\000\000", hashstr = 0x55fdef577d18 "%s: BGP KEEPALIVE receipt failed for peer: %s", hashu32 = {3, 33554457}}
_xref = {xref = {xrefdata = 0x55fdef6e8b00 <_xrefdata.124>, type = XREFT_LOGMSG, line = 2610, file = 0x55fdef578159 "bgpd/bgp_packet.c", func = 0x55fdef5784f0 <__func__.134> "bgp_process_packet"}, fmtstring = 0x55fdef577d18 "%s: BGP KEEPALIVE receipt failed for peer: %s", priority = 3, ec = 33554457, args = 0x55fdef55688b "__func__, peer->host"}
xref_p_101 = 0x55fdef663ee0 <_xref.131>
peer = 0x7f1f40c92010
rpkt_quanta_old = <optimized out>
fsm_update_result = <optimized out>
mprc = <optimized out>
processed = 0
__func__ = "bgp_process_packet"
#11 0x00007f1f4401b2ad in thread_call (thread=thread@entry=0x7fff2070b240) at lib/thread.c:1919
before = {cpu = {tv_sec = 221, tv_nsec = 370082814}, real = {tv_sec = 131400, tv_usec = 630166}}
after = {cpu = {tv_sec = 221, tv_nsec = 370076628}, real = {tv_sec = 131400, tv_usec = 630160}}
cputime_enabled_here = true
walltime = <optimized out>
cputime = 0
exp = <optimized out>
__func__ = {<optimized out> <repeats 12 times>}
#12 0x00007f1f43fd95a8 in frr_run (master=0x55fdf0811100) at lib/libfrr.c:1161
instanceinfo = '\000' <repeats 63 times>
__func__ = "frr_run"
thread = {type = 4 '\004', add_type = 3 '\003', threaditem = {si = {next = 0x0}}, timeritem = {hi = {index = 0}}, ref = 0x7f1f40d33d50, master = 0x55fdf0811100, func = 0x55fdef4484b0 <bgp_process_packet>, arg = 0x7f1f40c92010, u = {val = 0, fd = 0, sands = {tv_sec = 0, tv_usec = 0}}, real = {tv_sec = 131400, tv_usec = 630166}, hist = 0x7f1f3c006e70, yield = 10000, xref = 0x55fdef663de0 <_xref.127>, mtx = pthread_mutex_t = {Type = Normal, Status = Not acquired, Robust = No, Shared = No, Protocol = None}}
#13 0x000055fdef3f134e in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:542
opt = -1
tmp_port = <optimized out>
bgp_port = 179
addresses = 0x55fdf08072a0
no_fib_flag = <optimized out>
no_zebra_flag = 0
skip_runas = 0
instance = 0
buffer_size = 65536
address = <optimized out>
node = <optimized out>
__func__ = {<optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>}
_xref = {xref = {xrefdata = 0x0, type = XREFT_ASSERT, line = 531, file = 0x55fdef52f0bf "bgpd/bgp_main.c", func = 0x55fdef52f5ef <__func__.16> "main"}, expr = 0x55fdef60b7c7 "node", extra = 0x0, args = 0x0}
xref_p_19 = 0x55fdef64b840 <_xref.20>
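With a `bt full` dump this long, it helps to pick out the frame that was executing when the signal arrived, which is the first frame after the `<signal handler called>` marker. The following is a minimal sketch assuming the frame-line layout shown in the output above. The regular expressions and the name faulting_frame are illustrative, not a gdb API.

```python
import re

# Matches lines such as:
#   #3 0x... in bgp_reuselist_del (list=..., node=...) at bgpd/bgp_damp.c:57
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+(?:0x[0-9a-f]+ in )?(?P<func>\w+) \(.*\) "
    r"at (?P<file>\S+):(?P<line>\d+)$"
)
SIGNAL_MARKER_RE = re.compile(r"^#\d+\s+<signal handler called>$")


def faulting_frame(bt_lines):
    """Return the first parsed frame after '<signal handler called>'."""
    seen_marker = False
    for raw in bt_lines:
        line = raw.strip()
        if SIGNAL_MARKER_RE.match(line):
            seen_marker = True
            continue
        m = FRAME_RE.match(line)
        if m and seen_marker:
            return {
                "frame": int(m["num"]),
                "func": m["func"],
                "file": m["file"],
                "line": int(m["line"]),
            }
    return None


sample_bt = [
    "#0 raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50",
    "#1 0x00007f1f4400a68c in core_handler (signo=11, siginfo=0x7fff2070a5f0, context=<optimized out>) at lib/sigevent.c:262",
    "#2 <signal handler called>",
    "No locals.",
    "#3 0x000055fdef524c26 in bgp_reuselist_del (list=0x55fdf13e8e90, node=0x7fff2070aa70) at bgpd/bgp_damp.c:57",
    "#4 0x000055fdef524efb in bgp_reuse_list_delete (bdi=<optimized out>, bdc=<optimized out>, bdc=<optimized out>) at bgpd/bgp_damp.c:186",
]
print(faulting_frame(sample_bt))
```

Applied to the R01 backtrace, this reports bgp_reuselist_del at bgpd/bgp_damp.c:57, called from bgp_reuse_list_delete and bgp_damp_update.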
Even with BGP dampening disabled I still got a crash.
Can we have a coredump when it crashes without BGP dampening enabled as well?
Sure. I'm waiting for it :).
This is the core dump on R02 with frr 8.0-dev:
config-process-for-coredump frr-bgp-debug-core-dump
root@R02:/opt/coredump# ./frr-bgp-debug-core-dump
0x0d7310
[New LWP 36681]
[New LWP 36682]
[New LWP 36747]
[New LWP 36684]
[New LWP 36683]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/bgpd -d -F traditional -A 127.0.0.1 -M rpki'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7fb1f57e3c80 (LWP 36681))]
add symbol table from file "/usr/lib/debug/.build-id/c0/419e45584c90e921b0993ea1a5881140442421.debug" at
.text_addr = 0xd7310
#0 raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:50
set = {__val = {18446744067266829055, 140725959385938, 167, 0, 0, 0, 7018141387277233769, 8246195854090838116, 7021216768532505455, 140402314008215, 140725959385904, 140402314008154, 3611922223501156384, 3834033537019820080, 8315161139553056305, 2914783753315442547}}
pid = <optimized out>
tid = <optimized out>
ret = <optimized out>
#1 0x00007fb1f60fbb4c in core_handler (signo=11, siginfo=0x7ffd50d1e670, context=<optimized out>) at lib/sigevent.c:262
pc = 0x5577830a38f6 <bgp_reuselist_del+54>
sa_default = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x0}
sigset = {__val = {9216, 0 <repeats 15 times>}}
#2 <signal handler called>
No locals.
#3 0x00005577830a38f6 in bgp_reuselist_del (list=0x557784c7ccf8, node=0x7ffd50d1eb10) at bgpd/bgp_damp.c:57
curelm = 0x5577d024c5f0
__func__ = "bgp_reuselist_del"
#4 0x00005577830a3bcb in bgp_reuse_list_delete (bdi=<optimized out>, bdc=<optimized out>, bdc=<optimized out>) at bgpd/bgp_damp.c:186
list = 0x557784c7ccf8
rn = 0x5577d2c39cc0
#5 0x00005577830a42ac in bgp_damp_update (path=path@entry=0x5577d1049850, dest=dest@entry=0x5577b45ffaa0, afi=afi@entry=AFI_IP, safi=safi@entry=SAFI_UNICAST) at bgpd/bgp_damp.c:424
t_now = <optimized out>
bdi = 0x557786483280
status = <optimized out>
bdc = 0x557784af7c80
__func__ = {<optimized out> <repeats 16 times>}
#6 0x0000557783001221 in bgp_update (peer=peer@entry=0x7fb1f29d0010, p=p@entry=0x7ffd50d1eed0, addpath_id=addpath_id@entry=0, attr=0x7ffd50d1efe0, afi=afi@entry=AFI_IP, safi=safi@entry=SAFI_UNICAST, type=<optimized out>, sub_type=<optimized out>, prd=0x0, label=0x0, num_labels=<optimized out>, soft_reconfig=<optimized out>, evpn=0x0) at bgpd/bgp_route.c:4083
ret = <optimized out>
aspath_loop_count = <optimized out>
dest = 0x5577b45ffaa0
bgp = 0x557784af5460
new_attr = {aspath = 0x5577c9987460, community = 0x557785abd810, refcnt = 0, flag = 32919, nexthop = {s_addr = 801695425}, med = 0, local_pref = 350, nh_ifindex = 0, origin = 0 '\000', pmsi_tnl_type = PMSI_TNLTYPE_NO_INFO, rmap_change_flags = 0, mp_nexthop_global = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, mp_nexthop_local = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, nh_lla_ifindex = 0, ecommunity = 0x5577860be380, ipv6_ecommunity = 0x0, lcommunity = 0x0, cluster1 = 0x0, transit = 0x0, mp_nexthop_global_in = {s_addr = 0}, aggregator_addr = {s_addr = 0}, originator_id = {s_addr = 0}, weight = 0, aggregator_as = 0, mp_nexthop_len = 0 '\000', mp_nexthop_prefer_global = 0 '\000', sticky = 0 '\000', default_gw = 0 '\000', router_flag = 0 '\000', es_flags = 0 '\000', tag = 0, label_index = 4294967295, label = 4294836223, srv6_vpn = 0x0, srv6_l3vpn = 0x0, encap_tunneltype = 0, encap_subtlvs = 0x0, vnc_subtlvs = 0x0, evpn_overlay = {gw_ip = {ipv4 = {s_addr = 0}, ipv6 = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}, mm_seqnum = 0, mm_sync_seqnum = 0, rmac = {octet = "\000\000\000\000\000"}, distance = 0 '\000', rmap_table_id = 0, link_bw = 0, esi = {val = "\000\000\000\000\000\000\000\000\000"}, srte_color = 0, df_pref = 0, df_alg = 0 '\000'}
attr_new = 0x5577cb25b250
pi = <optimized out>
new = <optimized out>
extra = <optimized out>
reason = <optimized out>
pfx_buf = "\340\355\321P\375\177\000\000\370\355\321P\375\177\000\000\340\355\321P\375\177\000\000\200\343\v\206wU\000\000\332\320\335\354\261\177\000\000\020\000\000\000\000\000\000\000\300\000\000\000\000\000\000\000\066\231\372\202wU\000\000\020\000\000\000\000\000\000\000\300\000\000\000\000\000\000\000\020\356\321P\375\177\000\000\000=\224+(\r\030\315\300\000\000\000\000\000\000\000\340\357\321P\375\177\000\000\020\000\235\362\261\177\000\000\320\360\321P\375\177\000"
connected = 0
do_loop_check = <optimized out>
has_valid_label = <optimized out>
nh_afi = <optimized out>
pi_type = <optimized out>
pi_sub_type = <optimized out>
vnc_implicit_withdraw = <optimized out>
same_attr = <optimized out>
__func__ = "bgp_update"
pfxprint = {<optimized out> <repeats 80 times>}
label_decoded = <optimized out>
#7 0x0000557783002207 in bgp_nlri_parse_ip (peer=peer@entry=0x7fb1f29d0010, attr=attr@entry=0x7ffd50d1efe0, packet=0x7ffd50d1ef80) at bgpd/bgp_route.c:5312
pnt = 0x7fb1ecddd0e6 "\303\"\024"
lim = 0x7fb1ecddd0e9 ""
p = {family = 2 '\002', prefixlen = 24, u = {prefix = 195 '\303', prefix4 = {s_addr = 1319619}, prefix6 = {__in6_u = {__u6_addr8 = "\303\"\024", '\000' <repeats 12 times>, __u6_addr16 = {8899, 20, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {1319619, 0, 0, 0}}}, lp = {id = {s_addr = 1319619}, adv_router = {s_addr = 0}}, prefix_eth = {octet = "\303\"\024\000\000"}, val = "\303\"\024", '\000' <repeats 12 times>, val32 = {1319619, 0, 0, 0}, ptr = 1319619, prefix_evpn = {route_type = 195 '\303', u = {_ead_addr = {esi = {val = "\000\000\000\000\000\000\000\000\000"}, eth_tag = 0, ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}, _macip_addr = {eth_tag = 0, ip_prefix_length = 0 '\000', mac = {octet = "\000\000\000\000\000"}, ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}, _imet_addr = {eth_tag = 0, ip_prefix_length = 0 '\000', ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}, _es_addr = {esi = {val = "\000\000\000\000\000\000\000\000\000"}, ip_prefix_length = 0 '\000', ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}, _prefix_addr = {eth_tag = 0, ip_prefix_length = 0 '\000', ip = {ipa_type = IPADDR_NONE, ip = {addr = 0 '\000', _v4_addr = {s_addr = 0}, _v6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}}}}, prefix_flowspec = {family = 195 '\303', prefixlen 
= 20, ptr = 0}}}
psize = <optimized out>
ret = <optimized out>
afi = AFI_IP
safi = SAFI_UNICAST
addpath_encoded = 0
addpath_id = 0
__func__ = {<optimized out> <repeats 18 times>}
#8 0x0000557782fe6583 in bgp_nlri_parse (peer=peer@entry=0x7fb1f29d0010, attr=attr@entry=0x7ffd50d1efe0, packet=packet@entry=0x7ffd50d1ef80, mp_withdraw=mp_withdraw@entry=0) at bgpd/bgp_packet.c:311
No locals.
#9 0x0000557782fe7058 in bgp_update_receive (peer=peer@entry=0x7fb1f29d0010, size=size@entry=86) at bgpd/bgp_packet.c:1720
i = 0
ret = <optimized out>
nlri_ret = <optimized out>
end = <optimized out>
s = <optimized out>
attr = {aspath = 0x5577c9987460, community = 0x5577b751cde0, refcnt = 0, flag = 32903, nexthop = {s_addr = 801695425}, med = 0, local_pref = 0, nh_ifindex = 0, origin = 0 '\000', pmsi_tnl_type = PMSI_TNLTYPE_NO_INFO, rmap_change_flags = 0, mp_nexthop_global = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, mp_nexthop_local = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, nh_lla_ifindex = 0, ecommunity = 0x5577860be380, ipv6_ecommunity = 0x0, lcommunity = 0x0, cluster1 = 0x0, transit = 0x0, mp_nexthop_global_in = {s_addr = 0}, aggregator_addr = {s_addr = 0}, originator_id = {s_addr = 0}, weight = 0, aggregator_as = 0, mp_nexthop_len = 0 '\000', mp_nexthop_prefer_global = 0 '\000', sticky = 0 '\000', default_gw = 0 '\000', router_flag = 0 '\000', es_flags = 0 '\000', tag = 0, label_index = 4294967295, label = 4294836223, srv6_vpn = 0x0, srv6_l3vpn = 0x0, encap_tunneltype = 0, encap_subtlvs = 0x0, vnc_subtlvs = 0x0, evpn_overlay = {gw_ip = {ipv4 = {s_addr = 0}, ipv6 = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}}}, mm_seqnum = 0, mm_sync_seqnum = 0, rmac = {octet = "\000\000\000\000\000"}, distance = 0 '\000', rmap_table_id = 0, link_bw = 0, esi = {val = "\000\000\000\000\000\000\000\000\000"}, srte_color = 0, df_pref = 0, df_alg = 0 '\000'}
attribute_len = <optimized out>
update_len = 4
withdraw_len = 0
restart = false
NLRI_UPDATE = NLRI_UPDATE
NLRI_WITHDRAW = NLRI_WITHDRAW
NLRI_MP_UPDATE = NLRI_MP_UPDATE
NLRI_MP_WITHDRAW = NLRI_MP_WITHDRAW
NLRI_TYPE_MAX = NLRI_TYPE_MAX
nlris = {{afi = 1, safi = 1 '\001', nlri = 0x7fb1ecddd0e5 "\030\303\"\024", length = 4}, {afi = 0, safi = 0 '\000', nlri = 0x0, length = 0}, {afi = 0, safi = 0 '\000', nlri = 0x0, length = 0}, {afi = 0, safi = 0 '\000', nlri = 0x0, length = 0}}
__func__ = "bgp_update_receive"
attr_parse_ret = <optimized out>
#10 0x0000557782fe9dc6 in bgp_process_packet (thread=<optimized out>) at bgpd/bgp_packet.c:2585
type = 2 '\002'
xref_p_100 = 0x5577831894a0 <_xref.132>
size = 86
notify_data_length = {<optimized out>, <optimized out>}
_xrefdata = {xref = 0x5577831894a0 <_xref.132>, uid = "TJQQE-0PPJT\000\000\000\000", hashstr = 0x5577830f4b38 "%s: BGP NOTIFY receipt failed for peer: %s", hashu32 = {3, 33554456}}
_xref = {xref = {xrefdata = 0x5577831dd600 <_xrefdata.123>, type = XREFT_LOGMSG, line = 2598, file = 0x5577830f4fa9 "bgpd/bgp_packet.c", func = 0x5577830f5330 <__func__.134> "bgp_process_packet"}, fmtstring = 0x5577830f4b38 "%s: BGP NOTIFY receipt failed for peer: %s", priority = 3, ec = 33554456, args = 0x5577830d432b "__func__, peer->host"}
_xrefdata = {xref = 0x557783189460 <_xref.131>, uid = "YWXN7-Q2X5C\000\000\000\000", hashstr = 0x5577830f4b68 "%s: BGP KEEPALIVE receipt failed for peer: %s", hashu32 = {3, 33554457}}
_xref = {xref = {xrefdata = 0x5577831dd640 <_xrefdata.124>, type = XREFT_LOGMSG, line = 2610, file = 0x5577830f4fa9 "bgpd/bgp_packet.c", func = 0x5577830f5330 <__func__.134> "bgp_process_packet"}, fmtstring = 0x5577830f4b68 "%s: BGP KEEPALIVE receipt failed for peer: %s", priority = 3, ec = 33554457, args = 0x5577830d432b "__func__, peer->host"}
xref_p_101 = 0x557783189460 <_xref.131>
peer = 0x7fb1f29d0010
rpkt_quanta_old = <optimized out>
fsm_update_result = <optimized out>
mprc = <optimized out>
processed = 1
__func__ = "bgp_process_packet"
#11 0x00007fb1f610c023 in thread_call (thread=thread@entry=0x7ffd50d1f390) at lib/thread.c:1825
realtime = 93971809340488
cputime = 140725959390080
exp = <optimized out>
helper = 140402314231509
before = {cpu = {ru_utime = {tv_sec = 152, tv_usec = 33013}, ru_stime = {tv_sec = 42, tv_usec = 513345}, {ru_maxrss = 1650468, __ru_maxrss_word = 1650468}, {ru_ixrss = 0, __ru_ixrss_word = 0}, {ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0}, {ru_minflt = 401925, __ru_minflt_word = 401925}, {ru_majflt = 11, __ru_majflt_word = 11}, {ru_nswap = 0, __ru_nswap_word = 0}, {ru_inblock = 0, __ru_inblock_word = 0}, {ru_oublock = 88, __ru_oublock_word = 88}, {ru_msgsnd = 0, __ru_msgsnd_word = 0}, {ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0, __ru_nsignals_word = 0}, {ru_nvcsw = 305086, __ru_nvcsw_word = 305086}, {ru_nivcsw = 706444, __ru_nivcsw_word = 706444}}, real = {tv_sec = 99046, tv_usec = 465597}}
after = {cpu = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, {ru_maxrss = 0, __ru_maxrss_word = 0}, {ru_ixrss = 0, __ru_ixrss_word = 0}, {ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0}, {ru_minflt = 0, __ru_minflt_word = 0}, {ru_majflt = 0, __ru_majflt_word = 0}, {ru_nswap = 1, __ru_nswap_word = 1}, {ru_inblock = 0, __ru_inblock_word = 0}, {ru_oublock = 88, __ru_oublock_word = 88}, {ru_msgsnd = 0, __ru_msgsnd_word = 0}, {ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0, __ru_nsignals_word = 0}, {ru_nvcsw = 305085, __ru_nvcsw_word = 305085}, {ru_nivcsw = 706444, __ru_nivcsw_word = 706444}}, real = {tv_sec = 99046, tv_usec = -3668167430312280832}}
__func__ = {<optimized out> <repeats 12 times>}
#12 0x00007fb1f60cb198 in frr_run (master=0x557784508c30) at lib/libfrr.c:1155
instanceinfo = '\000' <repeats 63 times>
__func__ = "frr_run"
thread = {type = 4 '\004', add_type = 3 '\003', threaditem = {si = {next = 0x0}}, timeritem = {hi = {index = 0}}, ref = 0x7fb1f2a71d50, master = 0x557784508c30, func = 0x557782fe9960 <bgp_process_packet>, arg = 0x7fb1f29d0010, u = {val = 0, fd = 0, sands = {tv_sec = 0, tv_usec = 0}}, real = {tv_sec = 99046, tv_usec = 465597}, hist = 0x7fb1ec0035d0, yield = 10000, xref = 0x557783182fc0 <_xref.15>, mtx = pthread_mutex_t = {Type = Normal, Status = Not acquired, Robust = No, Shared = No, Protocol = None}}
#13 0x0000557782f95136 in main (argc=<optimized out>, argv=<optimized out>) at bgpd/bgp_main.c:532
opt = -1
tmp_port = <optimized out>
bgp_port = 179
addresses = 0x5577845012d0
no_fib_flag = <optimized out>
no_zebra_flag = 0
skip_runas = 0
instance = 0
buffer_size = 65536
address = <optimized out>
node = <optimized out>
__func__ = {<optimized out>, <optimized out>, <optimized out>, <optimized out>, <optimized out>}
_xref = {xref = {xrefdata = 0x0, type = XREFT_ASSERT, line = 521, file = 0x5577830ad0bf "bgpd/bgp_main.c", func = 0x5577830ad5bf <__func__.16> "main"}, expr = 0x557783145aa7 "node", extra = 0x0, args = 0x0}
xref_p_19 = 0x5577831715a0 <_xref.20>
root@R02:/opt/coredump#
This crash is also with BGP dampening enabled. But you said it's crashing even when it's disabled. Or I'm missing something?
Yes, it had BGP dampening enabled. After the crash I disabled it. I'm still waiting to see whether R01 crashes with dampening off.
I don't know whether that crash happened with BGP dampening on or off. I disabled dampening after I posted the crash log for R01, and I'm not sure the command was accepted, but I couldn't see it in show run afterwards. Unfortunately I didn't have time to set prlimit on the process so that it would drop a coredump.
It has now been running for about 18h. I'm still waiting to see whether it crashes.
R02 has been running for about 15h.
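For reference, a minimal sketch of the prlimit step mentioned above, for raising the core-file limit of an already running daemon without restarting it. It assumes the util-linux prlimit tool, pidof, and root privileges; the helper name enable_core is made up for illustration.

```shell
# Sketch: raise the core-file size limit of a running daemon so that
# the next crash can leave a coredump. Assumes util-linux prlimit and
# root privileges. "enable_core" is a hypothetical helper name.
enable_core() {
    daemon="$1"
    # pidof -s prints a single PID, or fails if the daemon is not running
    pid="$(pidof -s "$daemon")" || {
        echo "$daemon is not running" >&2
        return 1
    }
    # Set both the soft and the hard RLIMIT_CORE to unlimited
    prlimit --pid "$pid" --core=unlimited:unlimited
}
```

Running enable_core bgpd as root would apply it to the BGP daemon. Where the core file ends up still depends on the kernel.core_pattern sysctl of the box.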
I'm trying to replicate the issue, but no joy. Maybe you have a minimal configuration to replicate this crash?
Hi @ton31337,
Unfortunately the daemon didn't crash with dampening off. Regarding the setup, I believe you need to receive the full routing table. I can try to build an FRR testbed connected to my routers, send it the full routing table, and enable dampening on it. Maybe I can get the same crash that way.
I tried building an FRR testbed. After I enabled BGP dampening on the testbed, R01 crashed. The FRR testbed crashed as well. Both are running the latest 8.1.
Any chance I can get this testbed for testing?
I'm trying, but I'm facing another issue: zebra is crashing in the latest 8.1-dev :(. It happens after I add this simple config on my testbed:
# cat frr.conf
frr version 8.1-dev
frr defaults traditional
hostname FRR-01
log syslog informational
service integrated-vtysh-config
!
router bgp 65590
 bgp router-id 10.180.0.40
 neighbor 10.180.0.61 remote-as 43474
 neighbor 10.180.0.61 description R01
 neighbor 10.180.0.61 graceful-restart
 !
 address-family ipv4 unicast
  neighbor 10.180.0.61 soft-reconfiguration inbound
  neighbor 10.180.0.61 route-map rm-PERMIT-ALL in
  neighbor 10.180.0.61 route-map rm-PERMIT-ALL out
 exit-address-family
!
route-map rm-PERMIT-ALL permit 1000
!
segment-routing
 traffic-eng
!
line vty
!
After a few seconds zebra crashes, but without a crash signal:
Jul 16 19:53:33 FRR-01 watchfrr[4129]: [KWE5Q-QNGFC] all daemons up, doing startup-complete notify
Jul 16 19:56:49 FRR-01 zebra[4179]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:56:49 FRR-01 ospfd[4191]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:56:49 FRR-01 ospfd[4194]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:56:49 FRR-01 ospf6d[4197]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:56:49 FRR-01 ldpd[4209]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:56:49 FRR-01 bgpd[4184]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:56:49 FRR-01 isisd[4200]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:56:49 FRR-01 pimd[4203]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:56:49 FRR-01 nhrpd[4229]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:56:49 FRR-01 vrrpd[4246]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:57:11 FRR-01 bgpd[4184]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:57:25 FRR-01 bgpd[4184]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:57:47 FRR-01 bgpd[4184]: [VTVCM-Y2NW3] Configuration Read in Took: 00:00:00
Jul 16 19:58:13 FRR-01 watchfrr[4129]: [WFP93-1D146] configuration write completed with exit code 0
Jul 16 19:58:22 FRR-01 watchfrr[4129]: [WFP93-1D146] configuration write completed with exit code 0
Jul 16 19:58:40 FRR-01 watchfrr[4129]: [HD38Q-0HBRT][EC 268435457] zebra state -> down : read returned EOF
Jul 16 19:58:40 FRR-01 bgpd[4184]: [YAF85-253AP][EC 100663299] buffer_write: write error on fd 15: Broken pipe
Jul 16 19:58:40 FRR-01 bgpd[4184]: [X6B3Y-6W42R][EC 100663302] zclient_send_message: buffer_write failed to zclient fd 15, closing
Jul 16 19:58:41 FRR-01 watchfrr[4129]: [NG1AJ-FP2TQ] Terminating on signal
Jul 16 19:58:41 FRR-01 vrrpd[4246]: [N50WA-0KKX6] Terminating on signal
Jul 16 19:58:41 FRR-01 bgpd[4184]: [ZW1GY-R46JE] Terminating on signal
Jul 16 19:58:41 FRR-01 ospfd[4194]: [W9T04-QWK6B] Terminating on signal
Jul 16 19:58:41 FRR-01 ospfd[4191]: [W9T04-QWK6B] Terminating on signal
Jul 16 19:58:41 FRR-01 pimd[4203]: [J5GFN-WGVKR] Terminating on signal SIGINT
Jul 16 19:58:41 FRR-01 ldpd[4209]: SIGINT received
Jul 16 19:58:41 FRR-01 ldpd[4209]: terminating
Jul 16 19:58:41 FRR-01 pimd[4203]: [TYPP0-VBBYM] pim_if_del_vif: vif_index=0 < 1 on interface pimreg50 ifindex=27
Jul 16 19:58:41 FRR-01 pimd[4203]: [TYPP0-VBBYM] pim_if_del_vif: vif_index=-1 < 1 on interface red ifindex=5
Jul 16 19:58:41 FRR-01 pimd[4203]: [TYPP0-VBBYM] pim_if_del_vif: vif_index=-1 < 1 on interface blue ifindex=6
Jul 16 19:58:41 FRR-01 pimd[4203]: [TYPP0-VBBYM] pim_if_del_vif: vif_index=0 < 1 on interface pimreg55 ifindex=28
Jul 16 19:58:41 FRR-01 pimd[4203]: [TYPP0-VBBYM] pim_if_del_vif: vif_index=-1 < 1 on interface green ifindex=7
Jul 16 19:58:41 FRR-01 pimd[4203]: [TYPP0-VBBYM] pim_if_del_vif: vif_index=0 < 1 on interface pimreg60 ifindex=29
Jul 16 19:58:41 FRR-01 pimd[4203]: [TYPP0-VBBYM] pim_if_del_vif: vif_index=0 < 1 on interface pimreg ifindex=26
Jul 16 19:58:41 FRR-01 isisd[4200]: [ZW9EW-V8QX8] Terminating on signal SIGINT
Jul 16 19:58:41 FRR-01 ospf6d[4197]: [SKCG8-9JAK7] Terminating on signal SIGINT
Jul 16 19:58:43 FRR-01 bgpd[4184]: [WVAM7-7ZYKQ][EC 33554499] sendmsg_nexthop: zclient_send_message() failed
Jul 16 19:58:48 FRR-01 watchfrr[4577]: [T83RR-8SM5G] watchfrr 8.1-dev starting: vty@0
Jul 16 19:58:48 FRR-01 watchfrr[4577]: [ZCJ3S-SPH5S] zebra state -> down : initial connection attempt failed
Jul 16 19:58:48 FRR-01 watchfrr[4577]: [ZCJ3S-SPH5S] bgpd state -> down : initial connection attempt failed
Jul 16 19:58:48 FRR-01 watchfrr[4577]: [ZCJ3S-SPH5S] ospfd-1 state -> down : initial connection attempt failed
Jul 16 19:58:48 FRR-01 watchfrr[4577]: [ZCJ3S-SPH5S] ospfd-2 state -> down : initial connection attempt failed
Jul 16 19:58:48 FRR-01 watchfrr[4577]: [ZCJ3S-SPH5S] ospf6d state -> down : initial connection attempt failed
Jul 16 19:58:48 FRR-01 watchfrr[4577]: [ZCJ3S-SPH5S] isisd state -> down : initial connection attempt failed
Jul 16 19:58:48 FRR-01 watchfrr[4577]: [ZCJ3S-SPH5S] ldpd state -> down : initial connection attempt failed
Jul 16 19:58:48 FRR-01 watchfrr[4577]: [ZCJ3S-SPH5S] pimd state -> down : initial connection attempt failed
This FRR testbed is receiving the full IPv4 table.
OK, fixed the issue. It seems that 4GB of RAM is not enough to keep zebra running with a full table :|. I've now enabled BGP dampening on this testbed.
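For anyone trying to reproduce this, a rough sketch of how dampening could be switched on for the testbed config posted above. The exact command used on the testbed is not shown in this thread, so this assumes plain bgp dampening with default timers under the IPv4 unicast address family. AS 65590 comes from the posted frr.conf, and enable_dampening is a made-up helper name.

```shell
# Sketch: enable BGP route-flap dampening with default timers through
# vtysh. Assumes bgpd is running and vtysh is in PATH.
# "enable_dampening" is a hypothetical helper name.
enable_dampening() {
    asn="$1"
    # Each -c is executed in order, as if typed in an interactive session
    vtysh -c 'configure terminal' \
          -c "router bgp $asn" \
          -c 'address-family ipv4 unicast' \
          -c 'bgp dampening'
}
```

With the testbed config above this would be called as enable_dampening 65590. The same sequence with no bgp dampening as the last command should turn it off again.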
OK, I'm able to replicate the crash on my testbed. I will post the coredump today. Do you need the tcpdump?
No, only the coredump is needed. But it would be best if I could have access to that box.
Sure. Check your email please.
@EasyNetDev would be great if you could check this PR – https://github.com/FRRouting/frr/pull/9215.
Hi,
I've noticed that my FRR is crashing with the following in the logs, on both routers:
R01:
R02:
[X] Did you check if this is a duplicate issue? [X] Did you test it on the latest FRRouting/frr master branch?
Versions
This is the version of FRR: