Open kannankvs opened 4 years ago
Fixed as part of https://github.com/Azure/sonic-buildimage/pull/3763 Please use the latest VS image
@prsunny : Before raising this gitissue, we had integrated this PR in our code and found that the crash exists even with this fix. Once if we see a successful Jenkins image, we will retest it using Jenkins image and update you.
Zebra crash is seen in #136 as well. root@sonic-z9264f-03:~# show mgmt
ManagementVRF : Enabled
Management VRF interfaces in Linux:
137: mgmt: <NOARP,MASTER,UP,LOWER_UP> mtu 65536 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether ba:b0:2d:ed:d8:32 brd ff:ff:ff:ff:ff:ff promiscuity 0
vrf table 5000 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master mgmt state UP mode DEFAULT group default qlen 1000
link/ether 20:04:0f:3e:ff:49 brd ff:ff:ff:ff:ff:ff
139: lo-m: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master mgmt state UNKNOWN mode DEFAULT group default qlen 1000
link/ether de:23:bd:5c:53:6f brd ff:ff:ff:ff:ff:ff
root@sonic-z9264f-03:~#
root@sonic-z9264f-03:~# ls -l /var/core
total 0
root@sonic-z9264f-03:~#
root@sonic-z9264f-03:~#
root@sonic-z9264f-03:~# config vrf del mgmt
root@sonic-z9264f-03:~#
root@sonic-z9264f-03:~#
root@sonic-z9264f-03:/var/core# ls -l
total 3376
-rw-rw-rw- 1 root root 3454852 Nov 25 07:25 zebra.1574666723.39.core.gz
root@sonic-z9264f-03:/var/core#
root@sonic-z9264f-03:~# show ver
SONiC Software Version: SONiC.HEAD.136-dirty-20191124.063021
Distribution: Debian 9.11
Kernel: 4.9.0-9-2-amd64
Build commit: 45e13b99
Build date: Sun Nov 24 06:44:25 UTC 2019
Built by: johnar@jenkins-worker-11
Platform: x86_64-dellemc_z9264f_c3538-r0
HwSKU: DellEMC-Z9264f-C8D112
ASIC: broadcom
Serial Number: CN030K8MDND0093E0004
Uptime: 07:22:43 up 2 min, 1 user, load average: 2.76, 1.91, 0.79
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-brcm HEAD.136-dirty-20191124.063021 7b8636f11261 392MB
docker-syncd-brcm latest 7b8636f11261 392MB
docker-platform-monitor HEAD.136-dirty-20191124.063021 e8677506e2e9 329MB
docker-platform-monitor latest e8677506e2e9 329MB
docker-fpm-frr HEAD.136-dirty-20191124.063021 29cf3340ac1e 321MB
docker-fpm-frr latest 29cf3340ac1e 321MB
docker-lldp-sv2 HEAD.136-dirty-20191124.063021 4858cbda2d0a 299MB
docker-lldp-sv2 latest 4858cbda2d0a 299MB
docker-dhcp-relay HEAD.136-dirty-20191124.063021 71d699dde1c0 289MB
docker-dhcp-relay latest 71d699dde1c0 289MB
docker-database HEAD.136-dirty-20191124.063021 9ab3fbf18464 281MB
docker-database latest 9ab3fbf18464 281MB
docker-sflow HEAD.136-dirty-20191124.063021 59426ace2a6a 305MB
docker-sflow latest 59426ace2a6a 305MB
docker-teamd HEAD.136-dirty-20191124.063021 848067f09a7d 304MB
docker-teamd latest 848067f09a7d 304MB
docker-snmp-sv2 HEAD.136-dirty-20191124.063021 a20666e82d68 335MB
docker-snmp-sv2 latest a20666e82d68 335MB
docker-orchagent HEAD.136-dirty-20191124.063021 f8ae3d721c51 322MB
docker-orchagent latest f8ae3d721c51 322MB
docker-sonic-telemetry HEAD.136-dirty-20191124.063021 e5cca007df02 304MB
docker-sonic-telemetry latest e5cca007df02 304MB
docker-router-advertiser HEAD.136-dirty-20191124.063021 d774696476f2 281MB
docker-router-advertiser latest d774696476f2 281MB
When VRF is getting deleted, kernel sends a netlink for "vrf deletion" and it sends one other netlink for "link deletion" (the link whose name is same as vrf name). Looks like the zebra is not handling the deletion of vrf followed by deletion of the link.
@tylerlinp , can you please check this?
We reproduced it. It still came from the frr issue 5369, which tylerlinp brought out. The root cause is when vrf removed, In zebra, the remained routes in this vrf route table is freed without removing from the fpm_q. When we remove a vrf, kernel remove all routes about it, but only IPV6 routes send out DELROUTE message. By the way, after kernel 4.11, there is a "skip" option to do not send out DELROUTE as IPV4 routes. Our modification in #3763 can only make this DELROUTE work correctly. But in your case, a 0.0.0.0/0 route is created by kernel(from the frr‘s point), so no DELROUTE message here. Someone is working on this bug. It may be a big change, I think we should wait for frr's workaround. Maybe 2 way for us to avoid this problem temporarily.
Yes, crash is because issue frr#5369(bug 1). But it is different #3763 that only fixed delroute failure(bug 2). There should be bug 3 in frr7.2, that is when downing interface the routes pointing to it didn't clear(I think frr7.1 maybe do right), such as default route here existing until vrf mgmt removed.
I found 2 other issues when reading scripts. a) in interfaces.j2, lo-m
create/remove in lo
up/down. I think it is wrong, it should do in mgmt
and should do set master first to avoid vrf moving. b) in interfaces-config.sh, down eth0
and lo
. I think it is useless and that has been done in systemctl restart networking
.
@ryan44guo , @prsunny : Regarding temporary workaround/solution # 1 to remove this route from kernel manually, I am not sure what is missing. As per the current sequence of operations, following things are already happening. 1) Before mvrf is created, 0.0.0.0/0 is getting deleted from the routing table "default" (this is done using "ifdown --force eth0" that happens from interfaces-config.sh). It is getting added to the table "5000" after networking service restarts. 2) Similarly, before mvrf is deleted, 0.0.0.0/0 is getting deleted from table "5000". It is getting added to table "default" after networking service restarts. Let me know if any other information is required on this.
@tylerlinp : W.r.t. crash, can we assume that the crash will be fixed irrespective of the 2 other issues that you mentioned?
Regarding the 2 other issues, please see my inline replies.
a) lo-m create/remove in lo up/down. I think it is wrong, it should do in mgmt and should do set master first to avoid vrf moving.
<
b) in interfaces-config.sh, down eth0 and lo. I think it is useless and that has been done in systemctl restart networking.
<
@kannankvs Yes, the crash will be fixed irrespective of the 2 other issues that I mentioned. To avoid crash, we should assure to delete routes first then to down eth0. In interfaces.j2, change down
to pre-down
would work.
# management port down rules
pre-down ip {{ '-4' if prefix | ipv4 else '-6' }} route delete default via {{ MGMT_INTERFACE[(name, prefix)]['gwaddr'] }} dev eth0 table {{ vrf_table }}
pre-down ip {{ '-4' if prefix | ipv4 else '-6' }} route delete {{ prefix | network }}/{{ prefix | prefixlen }} dev eth0 table {{ vrf_table }}
pre-down ip {{ '-4' if prefix | ipv4 else '-6' }} rule delete from {{ prefix | ip }}/{{ '32' if prefix | ipv4 else '128' }} table {{ vrf_table }}
{% if (MGMT_VRF_CONFIG) and (MGMT_VRF_CONFIG['vrf_global']['mgmtVrfEnabled'] == "true") %}
down cgdelete -g l3mdev:mgmt
{% endif %}
{% for route in MGMT_INTERFACE[(name, prefix)]['forced_mgmt_routes'] %}
pre-down ip rule delete to {{ route }} table {{ vrf_table }}
{% endfor %}
others:
a) At least set master before add ip, or else lo-m add a global ip then move vrf.
up ip link add lo-m type dummy
up ip link set dev lo-m master mgmt
up ip addr add 127.0.0.1/8 dev lo-m
up ip link set lo-m up
I think lo-m should create with mgmt rather than lo up, lo-m and lo has no relation.
b) You are right. It maybe useful to ifdown eth0 first.
c) What is the meaning to add a directly connected route in interfaces.j2?
up ip {{ '-4' if prefix | ipv4 else '-6' }} route add {{ prefix | network }}/{{ prefix | prefixlen }} dev eth0 table {{ vrf_table }}
@tylerlinp : Couple of doubts. 1) w.r.t "pre-down" suggestion, do we need to do this if the zebra crash is fixed? I think that zebra should work even without this "pre-down" change and hence this change is not a mandatory change. If the fix for zebra crash is going to take longer time, we can do this "pre-down" change as a temporary workaround. Reason for my point is that "down" was the solution that already existing even before mvrf got implemented; it means that mvrf implementation is following the solution that already existed. If we need to make this "pre-down" change, we may need to get the approval from the original author of that code as well. What do you say? 2) w.r.t your suggestion about lo-m, is there an issue due to the order in which "master" is being set? With the existing order, what issue are you observing? Is the existing sequence creating any issue in the zebra? In principle, any user can execute these linux commands in any order and the applications need to handle them. If there are any issues with the existing sequence, we shall go ahead and change the sequence like you suggest. What do you say? 3) Regarding moving "lo-m" under "mgmt", do you see any issue in keeping it under "lo"?
NOTE: Now that the 201911 branch has been pulled out, let us try to do the changes that are mandatory for the release. If you think that any of the above changes are mandatory, request you to provide more reasons for the change which can be produced while raising PR.
@tylerlinp : regarding the "connected" routes, we followed the code that already existed before mvrf implementation. It was doing "ip -4 route add 100.104.47.0/24 dev eth0 table default", where it was adding this connected route in table "default". Kernel was already adding this route in table "main" (and "local"). But, this command is adding the route in another routing table "default" and it adds an ip rule in the end to lookup this table. I am not sure about the original reason behind this. We followed the existing code for management VRF also. May be @prsunny might be able to get an answer for this.
@tylerlinp : Couple of doubts.
- w.r.t "pre-down" suggestion, do we need to do this if the zebra crash is fixed? I think that zebra should work even without this "pre-down" change and hence this change is not a mandatory change. If the fix for zebra crash is going to take longer time, we can do this "pre-down" change as a temporary workaround. Reason for my point is that "down" was the solution that already existing even before mvrf got implemented; it means that mvrf implementation is following the solution that already existed. If we need to make this "pre-down" change, we may need to get the approval from the original author of that code as well. What do you say?
- w.r.t your suggestion about lo-m, is there an issue due to the order in which "master" is being set? With the existing order, what issue are you observing? Is the existing sequence creating any issue in the zebra? In principle, any user can execute these linux commands in any order and the applications need to handle them. If there are any issues with the existing sequence, we shall go ahead and change the sequence like you suggest. What do you say?
- Regarding moving "lo-m" under "mgmt", do you see any issue in keeping it under "lo"?
NOTE: Now that the 201911 branch has been pulled out, let us try to do the changes that are mandatory for the release. If you think that any of the above changes are mandatory, request you to provide more reasons for the change which can be produced while raising PR.
@kannankvs
@tylerlinp :
@kannankvs logically, pre-down and down both should be right, but that's only logically. In my opinion, linux kernel shall send out DELROUTE to application which registered netlink "route", cause it had send out ADDROUTE for it. But they said that's their design to reduce the number of messages, so the applications must register "link" when they only want the "route". You can consider that frr should do it in "link" messages, but frr make some bugs about it. In fact, if we do the remove route stuff in "down" hook, it do nothing, cause linux kernel had done it. So we can consider the remove route stuff (in script) is for the DELROUTE messages, that is helpful for the applications which do not register "link" or not do the "link" messages well (as frr). So we must do it in pre-down hook, that is the only way to send out DELROUTE messages for IPV4 route. If we consider that is kernel's bug(although they do not accept), that is a workaround about it. But only the "pre-down" is, the "down" can do nothing.
@ryan44guo : Got it. As mentioned earlier, we will go ahead and make the changes. Just to understand it better, if we have route delete command in "down", is it not getting executed?
@kannankvs : I am not very sure about the "down" hook call is before or after the down command. So I tested it, it had not send out DELROUTE messages. So I think the "down" hook is called after the "down" command, but before post-down. Of course, it is executed, but for these route stuff, the routes had been deleted by kernel, so no action really do in these nonexistence routes deletion actions.
@ryan44guo , @tylerlinp, @prsunny : I changed the interfaces.j2 and raised the PR3853. Request you to kindly review and approve. I am assuming that we shall keep this gitissue open until the FRR code is fixed. If you know who might be fixing it, request you to kindly tag them here.
$ vtysh
Hello, this is FRRouting (version 7.2-sonic).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
# show ip route
zebra is not running
Closing as #3853 is merged
@wendani , is it a new issue? if so, can you raise an issue
I'll make this open to track the proper fix of the issue.
@prsunny will sync up with @pavel-shirshov
was mentioned as seen in latest 201911
I checked this. I found the issue with zebra on master. I see that zebra crashes when we issue /sbin/reboot. I asked @tahmed-dev to investigate further.
@kannankvs, is this issue still being seen? I just used the tip off master on my DUT and could not produces it by either method 1 mentioned in the issue details nor by issuing /sbin/reboot:
admin@str-s6000-acs-14:~$ sudo config vrf add mgmt
admin@str-s6000-acs-14:~$ sudo config vrf del mgmt
admin@str-s6000-acs-14:~$ ls -alrt /var/core
total 4
drwxr-xr-x 2 root root 3 Jun 9 16:23 .
drwxr-xr-x 1 root root 4096 Jun 9 23:58 ..
admin@str-s6000-acs-14:~$ show version
SONiC Software Version: SONiC.master.0-7525fea6
Distribution: Debian 10.4
Kernel: 4.19.0-6-amd64
Build commit: 7525fea6
Build date: Tue Jun 9 16:25:44 UTC 2020
Built by: taahme@taahme-vm0
Description
Zebra crash is seen upon deleting mvrf. From the GDB traces, the route node does not have valid data and the zebra process is crashed while trying to access NULL/Invalid rnode->table pointer. Various methods have been experimented to recreate the issue using linux commands in a script (without using management vrf commands) to isolate the problem. From those methods, it is clear that crash is observed with simple linux scripts as well.
Create VRF - Creation cgroup , eth0 ,mgmt vrf link
Linux commands (ran in script): cgcreate -g l3mdev:mgmt cgset -r l3mdev.master-device=mgmt mgmt ip link add dev mgmt type vrf table 5000 ip link set mgmt up ip link set dev eth0 master mgmt
Delete VRF - Delete eth0,mgmt vrf link
ip link set dev eth0 nomaster ip link delete dev mgmt
Steps to reproduce the issue:
Method1 config vrf add mgmt
config vrf del mgmt root@sonic-z9264f-03:/var/core# ls -l -rw-rw-rw- 1 root root 3509004 Nov 3 12:41 zebra.1572784872.36.core.gz
Method2 As explained earlier, using those linux commands in two scripts. Script1 for creating mvrf and Script2 for deleting mvrf.
Describe the results you received:
On debugging, got the following callback trace in zebra where rn->table is NULL/Invalid. GDB back trace:
0 0x00007f5f7062bfff in raise () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7f5f71a7ed40 (LWP 43))] (gdb) bt
0 0x00007f5f7062bfff in raise () from /lib/x86_64-linux-gnu/libc.so.6
1 0x00007f5f7062d42a in abort () from /lib/x86_64-linux-gnu/libc.so.6
2 0x00007f5f7167939f in core_handler (signo=11, siginfo=0x7ffd2896ceb0,
3
4 srcdest_rnode_table (rn=) at ./lib/srcdest_table.h:90
5 rib_dest_table (dest=0x557be2369110) at zebra/rib.h:471
6 rib_dest_vrf (dest=0x557be2369110) at zebra/rib.h:479
7 netlink_route_info_fill (re=0x0, dest=0x557be2369110, cmd=25,
8 zfpm_netlink_encode_route (cmd=25, dest=dest@entry=0x557be2369110,
9 0x00007f5f6e5ad8ed in zfpm_encode_route (msg_type=,
10 zfpm_build_route_updates () at zebra/zebra_fpm.c:990
11 zfpm_build_updates () at zebra/zebra_fpm.c:1149
12 zfpm_write_cb (thread=0x7ffd289714c0) at zebra/zebra_fpm.c:1187
13 0x00007f5f71686a50 in thread_call (thread=thread@entry=0x7ffd289714c0)
14 0x00007f5f71656fa8 in frr_run (master=0x557be212c9c0) at lib/libfrr.c:1054
15 0x0000557be038c9b3 in main (argc=9, argv=0x7ffd289718c8)
Describe the results you expected: Expected zebra not to crash.
Additional information you deem important (e.g. issue happens only occasionally):
When the linux script is not giving the crash, few additional linux commands in create and delete will help sometimes. Additional Commands in Create script: cgcreate -g l3mdev:mgmt cgset -r l3mdev.master-device=mgmt mgmt
Additional Commands in Delete script: cgdelete -g l3mdev:mgmt
Any master image.