Open sandeep-dutta opened 5 years ago
Attached is the journalctl output for i-32 for the last 5 minutes. journalctl_i32.txt
It's well known that if we go over the tcam limit, vnet will call panic and crash. Did you see a vnet stack trace in /var/log/syslog that would confirm this is the known panic? Of course we need to do something better than crash at too many tcam entries, but I don't think that's in place yet.
We have not seen any panic trace under syslog for this issue. In this case we have not gone over the tcam limit. We have used 16035 entries to store these routes under tcam.
sandeep@invader32:~$ ip route |wc -l
16035
Couple of inferences based on TH spec and the current goes driver support for L3DEFIP (TCAM) table:
Questions for SQA team: please share the sequence and type of entries, with prefix lengths, used in the test case.
Probable next steps for dev team:
Hi Govind, Please find the attached 16k interface file which contains 16K routes. The prefix length for all static route entries is /32. 16k_static_route_interfaces.txt
The issue is again reproducible on i-32. Attaching logs captured by show_tech.py script.
Current goes version running on i-32:
root@invader32:/tmp/log# goes version
v1.2.0-rc.1
root@invader32:/tmp/log# dpkg --list |grep kernel
4.13.0-170-ga4eca81e3486
20190129_014719_069.zip
Govind is working on this (unable to update assignee)
Quick update on debugging:
Inferences:
TBD:
Hi Govind,
While executing regression with the following GoES & kernel versions, we noticed that the vnetd service on i-30 (172.17.2.30) was not OK.
root@invader30:/tmp/log# goes version
v1.2.0-rc0
root@invader30:/tmp/log# dpkg --list |grep kernel
ii kmod 18-3 amd64 tools for managing Linux kernel modules
ii libdrm2:amd64 2.4.58-2 amd64 Userspace interface to kernel DRM services -- runtime
ii linux-image-4.13.0-platina-mk1 4.13.0-178-g13e3790c8eac amd64 Linux kernel, version 4.13.0-platina-mk1
ii rsyslog 8.4.2-1+deb8u2 amd64 reliable system and kernel logging daemon
Mode - XETH
PCI - OK
Check daemons - OK
Check Redis - OK
Check vnet - Not OK
status: vnetd daemon not responding
The failure occurred while bringing the interfaces down & up and restarting goes, after execution of the 16K static route test case:
ifdown -a --allow vnet
ifup -a --allow vnet
goes restart
I tried running the cmds manually, but vnetd still failed to come up.
Test case steps
Copy the custom interface file with 16K static routes under /etc/network/interfaces
Execute the following cmds to bring down & up the interfaces along with goes restart
ifdown -a --allow vnet
ifup -a --allow vnet
goes restart
Validate if 16K routes are published under linux (ip route sh) & frr (show ip route)
Once the results are validated replace 16K custom interface file with default interface file
Execute the following cmds to bring down & up the interfaces along with goes restart
ifdown -a --allow vnet
ifup -a --allow vnet
goes restart
vnetd fails to come up on i-30
However, the issue did not occur on any of the other invaders of setup-1 (i-29, i-31 & i-32). I have not rebooted the invader and have left it as is, so you can take a look at it. I will try to see if I can reproduce this on any other invader.
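The test case steps above can be sketched as a small shell script. This is only an illustration, not the regression suite's actual code: the function names, the `INTERFACES_FILE` override, and the minimum route count argument are assumptions, and the frr check (`show ip route`) is left out.

```shell
#!/bin/sh
# Sketch of the 16K static route test steps. Function names, the
# INTERFACES_FILE override and the route threshold are hypothetical.

# Bring the vnet interfaces down & up, then restart goes.
bounce_vnet() {
    ifdown -a --allow vnet &&
    ifup -a --allow vnet &&
    goes restart
}

# Number of routes currently published under linux.
route_count() {
    ip route show | wc -l | tr -d ' '
}

# Usage: run_16k_test CUSTOM_FILE DEFAULT_FILE EXPECTED_MIN_ROUTES
run_16k_test() {
    custom=$1
    default=$2
    expected=$3
    target=${INTERFACES_FILE:-/etc/network/interfaces}

    # Install the custom interfaces file and bounce.
    cp "$custom" "$target" || return 1
    bounce_vnet || return 1

    # Validate that the routes are published under linux.
    n=$(route_count)
    if [ "$n" -lt "$expected" ]; then
        echo "only $n routes present, expected at least $expected" >&2
        return 1
    fi

    # Restore the default interfaces file and bounce again.
    cp "$default" "$target" || return 1
    bounce_vnet
}
```

A run on an invader might look like `run_16k_test 16k_static_route_interfaces default_interfaces 16000`, where `default_interfaces` stands for a saved copy of the default file.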
Please find the attached logs generated via show_tech.py script. 20190208_053642_837.zip
Goes Version
root@invader29:/home/sandeep# goes vnetd -version
fe1: v1.1.3
fe1a: v1.1.0
vnet-platina-mk1: v1.0.0
Goes build checksum- 5046b7c2cdea8604d331dd7e5dd2fb9c85fa21ff
Kernel version
root@invader29:/home/sandeep# dpkg --list |grep kernel
ii linux-image-4.13-platina-mk1 4.13-165-gbf3b5fef4591 amd64 Linux kernel, version 4.13-platina-mk1
Noticed that when we add 16K static routes on invader-32 (172.17.2.32) & restart goes after that, the vnetd service fails to come up. However, this issue has been observed only on this invader. The other invaders participating in regression have vnetd up & running after adding 16k routes & restarting goes.
Steps to reproduce
root@invader32:/home/sandeep# cp 16k_static_route_interfaces /etc/network/interfaces
Execute the following cmds which will bring down & up the interfaces & restart goes services
ifdown -a --allow vnet
ifup -a --allow vnet
goes restart
Noticed that vnetd status fails to come up.
root@invader32:/home/sandeep# goes status
GOES status
Mode - XETH
PCI - OK
Check daemons - OK
Check Redis - OK
Check vnet - Not OK
status: vnetd daemon not responding
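To catch this state automatically after a restart, the status output above could be polled from a script. The sketch below is hedged: `vnet_ok` and `wait_for_vnet` are hypothetical helper names, and the success pattern `Check vnet - OK` is an assumption based on the format of the other check lines in the output.

```shell
#!/bin/sh
# Sketch: detect the "vnetd daemon not responding" state from `goes status`.
# Helper names and the success pattern are assumptions.

# Reads status text on stdin; succeeds only if the vnet check reports OK.
# "Check vnet - Not OK" does not match because "Not" precedes "OK".
vnet_ok() {
    grep -q 'Check vnet[[:space:]]*-[[:space:]]*OK'
}

# Usage: wait_for_vnet [TRIES] [DELAY_SECONDS]
# Polls `goes status` until the vnet check passes or the tries run out.
wait_for_vnet() {
    tries=${1:-10}
    delay=${2:-3}
    i=0
    while [ "$i" -lt "$tries" ]; do
        if goes status 2>&1 | vnet_ok; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `goes restart && wait_for_vnet 20 3 || echo "vnetd did not come up"` would flag the failure seen on i-32 without a manual `goes status` check.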