Open vaibhavhd opened 2 years ago
The EVPN-VXLAN feature seems to be interfering with the CPA feature, which is causing a warmboot problem with CPA. @yxieca to start a thread with @adyeung on this issue.
@skbhava from BRCM will follow up.
@vaibhavhd can you please share the techsupport for this issue so we can further debug the reason for the tunnel mapping failure at OA?
This is the tech support file: sonic_dump_str2-7050cx3-acs-02_20221208_184456.tar.gz
Snippet from the logs showing where the issue happened:
Dec 8 18:25:22.412756 str2-7050cx3-acs-02 NOTICE admin: Setting up control plane assistant: 10.64.246.125 ...
Dec 8 18:25:22.768842 str2-7050cx3-acs-02 NOTICE swss#vxlanmgrd: :- doVxlanTunnelCreateTask: Create vxlan tunnel neigh_adv
Dec 8 18:25:22.769421 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- addOperation: Vxlan tunnel 'neigh_adv' was added
Dec 8 18:25:22.777758 str2-7050cx3-acs-02 WARNING swss#orchagent: :- createTunnelHw: creation src = 0
Dec 8 18:25:22.778243 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- create_tunnel: create_tunnel:encapmaplist[0]=0x29000000000710
Dec 8 18:25:22.778243 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- create_tunnel: create_tunnel:encapmaplist[1]=0x29000000000712
Dec 8 18:25:22.779032 str2-7050cx3-acs-02 INFO syncd#syncd: [none] SAI_API_TUNNEL:brcm_sai_tnl_mp_create_tunnel:3485 Setting peer_mode to 1
Dec 8 18:25:22.786426 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- addOperation: Vxlan tunnel map entry 'map_1' for tunnel 'neigh_adv' was created
Dec 8 18:25:22.795888 str2-7050cx3-acs-02 ERR swss#orchagent: :- handleSaiGetStatus: Encountered failure in get operation, SAI API: SAI_API_MIRROR, status: SAI_STATUS_INVALID_PARAMETER
Dec 8 18:25:22.804808 str2-7050cx3-acs-02 INFO systemd-udevd[25462]: Using default interface naming scheme 'v247'.
Dec 8 18:25:22.805009 str2-7050cx3-acs-02 INFO systemd-udevd[25462]: ethtool: autonegotiation is unset or enabled, the speed and duplex are not writable.
Dec 8 18:25:22.809420 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- attach: Attached next hop observer of route 192.168.8.0/25 for destination IP 192.168.8.1
Dec 8 18:25:22.811321 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- updateNextHop: Updating mirror session neighbor_advertiser with route 192.168.8.0/25
Dec 8 18:25:22.812640 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- updateNextHop: next hop IPs: 10.0.0.57@PortChannel101,10.0.0.59@PortChannel102,10.0.0.61@PortChannel103,10.0.0.63@PortChannel104
Dec 8 18:25:22.814270 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- updateNextHop: Updated mirror session state db neighbor_advertiser nexthop to 10.0.0.57@PortChannel101
Dec 8 18:25:22.814411 str2-7050cx3-acs-02 INFO kernel: [ 1768.652470] Bridge: port 26(neigh_adv-1000) entered blocking state
Dec 8 18:25:22.814423 str2-7050cx3-acs-02 INFO kernel: [ 1768.652478] Bridge: port 26(neigh_adv-1000) entered disabled state
Dec 8 18:25:22.816209 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- getNeighborInfo: Mirror session neighbor_advertiser neighbor is PortChannel101
Dec 8 18:25:22.818334 str2-7050cx3-acs-02 INFO kernel: [ 1768.656777] device neigh_adv-1000 entered promiscuous mode
Dec 8 18:25:22.831113 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- activateSession: Activated mirror session neighbor_advertiser
Dec 8 18:25:22.832194 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- createEntry: Created mirror session neighbor_advertiser
Dec 8 18:25:22.835432 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:22.835687 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:22.837280 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:22.837280 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:22.846406 str2-7050cx3-acs-02 INFO kernel: [ 1768.685277] Bridge: port 26(neigh_adv-1000) entered blocking state
Dec 8 18:25:22.846442 str2-7050cx3-acs-02 INFO kernel: [ 1768.685285] Bridge: port 26(neigh_adv-1000) entered forwarding state
Dec 8 18:25:22.968908 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:22.968908 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:22.968908 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:22.968960 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:22.971045 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:22.971045 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:23.182183 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:23.182183 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:24.006192 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:24.006192 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:24.006324 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:24.006428 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:24.182317 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:24.182317 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:25.068175 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:25.068175 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:25.068225 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:25.068285 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:25.182263 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:25.182263 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:25.796644 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- add: Successfully created ACL rule rule_arp in table EVERFLOW
Dec 8 18:25:25.798483 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:25.798483 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:25.822953 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- add: Successfully created ACL rule rule_nd in table EVERFLOW
Dec 8 18:25:25.863219 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:25.863219 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:25.868860 str2-7050cx3-acs-02 NOTICE admin: Pausing orchagent ...
Dec 8 18:25:25.975167 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:25.975167 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:25.975206 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:25.975232 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:26.048098 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: Wait time for response from orchagent set to 2000 milliseconds
Dec 8 18:25:26.048098 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: Number of retries for the request to orchagent is set to 5
Dec 8 18:25:26.048796 str2-7050cx3-acs-02 INFO swss#orchagent_restart_check: :- subscribe: subscribed to RESTARTCHECKREPLY
Dec 8 18:25:26.048796 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: requested orchagent to do warm restart state check, retry count: 0
Dec 8 18:25:26.049122 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: RESTARTCHECK notification for orchagent
Dec 8 18:25:26.049122 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: orchagent|NoFreeze:false|SkipPendingTaskCheck:false
Dec 8 18:25:26.049331 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:26.049356 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:26.049356 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: WarmRestart check found pending tasks:
Dec 8 18:25:26.049396 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: VXLAN_REMOTE_VNI_TABLE:Vlan1000:192.168.8.1|SET|vni:1000
Dec 8 18:25:26.049424 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: Restart check result: NOT_READY
Dec 8 18:25:26.049514 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: RESTARTCHECK failed, orchagent is not ready for warm restart with status NOT_READY
Dec 8 18:25:26.049552 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: requested orchagent to do warm restart state check, retry count: 1
Dec 8 18:25:26.049703 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: RESTARTCHECK notification for orchagent
Dec 8 18:25:26.049703 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: orchagent|NoFreeze:false|SkipPendingTaskCheck:false
Dec 8 18:25:26.050026 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:26.050026 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:26.050026 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: WarmRestart check found pending tasks:
Dec 8 18:25:26.050026 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: VXLAN_REMOTE_VNI_TABLE:Vlan1000:192.168.8.1|SET|vni:1000
Dec 8 18:25:26.050026 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: Restart check result: NOT_READY
Dec 8 18:25:26.050026 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: RESTARTCHECK failed, orchagent is not ready for warm restart with status NOT_READY
Dec 8 18:25:26.050026 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: requested orchagent to do warm restart state check, retry count: 2
Dec 8 18:25:26.050156 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: RESTARTCHECK notification for orchagent
Dec 8 18:25:26.050191 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: orchagent|NoFreeze:false|SkipPendingTaskCheck:false
Dec 8 18:25:26.050268 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:26.050406 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:26.050406 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: WarmRestart check found pending tasks:
Dec 8 18:25:26.050439 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: VXLAN_REMOTE_VNI_TABLE:Vlan1000:192.168.8.1|SET|vni:1000
Dec 8 18:25:26.050439 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: Restart check result: NOT_READY
Dec 8 18:25:26.050506 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: RESTARTCHECK failed, orchagent is not ready for warm restart with status NOT_READY
Dec 8 18:25:26.050564 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: requested orchagent to do warm restart state check, retry count: 3
Dec 8 18:25:26.050873 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: RESTARTCHECK notification for orchagent
Dec 8 18:25:26.050873 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: orchagent|NoFreeze:false|SkipPendingTaskCheck:false
Dec 8 18:25:26.050906 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:26.050906 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:26.050944 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: WarmRestart check found pending tasks:
Dec 8 18:25:26.050944 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: VXLAN_REMOTE_VNI_TABLE:Vlan1000:192.168.8.1|SET|vni:1000
Dec 8 18:25:26.050981 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: Restart check result: NOT_READY
Dec 8 18:25:26.051008 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: RESTARTCHECK failed, orchagent is not ready for warm restart with status NOT_READY
Dec 8 18:25:26.051058 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: requested orchagent to do warm restart state check, retry count: 4
Dec 8 18:25:26.051377 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: RESTARTCHECK notification for orchagent
Dec 8 18:25:26.051377 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: orchagent|NoFreeze:false|SkipPendingTaskCheck:false
Dec 8 18:25:26.051409 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:26.051409 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:26.051450 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: WarmRestart check found pending tasks:
Dec 8 18:25:26.051500 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: VXLAN_REMOTE_VNI_TABLE:Vlan1000:192.168.8.1|SET|vni:1000
Dec 8 18:25:26.051548 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: Restart check result: NOT_READY
Dec 8 18:25:26.051607 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: RESTARTCHECK failed, orchagent is not ready for warm restart with status NOT_READY
Dec 8 18:25:26.051656 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: requested orchagent to do warm restart state check, retry count: 5
Dec 8 18:25:26.051718 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: RESTARTCHECK notification for orchagent
Dec 8 18:25:26.051836 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- doTask: orchagent|NoFreeze:false|SkipPendingTaskCheck:false
Dec 8 18:25:26.051871 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addTunnelUser: Unable to find EVPN VTEP. user=0 remote_vtep=192.168.8.1
Dec 8 18:25:26.052170 str2-7050cx3-acs-02 WARNING swss#orchagent: :- addOperation: Vxlan tunnelPort doesn't exist: 192.168.8.1
Dec 8 18:25:26.052170 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: WarmRestart check found pending tasks:
Dec 8 18:25:26.052170 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: VXLAN_REMOTE_VNI_TABLE:Vlan1000:192.168.8.1|SET|vni:1000
Dec 8 18:25:26.052219 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: Restart check result: NOT_READY
Dec 8 18:25:26.052439 str2-7050cx3-acs-02 NOTICE swss#orchagent_restart_check: :- main: RESTARTCHECK failed, orchagent is not ready for warm restart with status NOT_READY
Dec 8 18:25:26.064588 str2-7050cx3-acs-02 NOTICE admin: warm-reboot failure (10) cleanup ...
Dec 8 18:25:26.075823 str2-7050cx3-acs-02 NOTICE admin: Tearing down control plane assistant: 10.64.246.125 ...
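To make the failing check easier to spot in a long syslog, the pending tasks reported by warmRestartCheck can be pulled out with a short script. This is only a sketch: the function name `pending_tasks` is hypothetical, and it assumes the log line layout shown in the snippet above (a "found pending tasks" line, one line per task, then a "Restart check result" line).

```python
from collections import Counter

def pending_tasks(log_lines):
    """Count each pending task listed by orchagent's warmRestartCheck.

    Assumes the syslog layout seen in the snippet above; other lines
    (including those interleaved between the markers) are ignored.
    """
    counts = Counter()
    in_pending = False
    for line in log_lines:
        if "warmRestartCheck:" not in line:
            continue
        msg = line.split("warmRestartCheck:", 1)[1].strip()
        if msg.startswith("WarmRestart check found pending tasks"):
            in_pending = True       # task lines follow this header
        elif msg.startswith("Restart check result"):
            in_pending = False      # end of this check's task list
        elif in_pending:
            counts[msg] += 1
    return counts

# Three lines copied from the log snippet above.
sample = [
    "Dec 8 18:25:26.049356 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: WarmRestart check found pending tasks:",
    "Dec 8 18:25:26.049396 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: VXLAN_REMOTE_VNI_TABLE:Vlan1000:192.168.8.1|SET|vni:1000",
    "Dec 8 18:25:26.049424 str2-7050cx3-acs-02 NOTICE swss#orchagent: :- warmRestartCheck: Restart check result: NOT_READY",
]
for task, n in pending_tasks(sample).items():
    print(n, task)
```

Run over the full snippet, the same task should show up once per retry, which matches OA staying busy on a single VXLAN_REMOTE_VNI_TABLE entry.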
Thanks. Will analyze the techsupport and get back on this.
@skbhava is there an update? Do you know if this is a SAI issue?
@vaibhavhd Thanks for sharing the tech support. This does not look like a SAI issue. From the logs, the NVO object does not seem to have been created in orchagent, which causes the remote VNI-VLAN map addition to fail. At this point it is not clear whether the problem is in vxlanmgr or vxlanorch: the VXLAN config appears to have been removed by the time the tech support was collected, so we cannot confirm whether the NVO object existed in config db/app db during the problematic state. Are you able to consistently reproduce the issue in your local setup? Can you please share the config you are using to recreate it? Also, if the issue is seen consistently, is it possible to collect the tech support before the VXLAN configs are removed and share it?
@vaibhavhd was VXLAN_EVPN_NVO present in the CONFIG_DB?
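One way to answer that from a saved config is to check a JSON dump of CONFIG_DB for the table. The sketch below is an assumption-laden illustration, not a SONiC API: it assumes the dump is a JSON object keyed by table name, the helper name `missing_tables` is hypothetical, and the sample data (including the VXLAN_TUNNEL table name and the src_ip value) is made up for the example.

```python
import json

def missing_tables(config, required=("VXLAN_EVPN_NVO",)):
    """Return the required tables that are absent or empty in a
    CONFIG_DB-style dict (assumed layout: {table_name: {key: fields}})."""
    return [table for table in required if not config.get(table)]

# Hypothetical dump: a tunnel named as in the logs, but no NVO entry.
sample = json.loads("""
{
    "VXLAN_TUNNEL": {"neigh_adv": {"src_ip": "192.0.2.1"}}
}
""")

print(missing_tables(sample))
```

An empty list would mean the NVO table was present in the dump, which would point the investigation away from missing config and toward vxlanmgr or vxlanorch.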
@prsunny to follow up on providing inputs from MSFT
Description
When warmboot is issued with CPA, RESTARTCHECK fails because OA stays continuously busy on a pending task: VXLAN_REMOTE_VNI_TABLE:Vlan1000:192.168.8.1|SET|vni:1000
This issue has surfaced after a recent PR to fix another issue with interface creation during CPA: https://github.com/sonic-net/sonic-utilities/pull/2398
Steps to reproduce the issue:
Issue warmboot with the -c option (with CPA).
Describe the results you received:
Detailed logs:
Describe the results you expected:
CPA creation should succeed and warmboot should proceed.
Output of show version:
Output of show techsupport:
Additional information you deem important (e.g. issue happens only occasionally):