sonic-net / SONiC

Landing page for Software for Open Networking in the Cloud (SONiC) - https://sonic-net.github.io/SONiC/

[Subport] On SONiC-201911 branch Parent interface state change is not linked with child interface #601

Open Quinnwangs opened 4 years ago

Quinnwangs commented 4 years ago

[version]
SONiC Software Version: SONiC.nephos-201911-69879e4-202002261959.0-dirty-20200307.002748
Distribution: Debian 9.12
Kernel: 4.9.0-9-2-amd64
Build commit: 7c915950
Build date: Sat Mar 7 08:47:57 UTC 2020

[topo]
taurus----hub----tp1
           |----tp2

[configuration]

  1. create subport Ethernet124.10 on Ethernet124
  2. create subport Ethernet124.20 on Ethernet124
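For illustration, the configuration above could be expressed as a config fragment like the one this sketch prints. The issue does not show the configuration that was actually loaded: the table name `VLAN_SUB_INTERFACE` and the `admin_status` field are assumptions taken from the SONiC sub port interface design, and `subport_config` is a hypothetical helper.

```python
import json


def subport_config(parent, vlan_ids, admin_status="up"):
    """Build a config fragment with one sub port entry per VLAN ID.

    Assumption: sub ports live in a VLAN_SUB_INTERFACE table keyed by
    "<parent>.<vlan id>" with an admin_status field. This layout is not
    confirmed by the issue text.
    """
    return {
        "VLAN_SUB_INTERFACE": {
            f"{parent}.{vid}": {"admin_status": admin_status} for vid in vlan_ids
        }
    }


# Both sub ports start admin up; step 3 of the test shuts one of them down.
config = subport_config("Ethernet124", [10, 20])
print(json.dumps(config, indent=2))
```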

[operation & result]

  1. load the configuration above
  2. send traffic between tp1 and tp2; each side receives traffic from the other
  3. shutdown subport Ethernet124.10
  4. shutdown parent port Ethernet124
  5. startup parent port Ethernet124

After step 5, traffic is abnormal: tp1 can receive traffic from tp2, but tp2 cannot receive any traffic. According to the config, neither tp1 nor tp2 should receive any traffic from the other.

Conclusion: when the parent interface goes from up to down, the subport status does not change to down. When the parent interface goes from down to up, SONiC adds all routes on this interface, including the subport routes, even though the subport is still down at that point.

Signed-off-by: Quinn Quinn.Wang@nephosinc.com

wendani commented 3 years ago

By design, a sub interface has its own admin status control, separate from the parent interface, from the SAI/ASIC perspective.

We tried ip link set on Ethernet64 to observe the kernel-side behavior. What we found:

1) The sub interface admin state follows parent interface admin state changes: setting Ethernet64 from up to down will admin down Ethernet64.10, and setting Ethernet64 from down to up will admin up Ethernet64.10.

2) If the parent interface is admin down, the sub interface cannot be set admin up; the attempt fails with the error RTNETLINK answers: Network is down.

3) if parent interface is admin up, sub interface can be set admin up or down independently.
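The three observations can be captured in a small toy model. This is only a sketch of the behavior described above, not kernel code; the class `KernelLinkModel` and its method names are hypothetical.

```python
class KernelLinkModel:
    """Toy model of the kernel-side admin state of a parent port and its sub interfaces."""

    def __init__(self, sub_names):
        self.parent_up = True
        self.sub_up = {name: True for name in sub_names}

    def set_parent(self, up):
        # Observation 1: sub interfaces follow every parent admin state change.
        self.parent_up = up
        for name in self.sub_up:
            self.sub_up[name] = up

    def set_sub(self, name, up):
        # Observation 2: admin up is refused while the parent is admin down.
        if up and not self.parent_up:
            raise OSError("RTNETLINK answers: Network is down")
        # Observation 3: otherwise the sub interface is set independently.
        self.sub_up[name] = up


links = KernelLinkModel(["Ethernet64.10"])
links.set_sub("Ethernet64.10", False)  # allowed, the parent is up
links.set_parent(False)
links.set_parent(True)
print(links.sub_up["Ethernet64.10"])   # True, although it was set down earlier
```

The last line is the interesting case: a parent down/up cycle overwrites a sub interface admin down that was set earlier.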

This may cause the kernel-side sub interface admin state to differ from the configured state in the following scenarios:

a) sub interface is configured admin_status up and parent interface admin_status config is changed from up to down

The kernel-side sub interface representation will be admin down according to 2) above.

Shutting down a parent interface from the SONiC utility will admin down the physical port. So even if the kernel-side sub interface could be set admin up, packets would not be able to reach the wire from the ASIC, as the physical port is admin down.

I think this scenario should be fine for both data plane traffic and control plane traffic.

b) sub interface is configured admin_status down and parent interface admin_status config is changed from down to up

In this scenario, the kernel-side sub interface representation will be admin up according to 1) above. The kernel and applications would consider the sub interface up and allow packets to be sent to the ASIC.

For packets coming from the CPU, our current experiment shows that the switch ASIC behaves like a NIC when processing them: even if from-CPU packets go through the pipeline, they are sent out of the corresponding physical port intact, without any pipeline modification.

Assume that in steps 3 to 5 you are operating on switch tp2:

  3. shutdown subport Ethernet124.10
  4. shutdown parent port Ethernet124
  5. startup parent port Ethernet124

Per the analysis above, after step 5 you will be able to send traffic from tp2 to tp1 via Ethernet124.10 (the sub interface is up on the kernel side and from-CPU packets are sent transparently out of the physical port). In the opposite direction, however, packets from tp1 will go through the ASIC pipeline at tp2 and get dropped in the ASIC, as sub interface Ethernet124.10 is admin down in the ASIC.

This matches the behavior you observed: tp1 can receive traffic from tp2, while tp2 cannot receive any traffic. @Quinnwangs, can you confirm this is the test setup? If so, we will move forward with a fix.
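The asymmetry can be reproduced with a simplified replay of the three operations. This sketch rests on the assumptions stated in this comment: the kernel syncs the sub interface with the parent, while the ASIC sub interface state follows only the sub interface's own config. The function `replay` and the state names are hypothetical.

```python
def replay(steps, sub="Ethernet124.10"):
    """Replay admin operations and return the resulting admin states for `sub`."""
    parent = sub.split(".")[0]
    parent_up = True       # physical port admin state
    kernel_sub_up = True   # kernel netdev view of the sub interface
    asic_sub_up = True     # ASIC view, follows the sub interface config only
    for action, port in steps:
        up = action == "startup"
        if port == parent:
            parent_up = up
            kernel_sub_up = up      # kernel syncs the sub interface with the parent
        elif port == sub:
            asic_sub_up = up
            if up and not parent_up:
                continue            # kernel refuses admin up under a down parent
            kernel_sub_up = up
    return {"parent": parent_up, "kernel_sub": kernel_sub_up, "asic_sub": asic_sub_up}


steps = [
    ("shutdown", "Ethernet124.10"),
    ("shutdown", "Ethernet124"),
    ("startup", "Ethernet124"),
]
state = replay(steps)
tx_ok = state["parent"] and state["kernel_sub"]  # from-CPU packets leave the port unmodified
rx_ok = state["parent"] and state["asic_sub"]    # ingress packets pass the ASIC pipeline
print(tx_ok, rx_ok)  # True False
```

In this model tp2 can still transmit on Ethernet124.10 but drops what it receives there, which is the one-way traffic reported in the issue.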