nxp-archive / openil

OpenIL is an open source project based on Buildroot and designed for embedded industrial solution.

Connection is lost when vlan_filtering is active (LS1021A) #79

Closed emunicio closed 3 years ago

emunicio commented 3 years ago

Hi,

We are trying to test the Qbv performance with taprio, assigning a different interval to each queue.

We have the following setup:

Node1 <------ (swp2) LS1021A (eth1) <------ Node2

We send packets from Node2 to Node1 in VLAN 30, classifying the packets with iptables and giving each a priority with: ip link set eno1.30 type vlan egress 0:1

Packets seem to have the PCP header correct:

[Image: PCPheader]

The taprio rule for scheduling is:

tc qdisc replace dev swp2 parent root handle 100 taprio num_tc 2 map 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 queues 1@0 1@1 base-time 0 sched-entry S 01 100000 sched-entry S 02 500000 clockid CLOCK_TAI

However, when we analyze the traffic in Node2, we see that all traffic falls in entry 1, regardless of the priority. From the OpenIL Guide 1.9, we have seen that vlan_filtering has to be activated in the switch with: ip link set dev br0 type bridge vlan_filtering 1

However, this breaks the connectivity between Node1 and Node2. When we set it back to "0", the connectivity comes back again. We have tried to set VLAN=30 in /etc/systemd/network/br0.netdev, and with bridge vlan add dev swp2 vid 30 pvid, but the result seems to be the same.

Any idea of what could be going on?

Thanks and kind regards, Esteban

vladimiroltean commented 3 years ago

This will solve the loss of connectivity problem:

devlink dev param set spi/spi0.1 name best_effort_vlan_filtering value true cmode runtime

More details:

But it will not solve the taprio window problem. Because, see, you've misinterpreted the documentation. It doesn't say that the network stack cannot inject into a particular traffic class according to the skb->priority unless VLAN filtering is enabled. It says that the VLAN PCP from the outside world (aka the ingress of swp2) is ignored unless VLAN filtering is enabled.

I assume you have a bridge spanning between eth1 and swp2. The skb, when it is received at eth1, is classified according to the ingress-qos-map of eth1.30. So the skb->priority is set to 0. That's why the packet is injected into the sja1105 port with a PCP of 0. You want:

ip link set eno1.30 type vlan ingress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7

As well as a valid skb->priority to traffic class mapping at the taprio scheduler level:

tc qdisc add dev swp2 parent root taprio \
        num_tc 8 \
        map 0 1 2 3 4 5 6 7 \
        queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
        base-time ... \
        sched-entry ... \
        flags 2
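
A minimal sketch, assuming a POSIX shell, of how to read the gate masks in a taprio schedule such as the one in the original report (sched-entry S 01 100000, sched-entry S 02 500000). The gate mask is a hexadecimal bitmask in which bit N opens the gate of traffic class N, and the interval is in nanoseconds. The helper name gate_mask_to_tcs is hypothetical and only illustrates the decoding; it does not configure anything.

```shell
# Decode a taprio gate mask (hex, as written after "sched-entry S")
# into the space-separated list of traffic classes whose gates are open.
gate_mask_to_tcs() {
    mask=$((0x$1))   # interpret the argument as a hexadecimal number
    tc=0
    open=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            open="${open:+$open }$tc"   # append tc, space-separated
        fi
        mask=$((mask >> 1))
        tc=$((tc + 1))
    done
    echo "$open"
}

gate_mask_to_tcs 01   # first entry of the schedule in the report
gate_mask_to_tcs 02   # second entry of the schedule in the report
```

With that schedule, traffic class 0 is open for 100 us and traffic class 1 for 500 us in each cycle, so a packet only lands in the second window if its skb->priority maps to traffic class 1.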

emunicio commented 3 years ago

Hi Vladimir, thanks a lot for the quick answer!

Unfortunately, this did not solve the problem: devlink dev param set spi/spi0.1 name best_effort_vlan_filtering value true cmode runtime

Looking at this https://www.kernel.org/doc/html/latest/networking/dsa/sja1105.html it seems it uses br0. Does the same apply to br1?

We have swpX ports on br1, and the eth0 in br0. Something like this:

[Image: testbed_tsn3]

and our ip link is:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: can0: <NOARP,ECHO> mtu 16 qdisc noop state DOWN mode DEFAULT group default qlen 10
    link/can 
3: can1: <NOARP,ECHO> mtu 16 qdisc noop state DOWN mode DEFAULT group default qlen 10
    link/can 
4: can2: <NOARP,ECHO> mtu 16 qdisc noop state DOWN mode DEFAULT group default qlen 10
    link/can 
5: can3: <NOARP,ECHO> mtu 16 qdisc noop state DOWN mode DEFAULT group default qlen 10
    link/can 
6: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br0 state UP mode DEFAULT group default qlen 1000
    link/ether 00:04:9f:ef:06:06 brd ff:ff:ff:ff:ff:ff
7: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master br1 state UP mode DEFAULT group default qlen 1000
    link/ether 00:04:9f:ef:07:07 brd ff:ff:ff:ff:ff:ff
8: eth2: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1504 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:04:9f:ef:08:08 brd ff:ff:ff:ff:ff:ff
9: sit0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/sit 0.0.0.0 brd 0.0.0.0
10: swp5@eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master br1 state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
    link/ether 00:04:9f:ef:08:08 brd ff:ff:ff:ff:ff:ff
11: swp2@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1 state UP mode DEFAULT group default qlen 1000
    link/ether 00:04:9f:ef:08:08 brd ff:ff:ff:ff:ff:ff
12: swp3@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1 state UP mode DEFAULT group default qlen 1000
    link/ether 00:04:9f:ef:08:08 brd ff:ff:ff:ff:ff:ff
13: swp4@eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master br1 state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
    link/ether 00:04:9f:ef:08:08 brd ff:ff:ff:ff:ff:ff
14: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 0e:aa:71:0b:26:06 brd ff:ff:ff:ff:ff:ff
15: br1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 96:51:04:98:79:c4 brd ff:ff:ff:ff:ff:ff

Also the bridge vlan command gives:

port    vlan ids
eth0     1 PVID Egress Untagged

eth1     1 PVID Egress Untagged

swp5     1 PVID Egress Untagged

swp2     1 Egress Untagged
     30

swp3     1 PVID Egress Untagged

swp4     1 PVID Egress Untagged

br0  1 PVID Egress Untagged

br1  1 PVID Egress Untagged

I have also tried to send PCP-tagged packets from a Node 3 (attached to swp3) to Node 1 (attached to swp2), and again, traffic was only allowed if vlan_filtering was 0. That is strange since they are in the same br1. With vlan_filtering = 0 and traffic flowing, the taprio schedule did not seem to work though (all are treated as queue 0 as you mentioned).

Finally, I am trying to understand your second comment:

It doesn't say that the network stack cannot inject into a particular traffic class according to the skb->priority unless VLAN filtering is enabled. It says that the VLAN PCP from the outside world (aka the ingress of swp2) is ignored unless VLAN filtering is enabled.

Does this mean that we can make our traffic flows enter the different taprio queues without a PCP VLAN tag?

What I do to set the priority from my Node 2 is to use these iptables commands:

iptables -t mangle -A POSTROUTING -p udp --destination-port 6677 --src 192.168.30.1 -j CLASSIFY --set-class 0:0
iptables -t mangle -A POSTROUTING -p udp --destination-port 6678 --src 192.168.30.1 -j CLASSIFY --set-class 0:1

so that the priority is correctly mapped later in the VLAN, hoping that the switch will interpret the PCP correctly and put the packet in the right queue (e.g., queue 1). If I run in Node 2:

ip link set eno1.30 type vlan ingress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7

instead of:

ip link set eno1.30 type vlan egress 0:0
ip link set eno1.30 type vlan egress 1:1

the result seems to be the same.

Maybe I am missing something?

Thanks and kind regards, Esteban

vladimiroltean commented 3 years ago

Looking at this https://www.kernel.org/doc/html/latest/networking/dsa/sja1105.html it seems it uses br0. Does the same apply to br1?

What's br1 more than a name? You could name the bridge interface Charlie as far as I'm concerned.

We have [...] the eth0 in br0

What's the purpose of a bridge with a single port? If you run "bridge link", what do you see?

vladimiroltean commented 3 years ago

What I do to set the priority from my Node 2 is to use these iptables commands:

I'm not sure if netfilter hooks run on bridged traffic. Have you looked at ebtables?

emunicio commented 3 years ago

Looking at this https://www.kernel.org/doc/html/latest/networking/dsa/sja1105.html it seems it uses br0. Does the same apply to br1?

What's br1 more than a name? You could name the bridge interface Charlie as far as I'm concerned.

We have [...] the eth0 in br0

What's the purpose of a bridge with a single port? If you run "bridge link", what do you see?

This is the output of "bridge link":

6: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 19 
7: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br1 state forwarding priority 32 cost 19 
10: swp5@eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 master br1 state disabled priority 32 cost 100 
11: swp2@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br1 state forwarding priority 32 cost 19 
12: swp3@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br1 state forwarding priority 32 cost 19 
13: swp4@eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 master br1 state disabled priority 32 cost 100 

emunicio commented 3 years ago

What I do to set the priority from my Node 2 is to use these iptables commands:

I'm not sure if netfilter hooks run on bridged traffic. Have you looked at ebtables?

Sorry, my writing was a bit confusing. I meant "to set priority in my Node 2", so I am actually generating the traffic in Node 2, classifying it at IP level there, and adding the PCP VLAN tag there as well, before it leaves for the switch. With tcpdump I see the VLAN tag is there, so normally it should arrive fine at eth1 in the LS1021A.

vladimiroltean commented 3 years ago

Aha, ok. There is no 8021q interface on the LS1021A-TSN. Just a VLAN-aware bridge. That helps to understand. You shouldn't have even mentioned how you generate the VLAN-tagged traffic, I don't care. The point is that it's tagged with VLAN, and you expect the VLAN tagging to be preserved into the skb->priority when forwarding.

The easiest way to include this functionality into your current setup would be to add some tc filters on the LS1021A-TSN:

tc qdisc add dev eth1 clsact
tc filter add dev eth1 ingress protocol 802.1Q flower vlan_prio 0 action skbedit priority 0
tc filter add dev eth1 ingress protocol 802.1Q flower vlan_prio 1 action skbedit priority 1
tc filter add dev eth1 ingress protocol 802.1Q flower vlan_prio 2 action skbedit priority 2
tc filter add dev eth1 ingress protocol 802.1Q flower vlan_prio 3 action skbedit priority 3
tc filter add dev eth1 ingress protocol 802.1Q flower vlan_prio 4 action skbedit priority 4
tc filter add dev eth1 ingress protocol 802.1Q flower vlan_prio 5 action skbedit priority 5
tc filter add dev eth1 ingress protocol 802.1Q flower vlan_prio 6 action skbedit priority 6
tc filter add dev eth1 ingress protocol 802.1Q flower vlan_prio 7 action skbedit priority 7
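
The eight filters above follow one pattern, so they can also be generated in a loop. A minimal sketch, assuming a POSIX shell; the helper name gen_pcp_filters is hypothetical, and it only prints the commands so they can be reviewed before being piped to sh as root.

```shell
# Print (rather than run) the tc commands that copy the VLAN PCP of
# ingress packets on the given device into skb->priority, one filter per PCP.
gen_pcp_filters() {
    dev=$1
    echo "tc qdisc add dev $dev clsact"
    pcp=0
    while [ "$pcp" -le 7 ]; do
        echo "tc filter add dev $dev ingress protocol 802.1Q flower vlan_prio $pcp action skbedit priority $pcp"
        pcp=$((pcp + 1))
    done
}

gen_pcp_filters eth1
```

To apply the result instead of printing it, the output can be piped into a root shell, for example: gen_pcp_filters eth1 | sh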

Of course, tc filters are far more powerful than that. Since you apply them on eth1, where there is no hardware offload for them, you can use any filter you like; you are not limited to VLAN PCP. For example, you can do software QoS classification directly based on the source IP address:

tc filter add dev eth1 ingress protocol ipv4 flower src_ip 192.168.30.1 action skbedit priority 7

Some documentation: https://man7.org/linux/man-pages/man8/tc.8.html https://man7.org/linux/man-pages/man8/tc-flower.8.html

The skb->priority will be assigned on ingress into the network stack now, and it will be preserved until egress. At egress, the sja1105 driver will deliver into the desired taprio queue according to the skb->priority and the queue mapping. This will happen regardless of the vlan_filtering setting of the bridge.

What does depend on the vlan_filtering setting of the bridge is the QoS classification on the reverse path (ingress traffic for sja1105 ports).

vladimiroltean commented 3 years ago

I have also tried to send PCP-tagged packets from a Node 3 (attached to swp3) to Node 1 (attached to swp2), and again, traffic was only allowed if vlan_filtering was 0. That is strange since they are in the same br1.

Is it strange? A bridge with vlan_filtering=1 applies VLAN filtering (it acts like an 802.1Q bridge and not like an 802.1D bridge). So packets in VLANs that are not in the ingress filtering list of swp3 and the egress filtering list of swp2 will be dropped. I assume you didn't do this already:

bridge vlan add dev swp3 vid 30
bridge vlan add dev swp2 vid 30

maybe you should.
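
A minimal sketch of how VLAN membership could be checked before enabling vlan_filtering, assuming a POSIX shell with awk and assuming that "bridge vlan" prints in the layout pasted earlier in this issue (a port name at the start of a line, further VLAN IDs on indented lines). The helper name port_has_vid is hypothetical, and the sample input is a shortened copy of that pasted output.

```shell
# Succeed if port $1 lists VLAN $2 in "bridge vlan" output read from stdin.
port_has_vid() {
    awk -v port="$1" -v vid="$2" '
        /^[^ \t]/     { cur = $1; v = $2 }   # line that starts a new port block
        /^[ \t]+[0-9]/ { v = $1 }            # indented line: another VID of that port
        NF && cur == port && v == vid { found = 1 }
        END { exit !found }
    '
}

# Sample input in the layout shown earlier in this issue.
sample='port    vlan ids
swp2     1 Egress Untagged
     30

swp3     1 PVID Egress Untagged
'

for p in swp2 swp3; do
    if printf '%s\n' "$sample" | port_has_vid "$p" 30; then
        echo "$p: VLAN 30 present"
    else
        echo "$p: VLAN 30 missing"
    fi
done
```

On a live system the same check would read from the real command, for example: bridge vlan show | port_has_vid swp3 30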

With vlan_filtering = 0 and traffic flowing, the taprio schedule did not seem to work though (all are treated as queue 0 as you mentioned)

Yes, this is expected. When VLAN awareness is not enabled in the sja1105 switch, it completely ignores the VLAN PCP and classifies all ingress traffic to the port-based default traffic class, which is 0. The only exception is egress traffic, which as I said in my earlier response, is classified to the correct traffic class regardless of the VLAN filtering setting of the bridge.

Finally, I am trying to understand your second comment:

It doesn't say that the network stack cannot inject into a particular traffic class according to the skb->priority unless VLAN filtering is enabled. It says that the VLAN PCP from the outside world (aka the ingress of swp2) is ignored unless VLAN filtering is enabled.

Does this mean that we can make our traffic flows enter the different taprio queues without a PCP VLAN tag?

Yes, this is one implication. But only on TX from the network stack. Whatever is set in skb->priority for the packet in the kernel is what will be used for the taprio traffic class. But the reverse is not true. On ingress (packets coming from the outside world into a sja1105 port), QoS classification can only be performed based on VLAN PCP, and even that requires vlan_filtering to be 1.
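
For checking what the switch port will see on ingress, it can help to decode the tag by hand. A minimal sketch, assuming a POSIX shell: the 16-bit Tag Control Information field of an 802.1Q header carries the PCP in its top 3 bits, the DEI in the next bit, and the VLAN ID in the low 12 bits. The helper name decode_tci is hypothetical, and the value 201e is an illustrative TCI for VLAN 30 with PCP 1, not one captured in this issue.

```shell
# Split a 16-bit 802.1Q TCI (given in hex) into PCP, DEI and VID.
decode_tci() {
    tci=$((0x$1))
    echo "pcp=$(( (tci >> 13) & 7 )) dei=$(( (tci >> 12) & 1 )) vid=$(( tci & 0xfff ))"
}

decode_tci 201e
```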

emunicio commented 3 years ago

Thanks a lot for the explanation Vladimir! I understand now where the problem was. Now, everything is working ;-)