How to ensure that the sending time of the data packet is completely consistent with the gating opening time of the TSN switch?

15367060916 commented 3 years ago

My experimental setup: I periodically apply a gate configuration to the TSN switch and send data packets with isochron, adjusting its parameters so that each packet arrives at the TSN switch port exactly when the gate opens. I would therefore like to ask: what do base-time, shift-time and cycle-time mean, and how should these parameters be set? Also, from what point in time is the base-time measured?

15367060916 commented 3 years ago

Detailed experimental setup: I am trying to verify TSN features on the LS1021A-TSN board. Two hosts are connected through the LS1021A-TSN board, as in fig27, and they have been synchronized with ptp4l. By capturing packets with wireshark, I want to verify and test the traffic shaping function of the TSN switch.

The isochron traffic generated by Host 1 looks as follows:

isochron send --interface eth0 --dmac 2c:53:4a:07:df:07 -p 6 -vid 256 -b 1621501757000000000 -cycle-time 1000 -n 10 -s 64 -C 192.168.1.60 -q &
isochron send --interface eth0 --dmac 2c:53:4a:07:df:07 -p 5 -vid 256 -b 1621501757000000000 -cycle-time 1000 -n 10 -s 64 -C 192.168.1.60 -q &
isochron send --interface eth0 --dmac 2c:53:4a:07:df:07 -p 4 -vid 256 -b 1621501757000000000 -cycle-time 1000 -n 10 -s 64 -C 192.168.1.60 -q &
isochron send --interface eth0 --dmac 2c:53:4a:07:df:07 -p 3 -vid 256 -b 1621501757000000000 -cycle-time 1000 -n 10 -s 64 -C 192.168.1.60 -q &
isochron send --interface eth0 --dmac 2c:53:4a:07:df:07 -p 2 -vid 256 -b 1621501757000000000 -cycle-time 1000 -n 10 -s 64 -C 192.168.1.60 -q &
isochron send --interface eth0 --dmac 2c:53:4a:07:df:07 -p 1 -vid 256 -b 1621501757000000000 -cycle-time 1000 -n 10 -s 64 -C 192.168.1.60 -q &
isochron send --interface eth0 --dmac 2c:53:4a:07:df:07 -p 0 -vid 256 -b 1621501757000000000 -cycle-time 1000 -n 10 -s 64 -C 192.168.1.60 -q &

The commands above run concurrently as background processes and send 64-byte frames with 7 priorities (0-6), 10 frames per priority.

On the LS1021A-TSN board, I configured the gates as follows:

tc qdisc add dev swp3 parent root handle 256 taprio \
num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 1621501757000000000 \
sched-entry S c0 100000 \
sched-entry S a0 100000 \
sched-entry S 90 100000 \
sched-entry S 88 100000 \
sched-entry S 84 100000 \
sched-entry S 82 100000 \
sched-entry S 81 100000 \
flags 2

With PTP synchronization in place, the gates for priorities 6-5-4-3-2-1-0 open in turn, and each gate stays open long enough to pass all frames of that priority sent by isochron, so that the traffic shaping function of the switch can be tested. Note that I set the isochron base-time to exactly the same value as the base-time of the gate configuration on the LS1021A-TSN board (accurate to the nanosecond). However, the packets captured with wireshark on host2 do not show the expected traffic shaping effect. I would like to know what the problem is and how the isochron parameters should be configured. Thank you very much for your answer.
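For reference, the gate masks in the taprio command above can be decoded with a few lines of shell. This is only a sketch: it assumes that bit N of the hexadecimal mask controls the gate of traffic class N, and the helper name open_tcs is made up for illustration.

```shell
# Decode a taprio gate mask (hex, without the 0x prefix) into the list of
# traffic classes whose gates are open.
# Assumption: bit N of the mask controls traffic class N.
open_tcs() {
    mask=$((0x$1))
    tc=7
    out=""
    while [ "$tc" -ge 0 ]; do
        if [ $(( (mask >> tc) & 1 )) -eq 1 ]; then
            # Append the traffic class, separated by a space if needed.
            out="$out${out:+ }$tc"
        fi
        tc=$((tc - 1))
    done
    echo "$out"
}

for m in c0 a0 90 88 84 82 81; do
    echo "sched-entry S $m -> open TCs: $(open_tcs "$m")"
done
```

Under that assumption, every entry keeps traffic class 7 open in addition to the one class that rotates from 6 down to 0.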

15367060916 commented 3 years ago

@vladimiroltean

15367060916 commented 3 years ago

@roednix

vladimiroltean commented 3 years ago

Hi,

All time units are in nanoseconds, so "--cycle-time 1000" means "send a packet every 1 us". That is very unrealistic using a Linux user space program. The expected usage pattern is:

(a) Set up a network schedule and install it on the switches. You have that, it looks like this:

base-time 1621501757000000000
sched-entry S c0 100000
sched-entry S a0 100000
sched-entry S 90 100000
sched-entry S 88 100000
sched-entry S 84 100000
sched-entry S 82 100000
sched-entry S 81 100000

By the way, for simplicity, you can just set the base-time as 0, and the switch will auto-advance it into the nearest future time which is a multiple of the cycle-time (see below).

(b) Calculate its cycle-time (the sum of all sched-entries). In this case, it is 700000 ns = 700 us.

(c) Make sure to run ptp4l on the switch, and phc2sys+ptp4l for the end systems. This ensures that CLOCK_REALTIME/CLOCK_TAI (the software clocks) are in sync with /dev/ptp0. Ideally you would monitor the synchronization offset and send only when it is within +/- 50 ns.

(d) Align every sender to the time slot on the switch. Something to keep in mind is that there is a latency between when the packet is sent by the end station and when it is received by the switch. This is the Ethernet path delay for the link, and you can derive this from the ptp4l output. Let's assume a path delay of 1000 ns (1 us). If you want the sender to enqueue a packet in this sched-entry:

sched-entry S c0 100000

aka in TC 6, then you need an isochron command as follows:

isochron send --interface eth0 --dmac 2c:53:4a:07:df:07 -p 6 --vid 256 --base-time 0 --cycle-time 700000 --window-size 100000 -n 10 -s 64 -C 192.168.1.60 -q

See? The cycle-time is 700 us, so this process will send one packet every cycle. The base-time is zero because this is the first time slot of the schedule on the switch. Because you did not specify any advance-time (nor is it recommended to), isochron will choose the maximum safe amount of time in advance to send the packet: that is 700000 - 100000 = 600000 ns = 600 us. Even with the kernel processing overhead, transmission overhead and path delay, the packet should arrive at the switch ingress port before the time slot opens. You can play around to reduce the cycle-time once you get a basic setup working with large cycle-time values.
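The arithmetic in steps (b) and (d) can be checked with a small shell sketch. The helper names cycle_time and default_advance are hypothetical, and the "cycle-time minus window-size" rule is taken from the explanation above rather than from the isochron source.

```shell
# Sum the sched-entry intervals (in ns) to get the cycle time of the schedule.
cycle_time() {
    total=0
    for interval in "$@"; do
        total=$((total + interval))
    done
    echo "$total"
}

# Default advance time as described above: cycle-time minus window-size.
default_advance() {
    echo $(( $1 - $2 ))
}

ct=$(cycle_time 100000 100000 100000 100000 100000 100000 100000)
adv=$(default_advance "$ct" 100000)
echo "cycle-time:   $ct ns ($((ct / 1000)) us)"
echo "advance-time: $adv ns ($((adv / 1000)) us)"
```

For the seven 100000 ns entries this gives a cycle-time of 700000 ns and an advance time of 600000 ns.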

Alternatively, let's say you want to enqueue into TC 3 (sched-entry S 88). For isochron to send into this traffic class, you need to change the -p to 3, and the base-time to 300000 ns (everything is the same except shifted to the right). Alternatively you can keep the base-time as 0 and specify the --shift-time as 300000. The main mistake in your command is the cycle-time that is very small and misaligned with the schedule.
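The shift for each traffic class is simply the sum of the intervals that precede its slot. A minimal sketch, assuming the seven-entry schedule from this thread and a hypothetical helper slot_offset, prints one isochron command per priority without running anything; the flags are the ones already used in this thread.

```shell
# Offset (in ns) of slot number $1 (0-based) from the start of the cycle.
# The remaining arguments are the interval lengths of all slots, in order.
slot_offset() {
    idx=$1
    shift
    offset=0
    i=0
    for interval in "$@"; do
        if [ "$i" -ge "$idx" ]; then
            break
        fi
        offset=$((offset + interval))
        i=$((i + 1))
    done
    echo "$offset"
}

# Priorities 6..0 occupy slots 0..6 in the schedule discussed above.
slot=0
for prio in 6 5 4 3 2 1 0; do
    st=$(slot_offset "$slot" 100000 100000 100000 100000 100000 100000 100000)
    echo "isochron send --interface eth0 --dmac 2c:53:4a:07:df:07 -p $prio --vid 256 --base-time 0 --shift-time $st --cycle-time 700000 --window-size 100000 -n 10 -s 64 -C 192.168.1.60 -q"
    slot=$((slot + 1))
done
```

The line printed for priority 3 carries --shift-time 300000, matching the value given above.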

(e) There is also the scheduling aspect. It is important, for tight cycle times to be accurate, that multiple isochron instances run on separate CPUs, and for isochron to use a real-time scheduling policy and a high priority. The first is achieved as follows: "taskset 01 isochron send --interface ..." will affine this isochron process to CPU #0. The second is done as follows: "isochron send --interface ... --sched-rr --sched-priority 98". Be careful that you can starve your system with high isochron CPU utilization.
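taskset takes a hexadecimal CPU mask, so "01" selects CPU #0, "2" selects CPU #1, "4" selects CPU #2, and so on. A small sketch with a hypothetical helper cpu_mask prints one pinned command per sender; the "..." stands for the remaining isochron arguments, exactly as in the text above, and nothing is executed.

```shell
# Hexadecimal affinity mask that selects a single CPU (0-based index).
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Give each isochron instance its own CPU, starting from CPU 0.
cpu=0
for prio in 6 5 4; do
    echo "taskset $(cpu_mask "$cpu") isochron send --interface ... -p $prio --sched-rr --sched-priority 98"
    cpu=$((cpu + 1))
done
```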

(f) For even better performance, you can build a kernel with PREEMPT_RT, run cyclictest to make sure the worst-case interrupt wakeup latencies are low (ideally around 10 us or lower), and then reserve some CPUs to prevent the Linux process scheduler from putting processes on them automatically. For this you need to add "isolcpus=0,1,2" to the kernel boot-time command line (this should then be visible in "cat /proc/cmdline"), and you can then schedule the isochron processes on those isolated CPUs with taskset. If you want to go even crazier, you can disable the kernel's Read-Copy-Update (RCU) callbacks from running on those isolated CPUs by adding this additional command line argument: "rcu_nocbs=0,1,2".
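Whether the boot parameters from step (f) took effect can be checked by looking for them in /proc/cmdline. The following is a sketch only; cmdline_param is a hypothetical helper that takes the command line as a string so that it can also be tried on a sample.

```shell
# Print the value of a kernel command-line parameter.
# $1 is the parameter name, $2 the full command line as one string.
# Returns non-zero if the parameter is absent.
cmdline_param() {
    name=$1
    for word in $2; do
        case "$word" in
            "$name"=*)
                echo "${word#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Read the running kernel's command line (empty if it cannot be read).
cmdline=$(cat /proc/cmdline 2>/dev/null || true)
for p in isolcpus rcu_nocbs; do
    if value=$(cmdline_param "$p" "$cmdline"); then
        echo "$p is set to $value"
    else
        echo "$p is not set"
    fi
done
```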

15367060916 commented 3 years ago

Thank you very much for your help. I did the experiment according to the isochron command you mentioned.

1. I ran ptp4l on the switch, and both servers ran phc2sys and ptp4l; the synchronization messages look normal. Why does isochron always report a phc2sys error when I run the send command? The isochron command and the error message are shown below. What is the reason and how can it be solved?

2. The switch gate configuration commands are:

tc qdisc add dev swp3 parent root handle 256 taprio \
num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 \
sched-entry S c0 1000000 \
sched-entry S a0 1000000 \
sched-entry S 90 1000000 \
flags 2

When the packets are captured with wireshark at the receiving end, they should appear in the priority order 6, 5, 4, 6, 5, 4, ... The ideal packet capture result should be the following figure:

But in fact, the result of my packet capture is as follows:

Sometimes frames are even dropped, for example no frame with priority 4 is received. (Frames are also dropped when the switch is not used and the two servers are connected directly.) The wireshark screenshot is as follows: What is the problem of dropping frames and how to solve them?

Below is a screenshot of the synchronization information of my sender and receiver ptp4l and phc2sys. The synchronization has been relatively stable and maintained at a small value.

The synchronization command is entered according to OpenILUG_Rev1.9.pdf:

Last time you replied that with the base-time set to 0, the switch automatically advances it to the nearest future time that is a multiple of the cycle time. However, due to Ethernet path delay, kernel processing overhead, transmission overhead, and so on, when the data from the sender arrives at the switch port, time alignment will not be achieved, resulting in the wrong priority order of shaping traffic. How can the time slot alignment problem be solved?
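The base-time advancement mentioned here is plain modular arithmetic. A minimal sketch, assuming the rule is "add whole cycles until the base-time is no longer in the past" (the exact boundary behaviour of the switch is not confirmed in this thread, and advance_base_time is a hypothetical helper):

```shell
# Advance a base-time that lies in the past by whole cycles until it is
# no longer earlier than "now". All arguments are in nanoseconds.
# Usage: advance_base_time BASE CYCLE NOW
advance_base_time() {
    base=$1
    cycle=$2
    now=$3
    if [ "$base" -ge "$now" ]; then
        # A base-time in the future is used as given.
        echo "$base"
        return 0
    fi
    # Number of whole cycles needed, rounded up.
    n=$(( (now - base + cycle - 1) / cycle ))
    echo $(( base + n * cycle ))
}

advance_base_time 0 3000000 7000000   # prints 9000000
```

If the sender applies the same arithmetic to the same synchronized clock, it lands on the same cycle boundaries as the switch, without either side needing an absolute start time.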

15367060916 commented 3 years ago

@vladimiroltean

vladimiroltean commented 2 years ago

Question 1: "sender PHC not synchronized (time delta around 37 seconds)": 37 seconds is the UTC-to-TAI offset. What is the exact list of phc2sys arguments that you've used? By any chance did you use "-O 0"? Because the system clock and the PTP clock should really be 37 seconds apart. The isochron program adjusts software timestamps by what it thinks is the correct UTC-TAI offset, which in this case was 37 seconds, and what it's saying is that the system clock and the PTP clock were not apart by 37 seconds. In fact, they were synchronized to the exact same time. I would consider this a system configuration issue. Please consider using "phc2sys -a" (automatic mode), which queries the UTC offset from ptp4l. Also please consider updating to the latest version of isochron, it has brought some improvements to the TAI offset handling, and to the general monitoring of PTP sync status. I don't exclude a bug in the isochron version from that time, either.
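The check described in this answer can be sketched as follows. The timestamps are hypothetical inputs in nanoseconds (how they are read from the PHC and the system clock is left out), the 37 second offset is the value quoted above, and check_tai_offset is a made-up helper name.

```shell
# Compare a PHC timestamp (TAI) with a system clock timestamp (UTC).
# They are expected to differ by the UTC-to-TAI offset of 37 seconds.
# Usage: check_tai_offset PHC_NS SYS_NS TOLERANCE_NS
check_tai_offset() {
    phc_ns=$1
    sys_ns=$2
    tolerance_ns=$3
    expected_ns=$((37 * 1000000000))
    err=$(( phc_ns - sys_ns - expected_ns ))
    if [ "$err" -lt 0 ]; then
        err=$(( -err ))
    fi
    if [ "$err" -le "$tolerance_ns" ]; then
        echo "ok"
    else
        echo "mismatch: off by $err ns"
    fi
}

# Clocks 37 s apart (plus 10 ns): passes with a 50 ns tolerance.
check_tai_offset 1621501794000000010 1621501757000000000 50
# Clocks set to the same time, as "-O 0" would do: reported as a mismatch.
check_tai_offset 1621501757000000000 1621501757000000000 50
```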

Question 2: "What is the problem of dropping frames and how to solve them?": It might be that since isochron had a wrong understanding of the UTC-to-TAI offset, its entire schedule was misaligned with the network schedule by 37 seconds. And considering that the cycle-time of your network is 3 ms, 37 seconds is not a whole multiple of the cycle-time. So it would erratically send packets into what is basically the wrong time slot. Again, please try again with a newer isochron version, which makes possible misconfigurations much easier to identify.
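The size of that misalignment can be worked out directly. A short sketch, assuming the 3 ms cycle with 1 ms slots from the schedule in this thread (misalignment is a hypothetical helper):

```shell
# Remainder of a clock offset modulo the cycle time, both in nanoseconds.
misalignment() {
    echo $(( $1 % $2 ))
}

offset_ns=$((37 * 1000000000))   # 37 s UTC-to-TAI offset
cycle_ns=3000000                 # 3 x 1000000 ns sched-entries
slot_ns=1000000

rem=$(misalignment "$offset_ns" "$cycle_ns")
echo "residual shift: $rem ns"
echo "slots shifted:  $((rem / slot_ns))"
```

This leaves a residual of 1000000 ns, which is one full slot: under these assumptions a sender aiming at the c0 window would consistently hit a neighbouring one instead.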

Question 3: "However, due to Ethernet path delay, kernel processing overhead, transmission overhead, and so on, when the data from the sender arrives at the switch port, time alignment will not be achieved": The whole point of Qbv is to eliminate the kernel processing overhead by making the transmitter MAC buffer the frame until a precise moment in time. The only relevant source of delay, in this case, becomes the path delay (PHY propagation delay, forwarding delay, etc). My suggestion would be to make the time slots as large as the expected end-to-end path delay in your network, and not just large enough to cover a single point-to-point link. This way, you do not need to offset the time slots depending on which switch you are. Does that make sense?

vladimiroltean commented 2 years ago

In addition to the above response, I've added enough TX timestamp validation in the latest isochron master branch, that I now think that configuration issues leading to deadline misses are obvious. I will therefore close this ticket.

liing0228 commented 2 years ago

@15367060916 @vladimiroltean Hi sir, I am facing the same issue you mentioned: I found that Qbv does not send packets of different priorities in chronological order, just like in your wireshark capture. Do you know how to fix it, or what the cause is? I would be grateful if you could tell me.

vladimiroltean commented 2 years ago

Please open a separate issue if you think you've found a problem with isochron, and explain what that issue is, not reply to another unrelated thread saying "I think I have the same issue pls help".

liing0228 commented 2 years ago

@vladimiroltean Sir, I have opened the issue in openil, thanks for your reply. I would also like to ask what happens if a packet does not arrive at the ingress port before the time slot opens, but arrives while the time slot is open, with iperf3 background traffic present. I found that with iperf3 background traffic, 5-10% of my path delays become unpredictable, which does not match the results in the OpenIL user guide. Without iperf3 traffic, the path delay is normal. I don't know where I should raise this issue, because I use another real-time traffic generator, not isochron.

NXP / isochron

How to ensure that the sending time of the data packet is completely consistent with the gating opening time of the TSN switch? #2