nxp-archive / openil_linuxptp

PTP IEEE 1588 stack for Linux
GNU General Public License v2.0

Hardware Clock Synchronization Issue(NXP LS1021A TSN Switch) #3

Closed saiguvvala closed 5 years ago

saiguvvala commented 5 years ago

Hi, I am working on the NXP LS1021A TSN switch and am having trouble syncing its hardware clock with a different board. I am getting very high offset values when running ptp4l. Can you help me reduce the offset values?

I ran ptp4l -i eth0 -m on the master device and ptp4l -i eth0 -m -s on the slave device. Output log:

ptp4l[2670.323]: selected /dev/ptp0 as PTP clock
ptp4l[2670.344]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[2670.344]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[2671.907]: port 1: new foreign master 00049f.fffe.ef0606-1
ptp4l[2675.908]: selected best master clock 00049f.fffe.ef0606
ptp4l[2675.908]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[2677.908]: master offset 1814389063 s0 freq +1000000 path delay -11282235
ptp4l[2678.908]: master offset 1886559517 s1 freq +1000000 path delay -20511764
ptp4l[2680.908]: master offset 121269846 s2 freq +1000000 path delay -15896999
ptp4l[2680.908]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[2681.908]: master offset 184210931 s2 freq +1000000 path delay -15896999
ptp4l[2682.908]: master offset 251765244 s2 freq +1000000 path delay -20511764
ptp4l[2683.908]: master offset 314706772 s2 freq +1000000 path delay -20511764
ptp4l[2684.908]: master offset 377648525 s2 freq +1000000 path delay -20511764
ptp4l[2685.908]: master offset 440589373 s2 freq +1000000 path delay -20511764
ptp4l[2686.908]: master offset 504437035 s2 freq +1000000 path delay -21418673
ptp4l[2687.908]: master offset 567378447 s2 freq +1000000 path delay -21418673
ptp4l[2688.908]: master offset 630240205 s2 freq +1000000 path delay -21339819
ptp4l[2689.908]: master offset 693181169 s2 freq +1000000 path delay -21339819
ptp4l[2690.909]: master offset 755680662 s2 freq +1000000 path delay -20897631
ptp4l[2691.909]: master offset 818622550 s2 freq +1000000 path delay -20897631
ptp4l[2692.909]: master offset 879851362 s2 freq +1000000 path delay -19185932
ptp4l[2693.909]: master offset 944503921 s2 freq +1000000 path delay -20897631
ptp4l[2694.909]: master offset 1007954619 s2 freq +1000000 path delay -21408405
ptp4l[2695.909]: master offset 1070896194 s2 freq +1000000 path delay -21408405
ptp4l[2696.909]: master offset 1132568476 s2 freq +1000000 path delay -20138162
ptp4l[2697.909]: master offset 1194557077 s2 freq +1000000 path delay -19185932
ptp4l[2698.909]: master offset 1257321980 s2 freq +1000000 path delay -19009231
ptp4l[2699.909]: master offset 1320774158 s2 freq +1000000 path delay -19520005
ptp4l[2700.909]: master offset 1386472022 s2 freq +1000000 path delay -22276805
ptp4l[2701.909]: master offset 1451115371 s2 freq +1000000 path delay -23979093
ptp4l[2702.909]: master offset 1512354701 s2 freq +1000000 path delay -22276805
ptp4l[2703.909]: master offset 1571665092 s2 freq +1000000 path delay -18645601
ptp4l[2704.910]: master offset 1626802503 s2 freq +1000000 path delay -10840052
ptp4l[2705.910]: master offset 1689743412 s2 freq +1000000 path delay -10840052
ptp4l[2706.910]: master offset 1751953893 s2 freq +1000000 path delay -10110519
ptp4l[2707.910]: master offset 1814895418 s2 freq +1000000 path delay -10110519
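For readers following along, here is a minimal Python sketch of one way to read the log above: it extracts the servo lines and computes how fast the reported offset changes between consecutive samples. The regular expression and the helper names (`parse_servo_lines`, `offset_growth_ns_per_s`) are illustrative assumptions, not part of linuxptp; the sample lines are copied from the log.

```python
import re

# Matches ptp4l servo lines such as:
#   ptp4l[2681.908]: master offset 184210931 s2 freq +1000000 path delay -15896999
SERVO_LINE = re.compile(
    r"ptp4l\[(\d+)\.(\d{3})\]:\s+master offset\s+(-?\d+)\s+s(\d)\s+"
    r"freq\s+([+-]?\d+)\s+path delay\s+(-?\d+)"
)


def parse_servo_lines(text):
    """Return (time_ms, offset_ns, servo_state, freq_ppb) for each servo line."""
    samples = []
    for match in SERVO_LINE.finditer(text):
        sec, msec, offset, state, freq, _delay = match.groups()
        time_ms = int(sec) * 1000 + int(msec)  # integer milliseconds, no float error
        samples.append((time_ms, int(offset), int(state), int(freq)))
    return samples


def offset_growth_ns_per_s(samples):
    """Offset change per second between consecutive samples in servo state s2."""
    s2 = [s for s in samples if s[2] == 2]
    return [
        (o1 - o0) * 1000 / (t1 - t0)
        for (t0, o0, _, _), (t1, o1, _, _) in zip(s2, s2[1:])
        if t1 > t0
    ]


# Lines copied from the log in this comment (non-servo lines are skipped).
SAMPLE = """\
ptp4l[2680.908]: master offset 121269846 s2 freq +1000000 path delay -15896999
ptp4l[2680.908]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[2681.908]: master offset 184210931 s2 freq +1000000 path delay -15896999
ptp4l[2682.908]: master offset 251765244 s2 freq +1000000 path delay -20511764
ptp4l[2683.908]: master offset 314706772 s2 freq +1000000 path delay -20511764
"""

samples = parse_servo_lines(SAMPLE)
growth = offset_growth_ns_per_s(samples)
for g in growth:
    print(f"offset grew by {g / 1e6:.1f} ms in one second")
```

On these samples the offset grows by roughly 63 ms every second while freq stays pinned at +1000000 ppb, which corresponds to a correction of only 1 ms per second. That pattern suggests the servo output is saturated rather than converging.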

The kernel version and tag are:

[root@OpenIL:misc]# uname -v
#1 SMP Fri Feb 1 17:11:55 CST 2019
[root@OpenIL:misc]# uname -r
4.14.47-ipipe
[root@OpenIL:misc]#

vladimiroltean commented 5 years ago

You say "TSN switch" which on this board is the SJA1105T, but then you show commands for PTP synchronization over the non-switched eth0 (eTSEC) port. Is the switch involved in any way at all in this sync process (does it forward any PTP packets)? The switched ports are labeled ETH2, ETH3, ETH4, ETH5 on the chassis, while the non-switched ports are ETH0 and ETH1.

saiguvvala commented 5 years ago

We are using eth0. We are also open to suggestions for synchronizing over the switch ports.

saiguvvala commented 5 years ago

The switch is not involved in the sync process. We are trying to sync the NXP LS1021ATSN-PA device (using ETH0) with a different device using ptp4l. As you mentioned in the other thread, ptp4l gives stable results when the initial offset between the hardware clocks is very small. We tried to update the hardware clock of the NXP device (using the hwclock command) to match the other device's clock, but we could not. According to the support team, an RTC is not available/implemented on this device.

Could you please help us with the below issues:

  1. How to access, modify, update the hardware clock of this NXP device. Please share the steps.
  2. How to verify if the NXP device and the other device are in sync. Please share the steps.
  3. Is there any way other than the ptp4l logs to verify the synchronization between the two devices? Please share the steps.
  4. It seems the SoC clock tries to sync the switch clock before forwarding packets through the switch ports. Which ports give better sync results: the switch ports or the SoC ports? How do we decide?
  5. If you are going to suggest using the switch ports, then please share the steps to do so. Also please share the respective switch port IDs, such as seth# for ETH2-5.

Thanks in advance.

vladimiroltean commented 5 years ago

Thank you for your patience. I tested the kernel tag OpenIL-linux-201901 and found the bug. The reason is this recent patch on the ptp_qoriq driver: ptp_qoriq: support time offset register using. The 1588 clock driver is common on all Layerscape chips, but apparently the LS1021 doesn't support the TMROFF register. So when this patch is in place, stepping the clock is a no-op, and if the stepping is broken, the servo can't steer the clock fast enough via frequency adjustment to compensate for that (although that's what it's trying to do). As you noticed, sometimes clock synchronization can be achieved but most of the time it can't. This is due to the amount of initial offset the slave has to compensate for. So to consistently see that synchronization is broken, one has to run phc_ctl /dev/ptp0 set 0 first, in order to maximize the offset. Reverting the patch has no functional consequence and it fixes your issue, so for the moment please do that.
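To put a number on "can't steer the clock fast enough", here is a small worked calculation in Python. It is a sketch under stated assumptions: the freq value of +1000000 ppb seen in the log is taken to be the maximum adjustment the servo can apply, and the two oscillators are assumed to otherwise run at the same rate. The function name `slew_time_seconds` is made up for illustration.

```python
def slew_time_seconds(offset_ns: int, max_adj_ppb: int) -> float:
    """Seconds needed to remove offset_ns at a constant max_adj_ppb rate.

    1 ppb removes 1 ns of offset per second, so time = |offset| / rate.
    Assumes the clocks do not drift apart for any other reason.
    """
    if max_adj_ppb <= 0:
        raise ValueError("max_adj_ppb must be positive")
    return abs(offset_ns) / max_adj_ppb


# Values taken from the first ptp4l log in this thread.
first_offset_ns = 1814389063
pinned_freq_ppb = 1000000  # assumption: this is the saturation limit

minutes = slew_time_seconds(first_offset_ns, pinned_freq_ppb) / 60
print(f"{minutes:.1f} minutes to slew away {first_offset_ns} ns")
```

Even in this best case a 1.8 s offset needs about half an hour of pure frequency adjustment. After phc_ctl /dev/ptp0 set 0 the offset is on the order of 1.5e18 ns, so without a working clock step the servo has no realistic chance of catching up.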

vladimiroltean commented 5 years ago

To address the rest of your questions:

> How to access, modify, update the hardware clock of this NXP device. Please share the steps.

You can interact with the /dev/ptp0 clock using a rich kernel UAPI in C, or use the phc_ctl program from the linuxptp suite (or see how phc_ctl does that).
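As a rough illustration of the UAPI route, the sketch below reads the PHC from Python instead of C. It relies on the dynamic POSIX clock convention (the FD_TO_CLOCKID macro found in linuxptp and in the kernel's testptp example): an open file descriptor on /dev/ptp0 is turned into a clock ID that clock_gettime accepts. The helper names are illustrative, and the device path is assumed to be /dev/ptp0 as in this thread.

```python
import os
import time

CLOCKFD = 3  # constant used by the FD_TO_CLOCKID convention


def fd_to_clockid(fd: int) -> int:
    """Python equivalent of the C macro ((~fd) << 3) | CLOCKFD."""
    return (~fd << 3) | CLOCKFD


def read_phc_ns(path: str = "/dev/ptp0") -> int:
    """Read the PTP hardware clock at `path` and return nanoseconds."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return time.clock_gettime_ns(fd_to_clockid(fd))
    finally:
        os.close(fd)


if __name__ == "__main__":
    path = "/dev/ptp0"  # assumption: same device name as in this thread
    if os.path.exists(path) and os.access(path, os.R_OK):
        phc_ns = read_phc_ns(path)
        sys_ns = time.clock_gettime_ns(time.CLOCK_REALTIME)
        print(f"PHC time:    {phc_ns / 1e9:.9f} s")
        print(f"system time: {sys_ns / 1e9:.9f} s")
        print(f"difference:  {(phc_ns - sys_ns) / 1e9:.9f} s")
    else:
        print(f"{path} is not available on this machine")
```

Reading only needs read access to the device. Setting the clock the same way would go through clock_settime on the same clock ID and requires root, which is why phc_ctl is usually the more convenient tool.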

> How to verify if the NXP device and the other device are in sync. Please share the steps.

Inspecting the ptp4l log is one way, inspecting phc_ctl /dev/ptp0 get is another.

> It seems the SoC clock tries to sync the switch clock before forwarding packets through the switch ports. Which ports give better sync results: the switch ports or the SoC ports? How do we decide? If you are going to suggest using the switch ports, then please share the steps to do so. Also please share the respective switch port IDs, such as seth# for ETH2-5.

PTP synchronization over switch ports is in the works, not yet available. For now please use the ETH0 or ETH1 port.

saiguvvala commented 5 years ago

I am using the ETH0 and ETH1 ports to run ptp4l but I am still getting high offset values. I used phc_ctl to set the hardware clock value, but the offset value is not stable. Can you please suggest ways to reduce the offset value?

vladimiroltean commented 5 years ago

Have you tried reverting the patch, as suggested?

saiguvvala commented 5 years ago

Yes, I have tried the patch that you have suggested, but I am getting the same results.

vladimiroltean commented 5 years ago

> i have tried the patch that you have suggested

Just to be clear, the command you ran was git revert 0301d8d, right?

saiguvvala commented 5 years ago

Where should I run this command: in a terminal or on GitHub? I am not a GitHub user, and I have been manually updating the source code files on my local machine.

vladimiroltean commented 5 years ago

I assumed you already had a kernel tree with git history checked out. If you don't, do the following on your PC:

git clone git@github.com:openil/linux.git openil-linux
cd openil-linux
git checkout master
git revert 0301d8d

Now go to your OpenIL source tree. The commands below will tell the build system to use the kernel sources that you just downloaded, instead of the ones it tries to build by default (which are not tracked by git).

cd openil
make nxp_ls1021atsn_defconfig
export LINUX_OVERRIDE_SRCDIR=$HOME/openil-linux
make linux-rebuild
make

Now either reflash the output/images/sdcard.img file or just the ls1021a-tsn.dtb and uImage files (which go to the mmcblk0p1 vfat partition).

saiguvvala commented 5 years ago

I followed your suggestions and ran ptp4l on both the master (NXP) and the slave (other device).

  1. Log on the slave device:

root@am57xx-evm:~# ptp4l -i eth0 -m -s
ptp4l[728.483]: selected /dev/ptp0 as PTP clock
ptp4l[728.505]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[728.506]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[729.577]: port 1: new foreign master 00049f.fffe.ef0606-1
ptp4l[733.577]: selected best master clock 00049f.fffe.ef0606
ptp4l[733.578]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[734.577]: master offset 4972345492 s0 freq +1000000 path delay 0
ptp4l[735.578]: master offset 5054742146 s1 freq +1000000 path delay -19455340
ptp4l[736.578]: master offset 62941356 s2 freq +1000000 path delay -19455340
ptp4l[736.578]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[737.578]: master offset 125882679 s2 freq +1000000 path delay -19455340
ptp4l[738.578]: master offset 180547214 s2 freq +1000000 path delay -11178467
ptp4l[739.578]: master offset 245757156 s2 freq +1000000 path delay -13446977
ptp4l[740.578]: master offset 312455200 s2 freq +1000000 path delay -17200861
ptp4l[741.578]: master offset 375396363 s2 freq +1000000 path delay -17200861
ptp4l[742.578]: master offset 438336658 s2 freq +1000000 path delay -17200861
ptp4l[743.578]: master offset 501278069 s2 freq +1000000 path delay -17200861
ptp4l[744.578]: master offset 564115361 s2 freq +1000000 path delay -17096547
ptp4l[745.578]: master offset 627056756 s2 freq +1000000 path delay -17096547
ptp4l[746.578]: master offset 689998242 s2 freq +1000000 path delay -17096547
ptp4l[747.578]: master offset 752939650 s2 freq +1000000 path delay -17096547
ptp4l[748.579]: master offset 814577006 s2 freq +1000000 path delay -15792325
ptp4l[749.579]: master offset 875083075 s2 freq +1000000 path delay -13356938
ptp4l[750.579]: master offset 938026244 s2 freq +1000000 path delay -13356938
ptp4l[751.579]: master offset 998707404 s2 freq +1000000 path delay -11099094
ptp4l[752.579]: master offset 1061648116 s2 freq +1000000 path delay -11099094
ptp4l[753.579]: master offset 1124589375 s2 freq +1000000 path delay -11099094
ptp4l[754.579]: master offset 1186676330 s2 freq +1000000 path delay -10245257
ptp4l[755.579]: master offset 1247604905 s2 freq +1000000 path delay -8233205
ptp4l[756.579]: master offset 1310545589 s2 freq +1000000 path delay -8233205
ptp4l[757.579]: master offset 1373800861 s2 freq +1000000 path delay -8547853

  2. Log on the master (NXP):

[root@OpenIL:~]# ptp4l -i eth0 -m
ptp4l[968.482]: selected /dev/ptp0 as PTP clock
ptp4l[968.484]: driver changed our HWTSTAMP options
ptp4l[968.484]: tx_type 1 not 1
ptp4l[968.484]: rx_filter 1 not 12
ptp4l[968.485]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[968.485]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[968.485]: port 1: link up
ptp4l[975.134]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[975.135]: selected best master clock 00049f.fffe.ef0606
ptp4l[975.135]: assuming the grand master role

There is no difference in the offset and delay values. I am still getting high values as you can see in the log.

saiguvvala commented 5 years ago

With respect to this:

> How to access, modify, update the hardware clock of this NXP device. Please share the steps.
>
> You can interact with the /dev/ptp0 clock using a rich kernel UAPI in C, or use the phc_ctl program from the linuxptp suite (or see how phc_ctl does that).

Can you tell me how to do this a little more elaborately? I am not able to follow.

On a further note, the two devices I use are set to two different dates by default: Oct 3 2016 and Jan 1 1970. Is there any way to bring them both to the current date?

vladimiroltean commented 5 years ago

I will have another look at this tomorrow. I agree it is also strange that there is no RTC on the board to keep time persistently, especially since there is a coin cell battery in there. Will double check that.

vladimiroltean commented 5 years ago

I didn't realize that you were testing the LS1021A-TSN board in master mode only. The patch I suggested reverting affects slave mode only (offset correction), which explains why you see no difference. But I am able to synchronize an e1000e slave to the LS1021A-TSN master with no problem. Maybe there are limitations in the AM57xx as to how much initial offset it can correct. If the times are closer, do they sync?

So your question is how to modify the PHC time manually? Have you tried something like this?

$ phc_ctl /dev/ptp0 set 1555000000
phc_ctl[4128.703]: set clock time to 1555000000.000000000 or Thu Apr 11 16:26:40 2019
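The argument to phc_ctl set is a count of seconds since the Unix epoch. As a convenience, here is a short Python sketch for converting between such a value and a calendar date in UTC; the helper names `epoch_to_utc` and `utc_to_epoch` are illustrative only.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> datetime:
    """Convert seconds since 1970-01-01 UTC to a timezone-aware datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def utc_to_epoch(year, month, day, hour=0, minute=0, second=0) -> int:
    """Convert a UTC calendar date and time to seconds since the epoch."""
    moment = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int(moment.timestamp())


# The value used in the phc_ctl example above.
print(epoch_to_utc(1555000000).strftime("%a %b %d %H:%M:%S %Y"))

# Going the other way: build the command line for a chosen UTC date.
print(f"phc_ctl /dev/ptp0 set {utc_to_epoch(2019, 4, 11, 16, 26, 40)}")
```

The first line reproduces the date that phc_ctl printed, which suggests the board was displaying time in UTC. Running the same set command with the same value on both devices is one way to start them from nearly equal PHC times before launching ptp4l.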

I still don't have an answer for the RTC question.

saiguvvala commented 5 years ago

We had tried this before to set the time, but the sync did not happen and we observed no change in the offset values. We thought that phc_ctl set and hwclock are not alternatives to each other, as the latter works on the real-time clock and the former does not. One system can have multiple PTP devices, whereas it can have only one RTC. So could you please let us know whether updating the PTP device clock also updates the real-time clock? Are they the same or different? Please enlighten us with the background details.

vladimiroltean commented 5 years ago

I understand the differences between an RTC and a PHC. The LS1021A-TSN board does not have an RTC (therefore the system time at each boot starts from Jan 1st 1970). You can either use NTP to pick up system time at boot, or an external I2C RTC in the form of an Arduino shield, out of the selection of chips that have Linux drivers for them: https://github.com/openil/linux/tree/master/drivers/rtc

vladimiroltean commented 5 years ago

To bring some closure on this issue, I purchased some ZS-042 modules which feature the Maxim/Dallas DS3231 RTC with an I2C interface.

[Image: IMG_20190514_114658]

The initial plan was to solder an additional pin header (the white one in the picture above) and simply plug the RTC module inside the Arduino pin header internal to the board chassis:

                                  (ARD_IIC2_SCL) 10  J4
                                  (ARD_IIC2_SDA)  9
                                        (TP2/NC)  8
                                           (GND)  7
 J3  1 (NC)                           (SPI1_SCK)  6
     2 (3V3_ARD)                  (SPI1_ARD_SIN)  5
     3 (ARD_SHIELD_RST_B)            (SPI1_SOUT)  4
     4 (3V3_ARD)                     (SPI1_PCS2)  3
     5 (5V0_ARD)                            (NC)  2
     6 (GND)                                (NC)  1
     7 (GND)
     8 (12V0_ARD)                      (ARD_PD7)  8  J6
                                       (ARD_PD6)  7
 J5  1 (ADC_CH0)                       (ARD_PD5)  6
     2 (ADC_CH1)                       (ARD_PD4)  5
     3 (ADC_CH2)                       (ARD_PD3)  4
     4 (ADC_CH3)                       (ARD_PD2)  3
     5 (ARD_IIC2_SDA)        (LPUART4_CONN_SOUT)  2
     6 (ARD_IIC2_SCL)         (LPUART4_CONN_SIN)  1

                       (SPI1_SCK)
                            |
                            v    H4
   (ARD_SHIELD_RST_B) -> 5  3  1 <- (SPI1_ARD_SIN)
                (GND) -> 6  4  2 <- (5V0_ARD)
                            ^
                            |
                       (SPI1_SOUT)

It would have been a perfect fit into pins 7-10 of J4, while VDD would have needed to be supplied from pin 4 of J3 (3V3_ARD) from the other side.

However I ran into a board issue where the I2C interface does not work over the Arduino pin header. This is because the LS1021A connects to two TI PCA9515B I2C repeaters put in series, but the TI datasheet explicitly says not to connect them in series, because they won't see each other's 'buffered low' output as a valid low. So of course, the DS3231 over the Arduino pin header is not seen by the LS1021A.

One may try to replace one of the PCA9515B chips (the U19 part, to be precise) with a pin-compatible TCA9517 part, which presumably does support cascading. But I did not attempt this.

Instead there is another pin header at the rear of the board, called 'expansion header'. The pin layout (as seen by looking at the board from behind) is:

          (GND) ------------------------------+  +------------------------------ (LPUART1_SIN)
   (EXP1_GPIO7) ---------------------------+  |  |  +--------------------------- (IIC2_SDA)
   (EXP1_GPIO6) ------------------------+  |  |  |  |  +------------------------ (GND)
          (GND) ---------------------+  |  |  |  |  |  |  +--------------------- (VCC_3V3)
          (GND) ------------------+  |  |  |  |  |  |  |  |  +------------------ (CAN3_RX_EXP)
          (GND) ---------------+  |  |  |  |  |  |  |  |  |  |  +--------------- (GND)
    (SPI1_PCS0) ------------+  |  |  |  |  |  |  |  |  |  |  |  |  +------------ (NC)
     (SPI1_SCK) ---------+  |  |  |  |  |  |  |  |  |  |  |  |  |  |  +--------- (AUDIO_LIN_L)
          (GND) ------+  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  +------ (AUDIO_LIN_R)
 (VCC_5V0_LOAD) ---+  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  +--- (AC_AGND)
                   |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
                   v  v  v  v  v  v  v  v  v  v  v  v  v  v  v  v  v  v  v  v
                   2  4  6  8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40
                   1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39
                   ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^
                   |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
 (VCC_5V0_LOAD) ---+  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  +--- (AC_AGND)
          (GND) ------+  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  +------ (AUDIO_LOUT_R)
 (SPI1_EXP_SIN) ---------+  |  |  |  |  |  |  |  |  |  |  |  |  |  |  +--------- (AUDIO_LOUT_L)
    (SPI1_SOUT) ------------+  |  |  |  |  |  |  |  |  |  |  |  |  +------------ (NC)
          (GND) ---------------+  |  |  |  |  |  |  |  |  |  |  +--------------- (GND)
           (NC) ------------------+  |  |  |  |  |  |  |  |  +------------------ (CAN3_TX_EXP)
           (NC) ---------------------+  |  |  |  |  |  |  +--------------------- (VCC_3V3)
          (GND) ------------------------+  |  |  |  |  +------------------------ (GND)
           (NC) ---------------------------+  |  |  +--------------------------- (IIC2_SCL)
           (NC) ------------------------------+  +------------------------------ (LPUART1_SOUT)

I proceeded to connect the ZS-042 module to the expansion header via pins 23 (IIC2_SCL), 24 (IIC2_SDA), 26 (GND) and 28 (VCC_3V3) using jump wires and it worked. I had to enable CONFIG_RTC_DRV_DS1307=y as well as add this entry to the DTS:

&i2c0 {
    rtc@68 {
        compatible = "maxim,ds3231";
        reg = <0x68>;
    };
};

The result is that I now have an external battery-powered RTC connected through jump wires to the expansion header:

[root@OpenIL:~]# dmesg
[    2.108657] DSA: tree 0 setup
[    2.111743] sja1105 spi0.1: Link is Up - 1Gbps/Full - flow control off
[    2.120139] rtc-ds1307 0-0068: setting system clock to 2019-06-09T23:42:51 UTC (1560123771)
[    2.129936] ALSA device list:
[    2.132894]   No soundcards found.
[    2.224864] hub 1-1:1.0: USB hub found
[    2.228783] hub 1-1:1.0: 3 ports detected
[    2.305990] EXT4-fs (mmcblk0p2): recovery complete
[    2.318836] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
[root@OpenIL:~]# hwclock 
Mon Jun 10 13:47:46 2019  0.000000 seconds

Please note that on the ZS-042 module, I had to remove the Zener diode as well as the 4K7 I2C pull-up resistor, as documented in many places on the Internet. This rework is not visible in the picture I uploaded.

I hope this is helpful for you in trying to evaluate TSN on the LS1021A-TSN board.

vladimiroltean commented 5 years ago

> If you are going to suggest using the switch ports, then please share the steps to do so. Also please share the respective switch port IDs, such as seth# for ETH2-5.

The current way to evaluate PTP over switch ports is to use the upstream linuxptp automotive-master.cfg or automotive-slave.cfg and the net-next Linux kernel and this DTS:

[root@OpenIL:~]# ptp4l -i swp2 -m -f automotive-slave.cfg
(...)
ptp4l[48.343]: master offset          5 s2 freq  +83715 path delay       484
ptp4l[48.468]: master offset         -3 s2 freq  +83705 path delay       485
ptp4l[48.593]: master offset          0 s2 freq  +83708 path delay       485
ptp4l[48.718]: master offset          1 s2 freq  +83710 path delay       485
ptp4l[48.844]: master offset          1 s2 freq  +83710 path delay       485
ptp4l[48.969]: master offset         -5 s2 freq  +83702 path delay       485
ptp4l[49.094]: master offset          3 s2 freq  +83712 path delay       485
ptp4l[49.219]: master offset          4 s2 freq  +83714 path delay       485
ptp4l[49.344]: master offset         -5 s2 freq  +83702 path delay       485
ptp4l[49.469]: master offset          3 s2 freq  +83713 path delay       487
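To summarize a converged log like this one, a small Python sketch can pull out the offsets and report their maximum and RMS. The regular expression and the function name `offset_stats` are illustrative assumptions; the input is the ten servo lines shown above.

```python
import math
import re

# Captures the offset from servo lines that are in state s2.
OFFSET_RE = re.compile(r"master offset\s+(-?\d+)\s+s2\b")


def offset_stats(log_text):
    """Return count, max absolute offset and RMS offset (ns) of s2 samples."""
    offsets = [int(value) for value in OFFSET_RE.findall(log_text)]
    if not offsets:
        raise ValueError("no servo lines in state s2 found")
    rms = math.sqrt(sum(o * o for o in offsets) / len(offsets))
    return {
        "count": len(offsets),
        "max_abs_ns": max(abs(o) for o in offsets),
        "rms_ns": rms,
    }


LOG = """\
ptp4l[48.343]: master offset          5 s2 freq  +83715 path delay       484
ptp4l[48.468]: master offset         -3 s2 freq  +83705 path delay       485
ptp4l[48.593]: master offset          0 s2 freq  +83708 path delay       485
ptp4l[48.718]: master offset          1 s2 freq  +83710 path delay       485
ptp4l[48.844]: master offset          1 s2 freq  +83710 path delay       485
ptp4l[48.969]: master offset         -5 s2 freq  +83702 path delay       485
ptp4l[49.094]: master offset          3 s2 freq  +83712 path delay       485
ptp4l[49.219]: master offset          4 s2 freq  +83714 path delay       485
ptp4l[49.344]: master offset         -5 s2 freq  +83702 path delay       485
ptp4l[49.469]: master offset          3 s2 freq  +83713 path delay       487
"""

stats = offset_stats(LOG)
print(f"{stats['count']} samples, max |offset| {stats['max_abs_ns']} ns, "
      f"rms {stats['rms_ns']:.2f} ns")
```

Here the offsets stay within a few nanoseconds and freq settles near +83710 ppb instead of sitting at a limit, in contrast to the logs earlier in this thread where the offset was in the hundreds of milliseconds and growing.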