Closed Norman03 closed 4 years ago
There are many ways to set this up, depending on what you want, and Buildroot already supports several NTP clients. What stratum server will you be connecting to? Do you have a particular accuracy requirement? Do you need NTP, or is SNTP fine? Have you tried to set up e.g. chrony + timemaster in conjunction with phc2sys? I think you are confusing some terms when you are asking about RTC, or at least I don't understand the request for an OpenIL release with "RTC fix". The RTC is a persistent clock source that is physically lacking on the board. The cores can keep system time while powered on but not while power is removed.
Why do you need both an RTC and NTP?
Thanks. As you suggested, I was able to recompile the image with the NTP kernel config enabled. I performed the steps below and still see a large phc2sys offset on my slave PCs.
Using NTP on the NXP switch, the system clock is updated with the current UTC time, but the hardware clock on the switch is not synced. The hardware clock is still at the epoch, and PTP distributes this to all the slaves. Is my understanding correct?
Can you explain how to sync the NTP time to the hardware clock (epoch) in the NXP switch?
Regards, Norman
When I tried to sync the hardware clock (PTP) to the system clock using phc2sys, I am seeing a large offset.
What command are you using for that? What happens if you run phc2sys -a -r -r -m?
I am using this command on the PC: phc2sys -s enp1s0 -w -mq
And what is the output/your expected result? The point in specifying -r twice is that phc2sys will serve the system time (disciplined by NTP) over PTP.
man phc2sys
-r Only valid together with the -a option. Instructs phc2sys to also synchronize the system clock (CLOCK_REALTIME). By default, the system clock is not considered as a possible time
source. If you want the system clock to be eligible to become a time source, specify the -r option twice.
Just a hint: if you want to use the device as a true PTP switch (including swp0 - swp3) you might want to use a community kernel. There are issues at the moment which prevent that from being integrated into the OpenIL 4.14 kernel version.
I'm still getting a large offset.
Thanks for sharing the kernel. I'm compiling the SD card image with this kernel, and I will try to sync the NXP switch (PTP master) with the other PCs (PTP slaves) and sync each PC's system clock with its PTP hardware clock using phc2sys.
@vladimiroltean As you suggested, I cloned the community kernel and configured it with make menuconfig. During image compilation the EULA package takes a very long time to compile, and after many hours there was still no progress. See the screenshot in the attachment.
Do you have any idea on this?
On one hand, you read 'ELUA' with a typo: it is eLua (embedded Lua interpreter) rather than EULA (end user license agreement).
On the other hand, I have no idea why you are building the efl package in the first place. Are you sure you are building for the nxp_ls1021atsn_defconfig target?
But you do have a point, and this is that it's not trivial to test the community kernel. However you shouldn't expect it to be. But that doesn't mean the OpenIL target for the LS1021A-TSN isn't slightly messy right now - it is. So I took some time, I forked the OpenIL build system, and made it build the community kernel by default (along with many other cleanups and upgrades which were necessary to support newer packages such as iproute2-next).
I've build-tested the thing a few times, and I ran it for almost a day, but it's still possible there might be some issues. Please let me know if you're facing any problems with it.
Of course, to compile, just do:
make nxp_ls1021atsn_defconfig
make
# use output/images/sdcard.img
@vladimiroltean While building sdcard.img with the commands you suggested above, there were build errors while >>> host-nodejs 10.16.3 was building.
openIL version(community kernel): https://github.com/vladimiroltean/openil
Commands used to build:
make nxp_ls1021atsn_defconfig
make
Can you please help me to solve this?
/usr/bin/ld: final link failed: Symbol needs debug section which does not exist
That suggests a problem with the toolchain. I have not encountered that while building.
Did you by any chance update the repository in the middle of the compilation process? I made some force-pushes to it, including a toolchain change (externally downloaded -> compiled by buildroot) since it was needed for a change in kernel headers.
What is the HEAD of your master branch currently pointing at (git show)? It should be at 846349af3cac99f095f2f52d0773ed588b512f35 (board: nxp: ls1021a-tsn: Use linux-headers 5.2 package from kernel.org).
If it isn't pointing to this commit, could you please restart the build with the latest settings so that all packages are in a consistent state? That would entail:
rm -rf output
git fetch origin
git reset --hard origin/master
make nxp_ls1021atsn_defconfig
make
Sorry for the trouble with the force-pushing. I'll do further changes on a devel branch and try to keep a linear history for master.
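A small guard along these lines can tell you whether the tree is at the commit mentioned above before you spend hours on a build. It is only a sketch: is_expected_commit is a hypothetical helper, and the 7-character minimum for abbreviated hashes is an assumption, not something git enforces.

```shell
# Commit hash quoted in the thread above.
EXPECTED=846349af3cac99f095f2f52d0773ed588b512f35

# is_expected_commit HASH
# Succeeds when HASH (full or abbreviated, at least 7 characters)
# is a prefix of EXPECTED.
is_expected_commit() {
    [ "${#1}" -ge 7 ] || return 1
    case "$EXPECTED" in
        "$1"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Typical use inside the checkout (left as a comment, since it needs the repo):
#   if is_expected_commit "$(git rev-parse HEAD)"; then
#       echo "HEAD is at the expected commit"
#   else
#       echo "HEAD differs: do the clean rebuild"
#   fi
```

If the check fails, the rm -rf output / git fetch / git reset --hard sequence above is the way to get back to a consistent state.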
@vladimiroltean 846349a is the commit the current build is running on. In the kernel .config file I have changed BR2_PACKAGE_NODEJS=y to BR2_PACKAGE_NODEJS is not set.
Basically I skipped the nodejs package, and now the build is progressing. Is this package a hard dependency?
Ok, let me rephrase. Does this error occur with a completely clean build?
No, nodejs is a web server runtime. You don't need it. And it's not kernel config, it's buildroot config.
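For reference, switching a package off in the Buildroot configuration can be scripted roughly as below. This is a sketch, not part of the OpenIL tooling: disable_br2_option is a hypothetical helper, and it assumes an in-tree build where .config sits at the top of the checkout. Keep in mind that re-running make nxp_ls1021atsn_defconfig regenerates .config and undoes the change.

```shell
# disable_br2_option CONFIG_FILE OPTION
# Rewrites "OPTION=y" to "# OPTION is not set" in a Buildroot .config file.
disable_br2_option() {
    cfg=$1
    opt=$2
    tmp="$cfg.tmp.$$"
    # Only an exact "OPTION=y" line is rewritten; everything else is copied as-is.
    sed "s/^$opt=y\$/# $opt is not set/" "$cfg" > "$tmp" && mv "$tmp" "$cfg"
}

# In the OpenIL tree one would then run (comments only, needs the tree):
#   disable_br2_option .config BR2_PACKAGE_NODEJS
#   make olddefconfig   # let Buildroot re-resolve dependent options
#   make
```

Using make menuconfig and deselecting the package interactively achieves the same thing.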
I have just started a rebuild after cleaning and fetching the latest master. I'll let you know if any error occurs.
Ok.
Thanks for your prompt support.
@vladimiroltean Meanwhile I have a question: after successfully building the SD card image (community kernel), I'm able to boot the LS1021ATSN switch. During boot-up I see the following logs.
From this I'm able to understand,
Why does the EEPROM have an invalid ID? Am I doing anything wrong here?
The EEPROM has invalid ID because that's how it comes out of the factory. I don't know why beyond that. It is documented in the U-Boot board README file how you can set the MAC addresses persistently. You only do it once.
@vladimiroltean, Now I'm trying to run the PTP master on the LS1021ATSN switch and connect the other PCs as PTP slaves on the TSN switched ports.
master: ptp4l -i eth2 -2 -mq
Slave: ptp4l -i eth1 -2 -mq -s
The slave was not able to detect the master node.
So the community version of OpenIL for LS1021A-TSN doesn't use /etc/init.d, but systemd (as you'll find out if you open the README file in that folder).
With the switch ports in the DSA kernel driver, you are not supposed to run ptp4l over eth2 (which is only a control interface). You are supposed to run ptp4l over swp2, swp3, swp4, swp5.
Look around first, list the Ethernet interfaces, make sure they are up (eth2 needs to be up in order for switch net devices to be up), put an IP on br0, see if you can ping, etc.
Read the DSA documentation and the driver documentation.
Then finally see the /lib/systemd/system/linuxptp.service and /etc/linuxptp.cfg files.
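The bring-up order described above (DSA master first, then the switch ports, then the bridge) could be scripted roughly like this. It is a sketch only: run and bring_up_switch are hypothetical helpers, and by default the script just prints the ip commands instead of executing them, so it is safe to try on any machine.

```shell
# run CMD...: print the command when DRY_RUN=1 (the default here),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# bring_up_switch MASTER BRIDGE PORT...
bring_up_switch() {
    master=$1
    bridge=$2
    shift 2
    run ip link set dev "$master" up      # DSA master (eth2) must come first
    for port in "$@"; do
        run ip link set dev "$port" up    # then each switch port
    done
    run ip link set dev "$bridge" up      # finally the bridge
}

# Prints the six commands in order; set DRY_RUN=0 as root to apply them.
DRY_RUN=1 bring_up_switch eth2 br0 swp2 swp3 swp4 swp5
```

After that, an address can be put on br0 with ip addr add (pick one that fits your network) and connectivity checked with ping.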
The linuxptp service has been customized for the switch to operate as a P2P_TC, since the device's primary use case is as a switch. It cannot be a grandmaster in this mode. Switches can only be grandmasters in 802.1AS, which is not supported in linuxptp yet. So you might need to change it. The linuxptp-system-clock service (phc2sys) has not been customized at all. You will definitely need to adapt that.
To activate the services:
systemctl enable linuxptp
systemctl start linuxptp
systemctl enable linuxptp-system-clock
systemctl start linuxptp-system-clock
To monitor them:
journalctl -b -u linuxptp -f
journalctl -b -u linuxptp-system-clock -f
@vladimiroltean The br0 interface is up and I can ping the slave PCs from the LS1021ATSN switch. Now I have connected one PC (PTP master) and the LS1021ATSN switch (PTP slave) and started the ptp service. The switch is not able to send a delay response and I'm not able to see the offset in the log file.
The log file shows: selected local clock 00049f.fffe.ef0606 as best master
Up? No. You need to bring it up with ip link set dev br0 up (same as all others, I think).
Present? Yes, due to the systemd-networkd configuration files that are pre-installed.
selected /dev/ptp0 as PTP clock
That is the eTSEC PTP clock (eth0, eth1). Which ports have you kept in /etc/linuxptp.cfg? It is possible to make the device be a switch across both /dev/ptp0 and /dev/ptp1, but it is a bit more complicated: you will need another instance of phc2sys that keeps them in sync.
Can you draw a diagram of the 1588 network you're trying to establish, so I can help you customize the ptp4l daemon accordingly?
Also please make sure that all devices are speaking the same protocol (1588 - not 802.1AS, L2 transport, peer delay).
@vladimiroltean, I'm trying to bring up this architecture.
ptp4l configuration file:
Architecture:
Ok, so swp2 is a 1588 endpoint (ordinary clock)?
In that case, what happens if you try the following in /etc/linuxptp:
[global]
delay_mechanism P2P
clock_type OC
network_transport L2
time_stamping hardware
step_threshold 1.0
tx_timestamp_timeout 10
[swp2]
Please keep in mind that the UDPv4 transport you specified in the above picture will not work. You need to match transports in the entire PTP network, and the switch ports only support L2.
@vladimiroltean I tried the following. I'm still getting the same logs. I'm trying to understand the "selected local clock 00049f.fffe.ef0808 as best master" log.
That message means that a timeout expired and the port saw no ANNOUNCE messages (or there were ANNOUNCE messages of lower priority) and BMCA decided that the grandmaster should be itself.
Let me ask you again: Is eth2 up? Is swp2 up? They both need to be brought up, in this order.
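To see at a glance which of the two outcomes a ptp4l log ended up with, something like the filter below can help. It is a sketch keyed to the two message formats quoted in this thread; best_master_kind is a hypothetical helper name.

```shell
# best_master_kind: reads ptp4l log text on stdin and prints
#   "local"   if the most recent selection was the local clock,
#   "foreign" if it was a foreign master,
#   "none"    if no selection line was seen.
best_master_kind() {
    awk '
        /selected local clock .* as best master/ { kind = "local" }
        /selected best master clock/             { kind = "foreign" }
        END { print (kind == "" ? "none" : kind) }
    '
}

# Example with a line taken from the thread:
printf '%s\n' 'selected local clock 00049f.fffe.ef0606 as best master' | best_master_kind
```

A result of "local" on a port that is supposed to be a slave points at missing or lower-priority ANNOUNCE messages, which is where the interface state and protocol mismatch checks come in.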
@vladimiroltean I can see that the interfaces are up.
Ok, I can reproduce the issue. Let me see what's going on.
My mistake, I did not actually reproduce an issue but I was testing on another port. It does work fine for me. I made a mistake and I wrote tx_timestamp_threshold instead of tx_timestamp_timeout above, but I corrected it.
Can you share more details about the PTP master? I still don't think they are using the same protocol. If you stop the linuxptp service, and tcpdump -i swp2, do you see anything?
@vladimiroltean I am using the ptp4l -i <INFNAME> -mq -2 command on the PTP master.
After stopping the ptp4l service on the LS1021ATSN I'm able to see this:
If you had set up the devices as per the above suggestion, you would have received these warnings: On the switch:
ptp4l[3684.670]: port 1: delay request on P2P port
and on the Intel card:
ptp4l[6657118.241]: port 1: pdelay_req on E2E port
By looking at tcpdump I am now convinced that the Intel device speaks E2E over L2, like you indicated. But I am still not convinced that the switch speaks the same protocol. With the service still disabled, what happens if you run this on the switch:
[root@OpenIL ~] # ptp4l -i swp2 -2 -m --tx_timestamp_timeout 10 -s
ptp4l[3953.028]: selected /dev/ptp1 as PTP clock
ptp4l[3953.130]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[3953.131]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[3953.458]: port 1: new foreign master 6805ca.fffe.39cdca-1
ptp4l[3957.458]: selected best master clock 6805ca.fffe.39cdca
ptp4l[3957.458]: running in a temporal vortex
ptp4l[3957.458]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[3960.458]: master offset -18675993397429258 s0 freq +16672 path delay 3556
ptp4l[3961.458]: master offset -18675993397459726 s1 freq -13797 path delay 3760
ptp4l[3962.458]: master offset 170 s2 freq -13627 path delay 3760
ptp4l[3962.458]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[3963.458]: master offset 746 s2 freq -13000 path delay 3232
ptp4l[3964.458]: master offset 154 s2 freq -13368 path delay 3232
ptp4l[3965.458]: master offset 2068 s2 freq -11408 path delay 1118
ptp4l[3966.458]: master offset 814 s2 freq -12041 path delay 180
ptp4l[3967.458]: master offset -432 s2 freq -13043 path delay -142
ptp4l[3968.458]: master offset -714 s2 freq -13455 path delay -372
ptp4l[3969.458]: master offset -440 s2 freq -13395 path delay -782
ptp4l[3970.458]: master offset -664 s2 freq -13751 path delay -742
ptp4l[3971.458]: master offset -496 s2 freq -13782 path delay -742
ptp4l[3972.458]: master offset -288 s2 freq -13723 path delay -742
ptp4l[3973.458]: master offset -104 s2 freq -13625 path delay -782
ptp4l[3974.458]: master offset -36 s2 freq -13588 path delay -802
ptp4l[3975.458]: master offset 30 s2 freq -13533 path delay -844
ptp4l[3976.458]: master offset 6 s2 freq -13548 path delay -844
ptp4l[3977.458]: master offset -8 s2 freq -13560 path delay -854
ptp4l[3978.458]: master offset -32 s2 freq -13587 path delay -854
ptp4l[3979.458]: master offset -8 s2 freq -13572 path delay -854
ptp4l[3980.458]: master offset -8 s2 freq -13575 path delay -854
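A log like the one above can be summarized with a short awk filter. The sketch below looks only at lines in servo state s2 (locked) and reports how many there are and the largest absolute master offset, which ptp4l prints in nanoseconds; locked_offset_stats is a hypothetical helper name.

```shell
# locked_offset_stats: reads ptp4l -m output on stdin and prints the number of
# "master offset" samples in servo state s2 and the largest |offset| among them.
locked_offset_stats() {
    awk '
    {
        # Locate "master offset <value> <state>" anywhere on the line, so that
        # journalctl prefixes do not matter.
        for (i = 1; i + 3 <= NF; i++) {
            if ($i == "master" && $(i + 1) == "offset" && $(i + 3) == "s2") {
                v = $(i + 2) + 0
                if (v < 0) v = -v
                if (n == 0 || v > max) max = v
                n++
                break
            }
        }
    }
    END {
        if (n == 0) print "samples=0"
        else        print "samples=" n " max_abs_offset_ns=" max
    }
    '
}

# Usage (needs a saved log, so shown as a comment):
#   locked_offset_stats < ptp4l.log
```

The huge offsets in the s0 and s1 lines are ignored on purpose: they reflect the initial time difference before the clock is stepped, not the steady-state quality.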
@vladimiroltean If I run
Master (Intel card): ptp4l -i <INFNAME> -mq -2
Slave (LS1021ATSN): ptp4l -i swp2 -2 -m --tx_timestamp_timeout 10 -s
I get the same master offset output as you do.
Well, I guess problem solved, then? Just transpose the slave settings into a blank /etc/linuxptp.cfg file:
-2 becomes network_transport L2
drop the leading -- from --tx_timestamp_timeout 10
-s becomes slaveOnly 1
delay_mechanism P2P in both the master and the slave.
While you add these settings to the config file you will probably notice which one was set incorrectly.
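The flag-to-option mapping can also be written down as a small translator. This is a rough sketch covering only a handful of ptp4l flags; ptp4l_flags_to_cfg is a hypothetical helper, it assumes every long option is given as "--name value", and it deliberately ignores the logging switches.

```shell
# ptp4l_flags_to_cfg FLAG...
# Prints a ptp4l configuration file equivalent to the given command-line flags.
ptp4l_flags_to_cfg() {
    ports=""
    echo "[global]"
    while [ $# -gt 0 ]; do
        case "$1" in
            -2) echo "network_transport L2" ;;
            -4) echo "network_transport UDPv4" ;;
            -s) echo "slaveOnly 1" ;;
            -P) echo "delay_mechanism P2P" ;;
            -E) echo "delay_mechanism E2E" ;;
            -H) echo "time_stamping hardware" ;;
            -i) ports="$ports $2"; shift ;;        # each interface becomes a section
            --*) echo "${1#--} $2"; shift ;;        # long option: strip the leading --
            -m|-q|-mq|-qm) ;;                       # logging switches, skipped in this sketch
            *) echo "unhandled flag: $1" >&2; return 1 ;;
        esac
        shift
    done
    for p in $ports; do
        echo "[$p]"
    done
}

# The slave command from this thread:
ptp4l_flags_to_cfg -i swp2 -2 -m --tx_timestamp_timeout 10 -s
```

Redirecting the output into a file gives a starting point for /etc/linuxptp.cfg, to which settings such as delay_mechanism can then be added by hand.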
Done! Things are working now. If I need to add a slave on swp3, I have to run the command below on the slave PC whose Intel card is connected to swp3. Am I right?
ptp4l -i <IFNAME> -m tx_timestamp_timeout 10 -2 -s -P
No, if you want to add further slave devices to the switch, then the clock_type is no longer an OC (ordinary clock), but either a BC (Boundary Clock) or a P2P_TC (Transparent Clock), because you want the switch not only to synchronize to the master*, but also relay time to the other slaves.
And you don't start another ptp4l instance, you just specify multiple interfaces (-i swp2 -i swp3).
So look at the variety of predefined linuxptp configs, and pick your assortment.
*When in P2P_TC mode, the transparent clock does not synchronize to the master unless free_running is 0.
@vladimiroltean TSN requires switches that are 802.1AS compliant. Is it possible to time sync using 802.1AS?
I don't really understand 802.1AS, but I think it is a misconception that TSN requires it. TSN requires synchronized clocks across the network. This is so that switches may enforce time-based admission control and scheduling for offloaded traffic. And 1588 does the job just fine for that. Anyway, as far as my 802.1AS understanding goes, the synchronization is only 'logical', since all devices timestamp based on free-running clocks, and correct those based on cumulativeScaledRateOffset from the follow up information TLV that is present in the PTP messages. But since all PTP hardware clocks are fundamentally still free-running, how would 802.1Qbv/802.1Qci still work? I think the goals are conflicting. Please change my mind.
Two major points:
1. Considering the above point, and that 802.1AS is a more advanced profile than IEEE 1588, you cannot connect IEEE 1588 and 802.1AS devices together directly. There would need to be a bridge, basically special hardware, to connect both in the same network.
2. The TSN standards are 802.1 (AS, Qbv, Qbu, etc.), so a TSN device should support 802.1AS.
Point #1 is circular and does not in fact bring any argument: you need to support 802.1AS because others support 802.1AS too. It does not say why 802.1AS itself would be better. Point #2 does not explain what the benefits of 802.1AS synchronization algorithms are. Furthermore, my current understanding of them tells me that they cannot possibly inter-operate with the goals of 802.1Qbv. If you can explain how the 2 can be reconciled, and whether an 802.1AS bridge with a hardware-synchronized PTP clock can be built (which can be used to trigger gate open events for Qbv), I'm all ears.
@vladimiroltean By following the steps above, I'm able to see ptp4l time synchronization, but it's not stable. The linuxptp service has been customized for the switch to operate as a P2P_TC, and the other PCs (PTP slaves) are connected to the switch. After 20 minutes I see "tc failed to forward message in port 1" in the linuxptp log file.
Is there an associated kernel log to go along with this error? Or are the error messages restricted to ptp4l? Could you share the exact configuration file?
@vladimiroltean There is no kernel log associated with this.
/etc/linuxptp.cfg
[global]
slaveOnly 1
delay_mechanism P2P
network_transport L2
tx_timestamp_timeout 20
clock_type P2P_TC
utc_offset 36
[swp2]
[swp3]
[swp4]
When running the ptp4l command directly instead of the service, I see 'port2: the link is down', but the interfaces are up and the master is running. I don't understand the issue here. Could you please help me solve this?
Ok, so it is an application-level issue.
So you are running ptp4l on swp2, swp3 and swp4, but the link is down on one of the interfaces.
You need to think about what a transparent clock does. It receives sync frames on one port, and forwards them on all other ports, since it has no way of knowing which ports have slaves interested in those frames and which don't.
In effect, that means it will attempt to send frames even over interfaces that are down.
I see your configuration file is missing tc_spanning_tree 1, which in effect starts keeping track of the topology and should avoid sending frames on interfaces it does not need to. Here is the explanation with which this setting was added:
commit e6af4608c4d672490398a8cbcb17b8ee5033c191
Author: Richard Cochran <richardcochran@gmail.com>
Date: Mon Apr 16 16:20:06 2018 -0700
config: Add a configuration option for preventing loops in TC mode.
According to 1588, PTP message loops are simply someone else's problem
with respect to transparent clocks. Since we are running the BMCA for
syntonization anyway, we might as well go ahead and implement the spanning
tree for PTP messages.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
So I guess you need to enable the tc_spanning_tree setting, and then figure out why your link is down.
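When chasing a "link is down" message while the interfaces look up, it helps to separate "administratively up" from "has carrier". A quick sketch using the files Linux exposes under /sys/class/net follows; link_state is a hypothetical helper, and the optional second argument only exists so the function can be pointed at a different directory.

```shell
# link_state IFACE [SYSFS_NET_DIR]
# Prints "absent", "admin-down", "no-carrier" or "up" for a network interface.
link_state() {
    dir="${2:-/sys/class/net}/$1"
    [ -d "$dir" ] || { echo absent; return; }
    # The carrier file cannot be read while the interface is administratively down.
    carrier=$(cat "$dir/carrier" 2>/dev/null) || { echo admin-down; return; }
    if [ "$carrier" = 1 ]; then
        echo up
    else
        echo no-carrier
    fi
}

# Ports from the configuration file above; on a machine without these
# interfaces each one is simply reported as absent.
for port in swp2 swp3 swp4; do
    echo "$port: $(link_state "$port")"
done
```

A port that reports no-carrier is up as far as ip link set is concerned but has no link partner, which would be consistent with ptp4l complaining about the link.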
To identify the problem I simplified the architecture: I connected the ptp4l master PC's interface to swp2 (slave) on the switch and started synchronization (no other slaves are connected; the network has one slave, the LS1021ATSN, and one master). After a few minutes I see a "rogue peer delay response" log in the master PC's terminal, and the same on the switch, as given below.
A few days ago there was this discussion on linuxptp-users related to rogue peer delay responses. I wonder whether the issue is the same.
@vladimiroltean Do you have any idea on "rogue peer delay responses" on master PC?
@vladimiroltean The rogue peer delay response error is fixed. Now I'm not able to sync all the slave PCs. The switch itself is working as a slave and synchronization is done. The P2P_TC clock type is set on the switch, but the slave PCs' log says "selected best master clock 6805ca.fffe.8cfd92".
slave PC command: ptp4l -i
synchronization is done
Synchronization is never "done", it is a continuous process. You mean that the switch prints selected best master clock 6805ca.fffe.8cfd92 and then stops? I have seen that behavior before, but right now I can't seem to be able to reproduce it. I've been running the P2P_TC on a switch for a couple of hours now and it still works:
Feb 15 12:30:01 OpenIL ptp4l[548]: [94686.287] rms 13 max 18 freq -19744 +/- 11 delay 745 +/- 1
Feb 15 12:30:03 OpenIL ptp4l[548]: [94688.287] rms 3 max 4 freq -19740 +/- 1 delay 746 +/- 0
What are your linuxptp endpoint (master and slave) settings and software version? Mine are:
[global]
#
# Default Data Set
#
slaveOnly 0
socket_priority 0
#
# Run time options
#
tx_timestamp_timeout 10
#
# Servo Options
#
step_threshold 0.00002
first_step_threshold 0.00002
#
# Default interface options
#
clock_type OC
network_transport L2
delay_mechanism P2P
#
# Clock description
#
productDescription ;;
revisionData ;;
manufacturerIdentity 00:00:00
userDescription ;
timeSource 0xA0
Does the synchronization on the switch stop spontaneously, or is it anything in particular that triggers it? Does it stop when no slaves are connected? 1 slave? 2 slaves? Can you collect another ptp4l log when it stops, but with "-l 7"?
I managed to reproduce it after all. Looks like increasing the logSyncInterval makes it reproduce faster. The ptp4l process appears to completely freeze, although I don't understand why yet. I will come back with a conclusion after some debugging.
Hello,
Currently we are using the LS1021ATSN switch with OpenIL. We need steps to run NTP where the internet is connected to the non-switched TSN ports (eth0 and eth1), and to sync the global NTP time with the hardware time (including the switched TSN ports eth2 to eth5).
We also require support for an RTC, so that we get the current time on every boot. I understand that an RTC is not available on the LS1021ATSN switch. What is the workaround, instead of using external I2C hardware?
Are there any OpenIL releases planned with this RTC fix? If so, when can we expect them?
Regards, Norman