nxp-archive / openil_linuxptp

PTP IEEE 1588 stack for Linux
GNU General Public License v2.0

Master offset is big in the slave during the start #6

Closed kingalexsylvester closed 5 years ago

kingalexsylvester commented 5 years ago

My master offset is sometimes very big: ptp4l[741.578]: master offset 375396363 s2 freq +1000000 path delay -17200861

and the nodes take a long time to get within the nanosecond range of the master clock. Is it possible to speed up the servo steering by changing configuration parameters, so that the offset above comes within +/-10 ns in a few seconds?
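To put the quoted log line in numbers, here is a rough back-of-the-envelope sketch. It assumes that `freq` in the ptp4l output is a frequency adjustment in parts per billion, that the offset is in nanoseconds, and that the servo keeps slewing at that rate without stepping the clock. The function name `slew_time_seconds` is made up for illustration.

```python
# Values copied from the log line quoted above.
offset_ns = 375_396_363   # "master offset", taken to be nanoseconds
freq_ppb = 1_000_000      # "freq", assumed to be parts per billion


def slew_time_seconds(offset_ns, freq_ppb):
    """Seconds needed to remove offset_ns by slewing at a constant freq_ppb."""
    # An adjustment of 1 ppb removes 1 ns of offset per second.
    return abs(offset_ns) / abs(freq_ppb)


print(f"{slew_time_seconds(offset_ns, freq_ppb):.1f} s")  # prints "375.4 s"
```

Under those assumptions, slewing alone would need several minutes for this offset, which is why correcting it with a clock step matters for a boot-time budget of a few seconds.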

Use Case:

In a factory plant there are 8 devices connected through 3 TSN switches. When the system boots, synchronization should complete within 3-5 seconds so that the plant reaches its operational state in <7 s (hard requirement).

Our nodes and switches are not connected to NTP and do not have an RTC, so sometimes one or more of the nodes has a very large master offset and takes longer to synchronize.

This item is killing our product release and we need immediate help here.

vladimiroltean commented 5 years ago

Hi there, What board and Ethernet ports (i.e. driver) are you using? Normally this is fixed by setting "step_threshold" to 0.00002 or lower in the ptp4l config file. You should also note that linuxptp in general does not behave well if the PHC ticks around Jan 1st 1970 (which it does when it gets out of reset). You might want to do something like "phc_ctl /dev/ptp0 set" before starting the ptp4l service, so that it has an initial time based on the RTC that is "in the ballpark".
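A minimal sketch of that suggestion, written as a small Python helper that renders the config text and the start-up command sequence. The interface name `eth0`, the config path `/etc/ptp4l.conf`, and the helper names are assumptions for illustration only; `step_threshold`, `phc_ctl /dev/ptp0 set`, and the `ptp4l` options `-f`, `-i` and `-m` are the only linuxptp pieces used, and the ptp4l man page remains the reference for their exact meaning.

```python
def render_config(step_threshold=0.00002):
    """Return a minimal ptp4l config body that sets only step_threshold (seconds)."""
    return "[global]\nstep_threshold {:.9f}\n".format(step_threshold)


def startup_commands(phc="/dev/ptp0", iface="eth0", config="/etc/ptp4l.conf"):
    """Commands to run in order: seed the PHC from system time, then start ptp4l."""
    return [
        ["phc_ctl", phc, "set"],                      # give the PHC a sane initial time
        ["ptp4l", "-f", config, "-i", iface, "-m"],   # -m prints messages to stdout
    ]


if __name__ == "__main__":
    print(render_config(), end="")
    for cmd in startup_commands():
        print(" ".join(cmd))
```

The commands are returned as argument lists so they can be passed to `subprocess.run` unchanged on a target where both tools are installed.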

kingalexsylvester commented 5 years ago

Hi,

Thanks for the quick reply

The boards and switches used in my project are confidential. Also my setup has both hw and sw timestamp devices.

I asked this question to guard against a rogue device entering the network, taking over the GM role, and distributing its clock to the entire network, which might lead to a lethal situation. Our factory plant must not be out of sync for more than 7 seconds; failing that will tear our system apart.

So here are my action points:

  1. I will change "step_threshold" to 0.00002 on all the devices that are being built in our plant.

  2. Introduce a rogue device by setting its time to the epoch and forcing it to become GM. My acceptance criterion will be whether the entire network comes back into sync in the nanosecond range (hw timestamp nodes only) within 7 seconds.

  3. Do a power cycle test for 100 iterations to understand the reliability of ptp4l.

Please do let me know if I can alter any other parameters or perform any other regression test which will make the system more reliable.
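For the acceptance tests listed above, the time to sync can be measured from the ptp4l output itself. The sketch below parses lines in the format of the log line quoted at the top of this issue and reports how long it took until the offset stayed within a limit. The regular expression is derived from that one sample line, so other message formats are not covered, and the names `parse_line` and `time_to_sync` are hypothetical.

```python
import re

# Pattern derived from the single sample line in this issue.
LINE_RE = re.compile(
    r"ptp4l\[(?P<t>\d+\.\d+)\]: master offset\s+(?P<offset>-?\d+)\s+"
    r"s(?P<state>\d)\s+freq\s+(?P<freq>[+-]?\d+)\s+path delay\s+(?P<delay>-?\d+)"
)


def parse_line(line):
    """Return the fields of one 'master offset' line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "t": float(m["t"]),
        "offset_ns": int(m["offset"]),
        "state": int(m["state"]),
        "freq": int(m["freq"]),
        "path_delay_ns": int(m["delay"]),
    }


def time_to_sync(lines, limit_ns=10):
    """Seconds from the first sample until the offset stays within limit_ns.

    Returns None if the log never settles (or has no matching lines).
    """
    samples = [s for s in map(parse_line, lines) if s is not None]
    if not samples:
        return None
    settled_at = None
    for s in samples:
        if abs(s["offset_ns"]) <= limit_ns:
            if settled_at is None:
                settled_at = s["t"]
        else:
            settled_at = None  # left the window again, restart the search
    return None if settled_at is None else settled_at - samples[0]["t"]


if __name__ == "__main__":
    sample = ("ptp4l[741.578]: master offset 375396363 s2 "
              "freq +1000000 path delay -17200861")
    print(parse_line(sample))
```

Running this over the captured output of each power cycle gives one number per iteration that can be compared against the 7 second requirement.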

jagmeethanspal commented 5 years ago

I believe the clock servo might have parameters that trade off speed against accuracy. You could tune it so that you start with a faster speed and then dynamically (as you approach, e.g., the halfway point) move towards higher accuracy and finer adjustments. I do not have exact parameters though, just an idea.
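To illustrate the kind of trade-off meant here, below is a toy simulation of a generic discrete PI controller steering a clock offset. It is not the linuxptp servo: the model, the gain values, the 50 ppb drift and the function names are all assumptions chosen for illustration. As far as I know, the ptp4l PI servo exposes gain options such as `pi_proportional_const` and `pi_integral_const`, but check the ptp4l man page before relying on that.

```python
def simulate_pi(kp, ki, offset_ns=1000.0, drift_ppb=50.0, steps=500, interval_s=1.0):
    """Toy PI loop: return the offset (ns) seen at each sync interval."""
    integral = 0.0
    history = []
    for _ in range(steps):
        history.append(offset_ns)
        integral += ki * offset_ns
        adj_ppb = -(kp * offset_ns + integral)   # frequency correction in ppb
        # 1 ppb of net frequency error moves the offset by 1 ns per second.
        offset_ns += (drift_ppb + adj_ppb) * interval_s
    return history


def settle_step(history, limit_ns=10.0):
    """Index of the first sample from which |offset| stays within limit_ns."""
    last_bad = -1
    for i, value in enumerate(history):
        if abs(value) > limit_ns:
            last_bad = i
    return last_bad + 1


if __name__ == "__main__":
    for kp, ki in [(0.7, 0.3), (0.1, 0.01)]:
        steps = settle_step(simulate_pi(kp, ki))
        print(f"kp={kp} ki={ki}: within 10 ns after {steps} intervals")
```

In this toy model the higher gains settle in far fewer intervals. With noisy timestamps, higher gains would also pass more of that noise into the clock, which is the accuracy side of the trade-off; the noiseless model does not show that part.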

Regards, Jagmeet


kingalexsylvester commented 5 years ago

@jagmeethanspal : Could you please brief me on the parameters and how to use them to tame my system?

jagmeethanspal commented 5 years ago

Hi,

Not exactly sure, but you could try tuning the Loop Bandwidth / Gain in the locking algorithm/servo:

Regards, Jagmeet



kingalexsylvester commented 5 years ago

@jagmeethanspal : I understand that ptp4l has a static configuration and that the configuration parameters cannot be changed at run time. So if I change the Loop Bandwidth / Gain, it will stay the same for the entire run, and I don't think ptp4l provides a way to alter the speed dynamically. Correct me if I am wrong.

kingalexsylvester commented 5 years ago

Hello Everyone,

I found this to be a generic linuxptp question, so I am closing this issue at openIL.

jagmeethanspal commented 5 years ago

Just curious, did the following options help with convergence: