Open jclark opened 2 years ago
The kernel gets into a state where it delivers 4 extts events in a second:
Nov 7 20:09:45 ricotta ts2phc: [1911220.240] eth0 extts index 0 at 1667826586.175084690 corr 0 src 1667826622.93314787 diff -35824915310
Nov 7 20:09:45 ricotta ts2phc: [1911220.240] eth0 master offset -35824915310 s2 freq -100000000
Nov 7 20:09:45 ricotta ts2phc: [1911220.492] eth0 extts index 0 at 1667826586.175084690 corr 0 src 1667826622.345345651 diff -35824915310
Nov 7 20:09:45 ricotta ts2phc: [1911220.492] eth0 master offset -35824915310 s2 freq -100000000
Nov 7 20:09:45 ricotta ts2phc: [1911220.744] eth0 extts index 0 at 1667826586.175084690 corr 0 src 1667826623.597359645 diff -36824915310
Nov 7 20:09:45 ricotta ts2phc: [1911220.744] eth0 master offset -36824915310 s2 freq -100000000
Nov 7 20:09:45 ricotta ts2phc: [1911220.996] eth0 extts index 0 at 1667826586.175084690 corr 0 src 1667826623.849334398 diff -36824915310
Nov 7 20:09:45 ricotta ts2phc: [1911220.996] eth0 master offset -36824915310 s2 freq -100000000
This, not surprisingly, confuses ts2phc.
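The repetition in the excerpt above can be checked mechanically. The following is a minimal Python sketch, not part of ts2phc, that counts how often each extts timestamp appears in a log. The regular expression is inferred only from the lines quoted above, and the names `repeated_extts` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the "extts index" lines quoted in this issue;
# other ts2phc versions may format these lines differently.
EXTTS_RE = re.compile(
    r"\] (?P<iface>\S+) extts index (?P<index>\d+) "
    r"at (?P<sec>\d+)\.(?P<nsec>\d+) corr"
)


def repeated_extts(lines):
    """Return {(iface, index, sec, nsec): count} for timestamps seen more than once."""
    counts = Counter()
    for line in lines:
        m = EXTTS_RE.search(line)
        if m:  # "master offset" lines do not match and are skipped
            key = (m["iface"], int(m["index"]), int(m["sec"]), int(m["nsec"]))
            counts[key] += 1
    return {key: n for key, n in counts.items() if n > 1}


# The four extts lines from the excerpt above, copied verbatim.
SAMPLE = [
    "Nov 7 20:09:45 ricotta ts2phc: [1911220.240] eth0 extts index 0 at 1667826586.175084690 corr 0 src 1667826622.93314787 diff -35824915310",
    "Nov 7 20:09:45 ricotta ts2phc: [1911220.492] eth0 extts index 0 at 1667826586.175084690 corr 0 src 1667826622.345345651 diff -35824915310",
    "Nov 7 20:09:45 ricotta ts2phc: [1911220.744] eth0 extts index 0 at 1667826586.175084690 corr 0 src 1667826623.597359645 diff -36824915310",
    "Nov 7 20:09:45 ricotta ts2phc: [1911220.996] eth0 extts index 0 at 1667826586.175084690 corr 0 src 1667826623.849334398 diff -36824915310",
]

if __name__ == "__main__":
    for (iface, index, sec, nsec), n in repeated_extts(SAMPLE).items():
        print(f"{iface} extts index {index}: {sec}.{nsec:09d} reported {n} times")
```

Run against the four lines above, this reports that the single timestamp 1667826586.175084690 was delivered 4 times, which fits the idea that one stale timestamp is being re-reported rather than four distinct pulses being captured.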
My hypothesis is that timestamps that were explicitly requested in order to read the hardware clock are being confused with extts timestamps. The driver schedules work 4 times a second to check whether there is an extts timestamp to report. It finds one every time, because there are timestamps that were explicitly requested but were left over because they didn't arrive in time (because there was no carrier).
It should be possible to use bpftrace to verify this hypothesis.
This is exacerbated by the fact that the default value for step_threshold is 0, which means that ts2phc won't step the clock to correct the bad time. We can alleviate this by having something like step_threshold 0.9, which allows ts2phc to recover quickly if something goes badly wrong.
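As a rough sketch of that workaround, the option could go in the configuration file passed to ts2phc with -f. This assumes it is set in the [global] section like other linuxptp options. The value 0.9 is the one suggested above, not a tested recommendation.

```
[global]
# Step the clock when the offset exceeds 0.9 s instead of only slewing.
# The default of 0 means ts2phc never steps.
step_threshold 0.9
```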
If chrony with a PHC refclock and ts2phc are both running and the cable is unplugged for a time, the PHC will end up with the wrong time (many seconds out).
I have observed this with ts2phc -s generic. I am not sure if it happens with ts2phc -s nmea.
Attached is a log showing what happens.
ptp-unplug.log