Open dsseng opened 9 months ago
Also, what precision is generally to be expected from statime in its current state with various configurations? Will it be fine to prefer it to ptp4l with my accuracy requirements due to it being easier for me to understand and debug?
Maybe even when it errors out it's still quite good?
I tried to set timestamping to SoftwareAll, but it gets out of sync and crashes in less than a minute (being more stable and getting stabilized at ± when running with timestamp errors). Stats like this in defaults from master:
[08:46:07.7996690.273284912][statime::filters::kalman][INFO] Estimated offset -0.034686801503121804ns+-1174133.2582486677ns, freq 0.000014357311022746628+-99.99998660531863, delay 0.08377435839334342+-999998.6810881766
[08:46:07.9433162.212371826][statime::filters::kalman][INFO] Estimated offset 31125.072865740138ns+-33.30931256549028ns, freq -8.347503900820003+-0.01706983217142415, delay 2652.459423229703+-18.572105366843157
[08:46:08.1949384.2124938965][statime::filters::kalman][INFO] Estimated offset 35319.14693647293ns+-30.46172644472662ns, freq -12.722520424699248+-0.01637229275303176, delay 2622.005839475987+-17.6204027209198
[08:46:08.4462125.301361084][statime::filters::kalman][INFO] Estimated offset 37199.60930336006ns+-28.93209772278724ns, freq -15.406553620600468+-0.016081510426866886, delay 2614.8277874665637+-16.87171478440738
[08:46:08.6974227.428436279][statime::filters::kalman][INFO] Estimated offset 38456.04984441533ns+-27.79845833470226ns, freq -16.434162762233775+-0.015902475335899325, delay 2612.482328986073+-16.19980214007677
[08:46:08.9490563.869476318][statime::filters::kalman][INFO] Estimated offset 39277.94933217492ns+-26.956992945928928ns, freq -17.10663322403466+-0.01578921943725152, delay 2603.2635565054597+-15.59409186058029
[08:46:09.189874.17221069336][statime::filters::kalman][INFO] Estimated offset -0.04279906198298184ns+-1242381.6323622076ns, freq 0.000015946068474992853+-99.99998042108176, delay 0.10272796345380929+-999998.5560895153
[08:46:09.2006692.886352539][statime::filters::kalman][INFO] Estimated offset 39625.28804951445ns+-26.55766886997467ns, freq -17.606733314029366+-0.01576851525203414, delay 2604.129132829317+-15.087655523762544
[08:46:09.4514043.33114624][statime::filters::kalman][INFO] Estimated offset 40100.919945191185ns+-26.273022793179198ns, freq -17.675757158618243+-0.015746869519127113, delay 2601.302756383717+-14.626244054985387
[08:46:09.7031013.965606689][statime::filters::kalman][INFO] Estimated offset 39234.90804258418ns+-26.79883690400627ns, freq -18.381828499486762+-0.01591380572788276, delay 2608.5678909653243+-14.315725044408744
[08:46:09.9547462.463378906][statime::filters::kalman][INFO] Estimated offset 38936.73149882662ns+-27.251524435156938ns, freq -17.751206751849146+-0.016010494644123612, delay 2606.1132558251984+-14.020195374273246
[08:46:10.2053232.192993164][statime::filters::kalman][INFO] Estimated offset 39081.18349694967ns+-27.63778482919722ns, freq -17.445228267971217+-0.01605757128135808, delay 2603.514753704348+-13.74229023899817
[08:46:10.4563698.768615723][statime::filters::kalman][INFO] Estimated offset 39482.07682663397ns+-27.97752681301813ns, freq -17.409495251405186+-0.016074048210812774, delay 2601.6457526739846+-13.481496663727741
[08:46:10.7074570.655822754][statime::filters::kalman][INFO] Estimated offset 39800.574048241ns+-28.34220687328146ns, freq -17.634422797973908+-0.016090251232127005, delay 2604.8227520766286+-13.244131200912637
[08:46:10.8247096.538543701][statime::filters::kalman][INFO] Estimated offset -0.17059241293364785ns+-1357317.4056869515ns, freq 0.00001310906546093925+-99.99997079489602, delay 0.19305953167073178+-999998.4310910511
[08:46:10.9581725.597381592][statime::filters::kalman][INFO] Estimated offset 40179.555609666924ns+-28.63202707599666ns, freq -17.772212248102566+-0.01608230795984942, delay 2603.9807257492153+-13.018996402052307
[08:46:11.2097601.890563965][statime::filters::kalman][INFO] Estimated offset 40532.77912518289ns+-28.83482785676733ns, freq -17.96395841695188+-0.01605347494267495, delay 2604.1203170740555+-12.802422446991011
[08:46:11.4605128.765106201][statime::filters::kalman][INFO] Estimated offset 40910.36529791917ns+-28.97642126822825ns, freq -18.13788138081851+-0.016017718214112867, delay 2604.7474441976096+-12.596418339402101
[08:46:11.6728110.313415527][statime::filters::kalman][INFO] Estimated offset -0.285371224893281ns+-1416044.0008864957ns, freq 0.00007594711735546448+-99.99995865419264, delay 0.28630845293865964+-999998.3060927796
After some minutes it'll fail like that (overadjusting the frequency by more than 1e9 ppm)
Hmm, since it is a kernel bug, you should perhaps just try configuring statime without hardware timestamping. In that scenario only the system clock will be synchronized, not the PHC, but that should usually be sufficient, especially if you only need +-0.5ms. Software timestamping should also be plenty for that level of precision. The easiest way to enable this is to remove the hardware-clock line from your configuration.
Not sure what you did with the SoftwareAll experiment, but if you modified the code to pass that timestamping mode to the network layer, now a different clock is used for timestamps vs adjustment and there is a large number of ways that could go sideways.
I am going to close the issue here as there seems to be no underlying bug in statime at the moment.
My suggestion was to sync the PHC, but use software timestamps. I don't want to sync the system clock because network clocks are frequently out of sync with real astronomical time. Some of the devices we use reset to 1970 and start counting from epoch zero each time they reset. I only want that time on the PHC I use for clocking audio streams.
Another option could be to determine that offset automatically and sync the system clock with an offset. Let me illustrate: system time is 2024-01-20 14:00:00, PTP GM clock is 1970-01-20 14:00:00. When starting (well, if you approve the feature I'll of course put it behind a CLI flag) we determine the offset and start driving the system clock with an offset, which could be somehow announced to the time consumers. This way we can sync to a clock that doesn't have the correct time, without a PHC, and still have browsers working.
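A minimal sketch of the offset arithmetic in that example, in Python. The helper name `clock_offset_seconds` is hypothetical and this is not how statime works today; it ignores leap seconds and the TAI-UTC difference, and only shows how a fixed offset determined at startup would map grandmaster time onto system time.

```python
from datetime import datetime, timezone


def clock_offset_seconds(system_time: datetime, gm_time: datetime) -> float:
    """Offset to add to a grandmaster timestamp to obtain system time.

    Hypothetical helper for illustration only.
    """
    return (system_time - gm_time).total_seconds()


# The two clocks from the example, sampled at the same instant.
system_now = datetime(2024, 1, 20, 14, 0, 0, tzinfo=timezone.utc)
gm_now = datetime(1970, 1, 20, 14, 0, 0, tzinfo=timezone.utc)

offset = clock_offset_seconds(system_now, gm_now)

# Five seconds later on the grandmaster: adding the offset gives the
# corresponding system time as a Unix timestamp.
gm_later = datetime(1970, 1, 20, 14, 0, 5, tzinfo=timezone.utc)
system_later = gm_later.timestamp() + offset
```

Time consumers that need the grandmaster's notion of time would subtract the same announced offset from the system clock.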
@dsseng your use case looks similar to mine: #389
A general solution I can think of is to make clock synchronization configurable - add a configuration flag whether we want system TAI clock to be synchronized to the network. It could have these values:
on - synchronization working as it is now
off - clock_task would not be run. Packets' software timestamps would need to be shifted and scaled to account for TAI-NIC difference.
virtual - synchronize a virtual clock, implemented in Statime, phase locked to TAI, essentially emulating a /dev/ptp device for systems that don't have one. Timestamps from packets would need to be shifted and scaled to match that virtual clock.
Apologies, I misunderstood at least part of your issue. Software timestamps for PHC clocks are really not a good option, as the kernel doesn't support this, it only does software timestamping with the system clock.
We could fall back to userspace timestamping, but that can be problematic depending on hardware, as taking timestamps from a PHC can take a significant amount of time (~ms is not unheard of) which combined with the user space jitter would pretty much destroy any precision.
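To get a feel for how expensive a userspace PHC read is on a given machine, something like the following Python sketch could be used. It relies on the kernel's dynamic POSIX clock convention (the `FD_TO_CLOCKID` macro in C); the function names and the `/dev/ptp0` default path are assumptions made for this example. The numbers include Python interpreter overhead, so treat them as an upper bound rather than an exact figure.

```python
import os
import time

CLOCKFD = 3  # constant used by the kernel's dynamic POSIX clock convention


def fd_to_clockid(fd: int) -> int:
    """Python equivalent of the C FD_TO_CLOCKID macro."""
    return (~fd << 3) | CLOCKFD


def read_latency_ns(clock_id: int, samples: int = 1000):
    """Return (median, worst) duration in ns of one clock_gettime_ns call."""
    costs = []
    for _ in range(samples):
        start = time.monotonic_ns()
        time.clock_gettime_ns(clock_id)
        costs.append(time.monotonic_ns() - start)
    costs.sort()
    return costs[len(costs) // 2], costs[-1]


def phc_read_latency_ns(path: str = "/dev/ptp0", samples: int = 1000):
    """Measure read latency of a PHC device; the path is an assumed default."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return read_latency_ns(fd_to_clockid(fd), samples)
    finally:
        os.close(fd)
```

Calling `phc_read_latency_ns()` on the target board and comparing the result with `read_latency_ns(time.CLOCK_MONOTONIC)` gives a rough idea of how far the PHC read cost is from a system clock read.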
I think allowing disabling of the internal synchronization mechanisms could be a useful feature, although that should come with a warning in the documentation that at that point the user is responsible for ensuring everything uses the same clock.
Virtual clocks are a nice idea in theory, but currently just not supported by linux in a fashion suitable for something like statime. Hardware clock virtualization is a thing, but that still has all the problems of hardware clocks and those being quite "far away" in terms of latency. If a similar mechanism were to be introduced for the system clocks that could be interesting, but right now there is just no good way of getting the time from the statime process to user processes. As also said in #389 this problem of course disappears when using statime as a library, since then this can be done internally in the using process.
I'm reopening this to keep the discussion more visible and to keep the proposal to allow disabling the clock_task mechanism visible in the issue tracker.
I see. Because SW is essentially "timestamped by kernel mode driver" we can't (yet) have it synced to arbitrary value.
Maybe with a bit of help from the kernel side (well, I already asked for some of that to allow me to expose PHCs to unprivileged processes) we could create kernel-virtualized clocks, enabling a separate clock for software timestamping that is freely adjustable by the owner/creator process. We could propose a UAPI consisting of new SO_TIMESTAMPING options to select such a custom clock for timestamping, plus something to create/query/destroy virtual clocks from userspace. These should behave similarly to PHCs and use frequency adjustments over a stable CLOCK_MONOTONIC_RAW or something like this.
off - clock_task would not be run. Packets' software timestamps would need to be shifted and scaled to account for TAI-NIC difference.
This seems hard to implement because we can't (or can we?) influence how the kernel timestamps the packet: it uses CLOCK_REALTIME (perhaps) as the timestamp value before sending the socket buffer out to the NIC. Would be happy to be wrong on this one however, as this would open up a possibility for userspace-only vclocks which could be accessible using shared memory or sth like this with high enough precision.
@teowoz well, I'm quite close to what you do since I work on the AES67 stack in PipeWire, mainly to interact with Dante and RAVENNA devices. Those use PTPv2 however. Also I'd strongly prefer not to have a PTP implementation (requiring CAP_NET_BIND_SERVICE) in the PipeWire process. Not sure what privilege model you use.
Would be happy to be wrong on this one however, as this would open up a possibility for userspace-only vclocks which could be accessible using shared memory or sth like this with high enough precision.
That's what I'm thinking about.
Kernel is timestamping packets using CLOCK_REALTIME.
We can shift and scale the timestamps to make them correspond to our virtual clock.
Virtual clock state could be shared with other processes using SHM. It would consist of at least 2 values as in linear equation: y = ax + b, where: y = timestamp of our virtual clock, x = current CLOCK_TAI, a, b - factors computed by clock synchronization daemon. Or it could be a higher order polynomial for better smoothing. This way, PipeWire side running without elevated privileges, could have read access to our virtual clock. But that's off topic and belongs to #389 more.
A kernel-level virtual clock, exposing the /dev/ptp API and making it possible to share a clock between userspace processes, could probably be done by writing a kernel module.
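The linear virtual clock described above (y = ax + b) can be sketched in Python as follows. The state layout (two little-endian doubles) and the function names are assumptions made for this example, and a `bytearray` stands in for the shared memory segment; the `buf` attribute of `multiprocessing.shared_memory.SharedMemory` could be passed in the same way.

```python
import struct

# Assumed layout of the shared state: two little-endian doubles (a, b).
STATE = struct.Struct("<dd")


def publish(buf, a: float, b: float) -> None:
    """Daemon side: store the factors computed by clock synchronization."""
    STATE.pack_into(buf, 0, a, b)


def to_virtual(buf, x: float) -> float:
    """Consumer side: map a CLOCK_TAI reading x (seconds) to virtual time y."""
    a, b = STATE.unpack_from(buf, 0)
    return a * x + b


# Stand-in for a shared memory segment that the daemon writes to and
# unprivileged consumers map read-only.
buf = bytearray(STATE.size)

# Unity rate and a constant shift, so the virtual clock counts from 1970.
publish(buf, 1.0, -1704067200.0)

# On Linux, x would come from time.clock_gettime(time.CLOCK_TAI);
# a fixed value is used here to keep the example deterministic.
y = to_virtual(buf, 1705759200.0)
```

Two caveats with this sketch: a double holding seconds since the epoch only resolves a few hundred nanoseconds, and a reader could observe a half-updated (a, b) pair, so a real implementation would likely use integer nanoseconds relative to a reference point plus a sequence counter around updates.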
When I enable PTP hardware clock in config on RPi 5, I get
[07:43:11.7057039.737701416][statime][ERROR] Missing recv timestamp
(due to https://github.com/raspberrypi/linux/issues/5904). However I can see the PHC frequency gets adjusted.
How can I adjust PHC (/dev/ptp0) while only using software timestamps as they work? Or may I just ignore those errors and it'll still work? I need precision around 0.5-1 ms.
Thanks!