DRIFTsync / driftsync

DRIFTsync - A lightweight, dependency-free time synchronization library with implementations in various languages and frameworks.
https://driftsync.org
MIT License

Timestamp differs on localhost and remote client #2

Open leavittx opened 2 years ago

leavittx commented 2 years ago

Running the driftsync server locally, with two driftsync C# clients: one on the same machine as the server and another on a remote LAN machine. I'm seeing a noticeable difference in the global timestamp value between the two for some reason - about 40-50ms at least.

Please see this video: https://www.dropbox.com/s/z5578xmi9nyzabz/DriftSync_LocalAndRemote.MOV?dl=0 (the left one is the server)

I'm using the SECONDS scale. The time being sent from the server starts from 0. I'm using simple code like `packet.remote = localTime() - start_time;`, where `uint64_t start_time = localTime();` is set before the server loop starts. Do you have any ideas why that's happening?

mmlr commented 2 years ago

Hi there

I'm not entirely sure I understand what you mean by your code sample and "time being sent from the server". Usually you would connect multiple DRIFTsync clients to the same DRIFTsync server and then use sync.globalTime() on all of them and that global time should be the same across all of the clients. The localTime is really the local clock and is not kept in sync. You can use the localTime as a reasonably high resolution clock source for relative timing, or it can be used to calculate offsets between the globalTime and the localTime where this is useful. Or do you simply mean you adjusted the server code to start from 0 by initially caching the localTime and subtracting it on each packet?
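To illustrate the intended relationship in Python terms (a minimal sketch; `FakeSyncedClient` is a hypothetical stand-in for a real DRIFTsync client, with the result of the sync process simulated as a fixed offset):

```python
import time

class FakeSyncedClient:
    """Hypothetical stand-in for a DRIFTsync client: localTime() is the
    raw local clock (not kept in sync), globalTime() is localTime()
    corrected by the offset the sync protocol has converged on
    (simulated here as a constant)."""

    def __init__(self, local_clock_skew_us, learned_offset_us):
        self.skew = local_clock_skew_us    # how far this machine's clock is off
        self.offset = learned_offset_us    # what sync converged on (ideally -skew)

    def localTime(self):
        # raw local clock in microseconds
        return int(time.monotonic() * 1_000_000) + self.skew

    def globalTime(self):
        # global time = local clock plus the learned correction
        return self.localTime() + self.offset

# two machines whose local clocks disagree by 40 ms
a = FakeSyncedClient(local_clock_skew_us=0, learned_offset_us=0)
b = FakeSyncedClient(local_clock_skew_us=40_000, learned_offset_us=-40_000)

# localTime differs by ~40 ms, globalTime agrees (within sampling jitter)
print(abs(a.globalTime() - b.globalTime()) < 5_000)   # True
```

The point of the sketch is only that globalTime is derived per client from its own local clock plus a correction, rather than being a value shipped from the server.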

In case your visualization indeed shows the global time (i.e. from sync.globalTime()), then it should definitely be closer. Can you check what the median round trip time is for both clients? Is it unexpectedly high, possibly leading to less accuracy? In a wired LAN I'd expect it to be around 1ms.

Something to watch out for: The display output path of devices can be quite different and therefore cause different display latency. This can happen due to different configuration, i.e. only one side using double buffering or the like, or it can happen due to the display hardware. If it is indeed the output path, it is usually a fixed and constant difference that you would correct for with a simple offset configured per device. 40-50ms is somewhat high to be caused by this however.

Audio can be a more accurate means of checking for sync, as audio latency is often known and can therefore be accounted for. It can be measured by producing beeps from multiple devices, recording them and then inspecting the recording closely.

leavittx commented 2 years ago

Hi there, thanks for your answer!

> Or do you simply mean you adjusted the server code to start from 0 by initially caching the localTime and subtracting it on each packet?

Exactly! I did a little modification on the code, as I tend to like smaller values for debugging.

> In case your visualization indeed shows the global time (i.e. from sync.globalTime()), then it should definitely be closer.

That's what is being displayed.

> Can you check what the median round trip time is for both clients? Is it unexpectedly high, possibly leading to less accuracy? In a wired LAN I'd expect it to be around 1ms.

Sure. Measured on the client running on the same machine as the driftsync server, it's sometimes 0 (second video).

Please take a look at the video below. I've added more stats / debugging info. https://www.dropbox.com/s/6eykbpbka7g61q8/driftsync_upd5000.MOV?dl=0

There's roughly a ~15ms difference in the video above, I think. But now the remote client is ahead.

Also tried a smaller update interval: https://www.dropbox.com/s/hd8i4631zsprk63/driftsync_upd50.MOV?dl=0

Running two clients on the same machine as the driftsync server, if that makes sense: https://www.dropbox.com/s/twybrl2ylcz5yho/2clients_local.MOV?dl=0 Interestingly, we don't get perfect sync even in that case.

> Something to watch out for: The display output path of devices can be quite different and therefore cause different display latency. This can happen due to different configuration, i.e. only one side using double buffering or the like, or it can happen due to the display hardware. If it is indeed the output path, it is usually a fixed and constant difference that you would correct for with a simple offset configured per device. 40-50ms is somewhat high to be caused by this however.

The difference in globalTime varies from one test run to another in my case (most likely independently of the update interval). So it's probably not related to display latency (or at least not to its varying part).

Info in smaller font is related to my own simpler syncing mechanism, which is currently disabled.

leavittx commented 2 years ago

Hi! Some more results. Same as before, the left computer is running the driftsync server and also a client instance.

  1. Showing globalTime with the python client: https://www.dropbox.com/s/q5pevrwl6svdv7m/driftsync_python.MOV?dl=0 The difference varies from 5-30 ms over time. There is no constant difference that could be caused by a presentation time difference.

  2. Running the C# client on the left computer, and C# + C clients (under WSL) on the right one. The medianRoundTripTime on the right is 1-2ms (not visible in the video): https://www.dropbox.com/s/b96ew33mg4fzr37/driftsync_unity_vs_c_2_machines.MOV?dl=0 The difference is also not constant, but stays more or less in the same range. Right WSL C client: +10±5ms relative to the left C# client. Right C# client: -20±10ms relative to the left C# client.

  3. Running the WSL C client vs. the C# client on the left computer showed that the WSL C client lags significantly behind for some reason: https://www.dropbox.com/s/uagynb8m4qkglwr/driftsync_unity_vs_c.MOV?dl=0 But that's not a real problem - the most interesting cases are 1-2, and the difference is indeed changing over time.

Note: I've managed to compile the server without the WSL subsystem by just replacing Unix sockets with WinSock, so it's running natively on Windows.

leavittx commented 2 years ago

Just for a sanity check, I've tested my current syncing implementation on the same setup today (bottom - server, top - client): https://www.dropbox.com/s/v944k3tbh83ldzx/UDP.MOV?dl=0 It's just a UDP server which sends the current timestamp 120 times a second to all clients, but it looks quite precise. It's not continuous, so it's difficult to use for smooth playback (more frequent updates lead to jittering anyway); that's why I use that value only for correcting the playback position in case of significant differences between the local and remote clocks on the client.
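For reference, the core of that approach looks roughly like this (a hypothetical sketch, not my actual code; the packet layout and names are made up):

```python
import socket  # used by the server loop sketched in the comment below
import struct
import time

def make_packet(start_time_us):
    """Build a packet carrying the server's current timestamp,
    relative to start_time_us, as a little-endian uint64."""
    now_us = int(time.monotonic() * 1_000_000)
    return struct.pack("<Q", now_us - start_time_us)

def parse_packet(data):
    """Client side: extract the server timestamp from a packet."""
    (timestamp_us,) = struct.unpack("<Q", data)
    return timestamp_us

# The server loop would look roughly like (not run here):
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   start = int(time.monotonic() * 1_000_000)
#   while True:
#       for client in clients:
#           sock.sendto(make_packet(start), client)
#       time.sleep(1 / 120)   # 120 updates per second
```

The client simply overwrites its playback position with the received value whenever it drifts too far from the local clock.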

So I guess I'm missing something with driftsync. Any tips?

mmlr commented 2 years ago

Hi there

Thank you very much for the many test cases, this is indeed very interesting.

Unfortunately the video compression makes the numbers somewhat difficult to read in the Unity test cases, so it is harder to compare the numbers over a period of time. The python video is very crisp though, and I've stepped through the video and written down the numbers. The resulting graph is interesting and shows that the local clock apparently runs differently on the two machines:

[image: graph of the globalTime values sampled from both machines]

The left machine always does roughly steps of 16 ms, but every ~14 steps it jumps by about 31 ms. The right machine on the other hand updates in steps of around 18 ms and then drops to one step of 9 ms every ~9 steps. The time period of this sampling is small enough that the actual sync process should not be involved at all (unless you reduced the interval), which lets me conclude that the error has to come from the accuracy of the localTime value.
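The step analysis can be reproduced with a few lines (a sketch with made-up sample values shaped like the observed pattern; values in microseconds):

```python
def step_sizes(samples_us):
    """Differences between consecutive displayed timestamps, used to
    spot the update cadence of each machine."""
    return [b - a for a, b in zip(samples_us, samples_us[1:])]

# made-up samples mimicking the left machine: ~16 ms steps with an
# occasional ~31 ms jump
left = [0, 16_000, 32_000, 48_000, 79_000, 95_000]
print(step_sizes(left))   # [16000, 16000, 16000, 31000, 16000]
```

Plotting these step sizes per machine makes the two distinct cadences (and their periodic spikes) obvious.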

I can't really explain why the two machines apparently run at different internal intervals, but this might be inherent to the clock source being used. The accuracy of sleeping under Windows is generally relatively bad however (i.e. the wakeups happen at irregular intervals), and therefore what's visible here might just be an effect of the sleep loop in the Python client not waking up as regularly as desired. It might also be related to power save features; can you check if both machines are set to the same power profile, and ideally set both to performance?

The good news is that the two machines stay within bounds that would be somewhat expected given the above and average at an offset of 5 ms. So at the very least, the sync itself is within a frame.

I have no idea why the WSL C client is off by 1.1s, that doesn't make sense to me at all. Is this offset constant or does it vary wildly?

The last video you uploaded with your own reference seems to be gone, but I'd be interested in seeing this.

As an additional test, can you try an external reference server? There is a public server hosted at driftsync.org that you can work with. This should take out the difference from one machine using a local server. The public server also runs under Linux, which may or may not be relevant. If the local clock source under Windows indeed jitters significantly, using it as a server may be problematic. Generally the low update frequency and length of integration time should make this factor insignificant however.

mmlr commented 2 years ago

Another thing that just came to mind: The Unity rendering is pipelined by default and the various frame *Update hooks will sometimes be called with jitter resulting from that. At low refresh rates this may introduce quite a bit of variation. Also, the clock reference for localTime that DRIFTsync uses is not tied into the render timing of Unity and therefore isn't corrected for some of that variation. This blog post is quite revealing of what is going on under the hood and how this may interact here:

https://blog.unity.com/technology/fixing-time-deltatime-in-unity-2020-2-for-smoother-gameplay-what-did-it-take

Maybe you can try lowering QualitySettings.maxQueuedFrames to 1 so that pipelining is effectively disabled and then see if this improves things? Also try using VSync in that combination, as it should further stabilize the timing of the hooks being called.

Which of the *Update hooks do you currently use for updating the timestamp?

leavittx commented 2 years ago

Hi!

Wow, nice analysis! I don't fully understand it yet, but I will try to read it again while I do more tests. I tend to believe that the clock under Windows may not be as precise as under Linux. However, it's "drifting" a little too much in my opinion, even for a less-than-ideal clock implementation :P

> Unfortunately the video compression makes the numbers somewhat difficult to read

Those are most likely Dropbox-introduced artifacts. I'll post direct download links for the videos; they will have better quality than the Dropbox player.

> The last video you uploaded with your own reference seems to be gone, but I'd be interested in seeing this.

Sorry for providing an invalid link. Here is a new one: https://www.dropbox.com/s/bu5g2nb9lctiyjz/UDP.mov?dl=1

> can you check if both machines are set to the same power profile, and ideally set both to performance?

I got my second desktop computer back from service yesterday, so I can run 2 remote clients now. Both are running on the high performance power profile.

  1. Connected to a Windows-based LAN driftsync server: https://www.dropbox.com/s/es27cdobsdkpugc/driftsync_2remotes_win_srv.MOV?dl=1 Still poor syncing, even when both clients are remote.
  2. Public driftsync.org server: https://www.dropbox.com/s/loat6sx71k2rb8z/driftsync_2remotes_public_srv.MOV?dl=1 I would call that somewhat better syncing, although it's still far from ideal.

> I have no idea why the WSL C client is off by 1.1s, that doesn't make sense to me at all. Is this offset constant or does it vary wildly?

The offset varies in approximately the 0.5-1.2s range, which is kinda crazy to me. It should be some WSL-specific thing :\

I will see if I can run some tests using Linux in the next few days, just to validate that I get perfect sync in the same environment driftsync was developed on (with driftsync.org, probably), if that makes sense (I assume nobody has tried it on Windows before).

Let me know if you see anything important in the results I've posted above!

leavittx commented 2 years ago

Hi there,

Posting test results from a fresh Ubuntu install below. Please take a look. I'm connecting to driftsync.org using the default update interval.

  1. Python client [code] https://www.dropbox.com/s/5a9bmgjbi27418i/driftsync_ubuntu_python.MOV?dl=1
  2. C client [code] https://www.dropbox.com/s/p20iabnkpj25r5i/driftsync_ubuntu_c.MOV?dl=1

I didn't see your second comment on the update hooks 🤦‍♂️

> Which of the *Update hooks do you currently use for updating the timestamp?

If I'm reading the question correctly: currently I'm using the Update() method (hook) to get sync.globalTime and update the video frame index accordingly. Then in the OnGUI() method (hook), on the "same Unity frame", I'm getting it again to draw it using Unity IMGUI (that's the value displayed in green). I have VSync enabled, and I'm going to try disabling pipelining tomorrow.

Cheers

mmlr commented 2 years ago

Hi

I've investigated this some more and have come to an interesting conclusion: it is probably indeed a drift on the display side. I can reproduce the apparent issue in a similar way. It does not stem from an actual synchronization issue however. To verify this, I've written a small tool that takes in timestamps from arbitrary sources and compares them:

https://gist.github.com/mmlr/0610f53dc51b5c9f3503b9f63327e234

It can be used to run a local and a remote DRIFTsync client in stream mode and verify the sync quality (it can also be fed from multiple local clients, for example in different implementations, or fed data simply from a UDP socket via netcat). It corrects away the difference in arrival time of timestamps from all its sources, but otherwise does nothing other than compare them. Obviously this only works when the network latency is low and stable.
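The core idea of that comparison can be sketched as follows (this is not the gist's actual code; it assumes each source delivers (arrival time, reported timestamp) pairs):

```python
def compare_sources(samples_a, samples_b):
    """Compare timestamps from two sources after correcting for the
    difference in local arrival time. Each sample is a tuple of
    (arrival_time_us, reported_timestamp_us). Returns the apparent
    offset between the two sources for each pair of samples."""
    offsets = []
    for (arrive_a, ts_a), (arrive_b, ts_b) in zip(samples_a, samples_b):
        # project source B's timestamp to source A's arrival instant,
        # then take the difference
        corrected_b = ts_b + (arrive_a - arrive_b)
        offsets.append(ts_a - corrected_b)
    return offsets

# two perfectly synced sources, sampled 3 ms apart, still compare to 0
a = [(1_000_000, 500_000)]
b = [(1_003_000, 503_000)]
print(compare_sources(a, b))   # [0]
```

With low, stable network latency the remaining offsets directly reflect the sync quality rather than the sampling times.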

On my network, this gives results in the low single-digit milliseconds, most of that certainly stemming from spurious network latency jitter. This is stable across various sources: multiple local clients, local vs. remote machine via SSH or UDP, even a local client vs. one running on an Android phone.

The display of the same data across two machines however exhibits a similar pattern to what we saw earlier. Looking at the boundaries of the jitter pattern, it is indeed bounded by one frame time and follows the two fundamental frequencies and their spikes.

I also have a theory as for why you don't see it in your reference video with sending a timestamp out via UDP. This is based on the assumption that you are sending out a single timestamp over the course of a frame, i.e. you send it multiple times but always with the same value, and that value only changes once per frame of the server. If this is indeed how it works, it is easy to explain why you see the same value on both sides: there is only one value to begin with, the one from the server. This is in contrast to DRIFTsync, where the timestamp is generated locally (albeit aligned with the global reference), so there are indeed two independent values.

DRIFTsync simply shows the actual difference between the two sides while the UDP timestamp doesn't. In the UDP case, one can still see this difference in the video, but it's shown by when the value becomes visible on each display instead of by the value itself. It can clearly be seen that the values don't actually change at the same time on both displays, for example in the case of the value 0,133 seen 3.17 seconds into the video, where the client actually shows 0,013 first for a couple of video frames until the server catches up. After that they appear somewhat in sync for a couple of values, but then the 0,300 is again slightly quicker to complete on the client, for example. Even at 4x or 8x slowdown this variation is tiny, so it's not as obvious as actually seeing different values.

As for what this actually means: There is a difference in resolution between the two methods. If you were to reduce the DRIFTsync resolution, i.e. by taking the values modulo your frametime, the values should start to match up again. Conversely, if you started to send sub-frame timestamps out via UDP (for example by using the tick counter as a source), you should see the same difference in values showing up between the displays.
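The resolution argument can be illustrated numerically (a sketch; one reading of the modulo suggestion is quantizing both values to a frame index, here at ~60 Hz, values in microseconds):

```python
FRAME_US = 16_667  # one frame at ~60 Hz, in microseconds

# Two clients whose global time agrees to within a few ms, sampled at
# slightly different instants within the same frame:
client_a_global = 5_001_000          # 5.001 s
client_b_global = 5_005_000          # 5.005 s, 4 ms apart

# At full resolution the displayed values differ ...
print(client_a_global != client_b_global)                          # True

# ... but reduced to frame resolution they fall into the same frame:
print(client_a_global // FRAME_US == client_b_global // FRAME_US)  # True
```

In other words, sub-frame differences are always present; whether they are visible depends on the resolution of the value you display.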

In general, the time synchronization is actually the easy part, the hard part being applying this timestamp consistently to something frame based. That is also the reason why DRIFTsync generally operates "slowly" by integrating differences over a larger window of time to make the small variances insignificant. It is also the reason why it provides the suggestPlaybackRate API and not a frame based one. Aligning the initial timestamp and then keeping the playback rates the same between the local and global clock is simpler and more accurate than to try to keep them in sync frame by frame. But it does mean that your clients won't necessarily drop/repeat the same frames to stay in sync.
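The rate-based approach can be sketched like this (a hypothetical illustration of the concept; the actual `suggestPlaybackRate` API may differ):

```python
def corrected_position(started_global_us, now_global_us, rate):
    """Playback position derived from global time: align the start
    once, then advance at the (slightly adjusted) playback rate
    instead of jumping the position on every sync update."""
    return (now_global_us - started_global_us) * rate

# playback aligned at global time 10 s, nominal rate:
start = 10_000_000
# after 2 s of global time at rate 1.0 the position is exactly 2 s
print(corrected_position(start, 12_000_000, 1.0))    # 2000000.0

# a client drifting slightly behind would get a rate just above 1.0
# (e.g. from something like suggestPlaybackRate) and catch up smoothly:
print(corrected_position(start, 12_000_000, 1.001))  # ~2002000
```

The design choice is to trade instant frame-exact alignment for smooth playback: clients converge over time rather than snapping, which also means they won't necessarily drop or repeat the same frames.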

leavittx commented 2 years ago

Hi!

> I also have a theory as for why you don't see it in your reference video with sending a timestamp out via UDP. This is based on the assumption, that you are sending out a single timestamp over the course of a frame, i.e. you send it multiple times but always with the same value and that value only changes once every frame of the server.

That's right! I also think I can understand the analysis you've done and the tool does the verification in a correct way, unlike my test setup. I'll give it a try to see the actual network jitter in my case. Thanks for all the work! Really appreciate it!

Just to finish the "incorrect" tests and move forward, I'm posting a couple more videos. After reading the article you posted I've switched to Unity 2020, as it seems to have some fixes for the main game thread and deltaTime. The video being played is now a 60fps one instead of the old 30fps.

  1. Max Queued Frames = 1 https://1drv.ms/v/s!AhRsdPdtJl1uhpYVKML6tZho1p8z3A?e=gTh4zj
  2. Max Queued Frames = 2 https://1drv.ms/v/s!AhRsdPdtJl1uhpYWp6UsVynwkFhDSA?e=xfygwP

I don't see either one being better than the other here. But it's nice to have the latency control, and it might be useful in the future when using the suggestPlaybackRate approach. Btw: even the frame-based sync looks really good here.