A better synced playback example

leavittx commented 2 years ago

Hi there! The example provided (C#) isn't making much sense to me.

double globalTime = sync.globalTime();
double playbackRate = sync.suggestPlaybackRate(globalTime, 0);

although according to the docs, the parameters are suggestPlaybackRate(globalStartTime, playbackPosition). So, in a situation where multiple clients start playback at different points in time, but all want to be at the same position, what would be the usage scenario?

If I do

double suggestedPlaybackRate = sync.suggestPlaybackRate(globalStartTime,  player.position);

where globalStartTime is obtained via sync.globalTime() before the player has started, it's still not synced between clients.

I think using globalStartTime is a bit weird, because basically we have the globalTime, and just want each client's player to have the same position.

But, anyway. How is it supposed to be used?

mmlr commented 2 years ago

Hi there

The suggestPlaybackRate call in the example is indeed a no-op, partially to show that it exists and partially for coverage testing.

The globalStartTime needs to be coordinated between clients outside of DRIFTsync. Usually there already is a communication channel among clients to convey what should be played by whom. To get synced playback, this communication channel is used to declare the media and at what global time it starts. Then, each client would initially wait for this global time and start playback (or seek close to where it should be if the time already passed) and then continuously use suggestPlaybackRate with the same global start time and the local playback position to adjust for drift while playing.
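As a rough illustration of the initial decision described above (wait for the start time, or seek if it has already passed), here is a minimal sketch in Python. `initial_action` is a hypothetical helper and not part of DRIFTsync; in a real client the `global_time` argument would come from the library's global time call, and the global start time from your own communication channel. It assumes a playback rate of 1.

```python
from typing import Tuple


def initial_action(global_time: float, global_start_time: float) -> Tuple[str, float]:
    """Decide what a client does once it knows the agreed global start time.

    Returns ("wait", seconds_until_start) if the start is still in the future,
    or ("seek", media_position) if playback should already be running.
    Hypothetical helper for illustration; assumes playback rate 1.
    """
    elapsed = global_time - global_start_time
    if elapsed < 0:
        return ("wait", -elapsed)
    return ("seek", elapsed)


# Start time 5s in the future: wait, which leaves room for preroll.
print(initial_action(95.0, 100.0))
# Start time already passed: seek close to where playback should be.
print(initial_action(250.0, 100.0))
```

After this initial step, the client would keep feeding the same global start time and its current playback position into suggestPlaybackRate while playing.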

It is usually a good idea to put the start time somewhat in the future so that each client can prepare any needed media decoding pipeline (aka "preroll"). This avoids delays at start of playback that then have to be corrected for with playback rate adjustments.

In HTML video, you would generally hook up the suggested playback rate to the playbackRate of the video element and use the timeupdate event to call suggestPlaybackRate. This ensures that the local playback position is indeed recent and accurate.

leavittx commented 2 years ago

Finally going in this direction. Let me ask for some advice:

For sending global start time to clients I need to keep track of globalTime in the server app (to send to clients for the startGlobalTime mainly but also to do the playback on the server). The easiest for me is to include the driftsync server in the Unity app itself, running in a separate thread. However, I see that the time value being sent from the server is 13-14ms behind globalTime. I guess that's how it's supposed to work. Should I really avoid relying on that fixed offset, and use an additional driftsync client inside the server app? I don't like that too much, and if possible would get the globalTime/suggestedPlaybackRate from the embedded driftsync server thread. 99% sure that the answer is to use another client, but just to double check.
Is embedding the driftsync server inside an app a bad idea?
Unfortunately I cannot guarantee that each client will have the preroll time (a client may crash, for example, and have to be restarted, or the machine can restart). In that case I guess I should still be able to use the same global start time on the restarted client and get things in sync. However, it could happen that the playback itself is non-linear: for example, first a video is played from the beginning for 3 seconds, then another 3 seconds are played at 2X playback rate, then a seek to the middle of the video happens, then the video is paused for 3 seconds, and then playback is resumed in the reverse direction (just an example to show the possibilities).

I've tried to change the clock in the driftsync server based on the playback state, and came to the conclusion that I shouldn't touch it and should let it run monotonically - otherwise the whole thing breaks, or I need to reset the driftsync client at least.

Given all that, can you give any tips on supporting variable playback rate, for example? For seeking I guess I can introduce some relative time offset variable that is broadcast to all clients and then "add" it to the globalTime value. Pause is just telling the client to stop and then resume with a negative offset value depending on how long the pause was. Reverse is low priority for now but needs to be done in the future as well.

mmlr commented 2 years ago

Hi

For sending global start time to clients I need to keep track of globalTime in the server app (to send to clients for the startGlobalTime mainly but also to do the playback on the server). The easiest for me is to include the driftsync server in the Unity app itself, running in a separate thread.

Embedding the server should be fine. Eventually the distinction between server and client will probably go away, because they are essentially the same, the server just being more compact. A pure standalone server is handy to have for minimal deployment overhead, but merging both functions allows for servers in all supported languages/frameworks.

However, I see that the time value being sent from the server is 13-14ms behind globalTime. I guess that's how it's supposed to work. Should I really avoid relying on that fixed offset, and use an additional driftsync client inside the server app? I don't like that too much, and if possible would get the globalTime/suggestedPlaybackRate from the embedded driftsync server thread. 99% sure that the answer is to use another client, but just to double check.

Using the client code is simpler, as it is already built for that purpose, but using the server should be fine as well. On the server side it simply means localTime == globalTime. There shouldn't be any offset between the server localTime and the globalTime, or rather there can't be, because they are the same. Are you sure that it isn't an issue with when the comparison happens, i.e. it being a frame late or similar?

Is embedding the driftsync server inside an app a bad idea?

No, it shouldn't matter at all where the server runs, as long as it has access to a reasonably accurate and high resolution clock and reasonable latency for receiving and sending UDP packets.

Note, however, that this can become a problem when there is large network latency, as that will put your clients at some offset relative to the server. Having the server central and having the same network latency to all clients will give the best results. When network latency is negligible (as it most often is in a wired LAN), i.e. around or below 1ms, it won't matter.

Unfortunately I cannot guarantee that each client will have the preroll time (a client may crash, for example, and have to be restarted, or the machine can restart). In that case I guess I should still be able to use the same global start time on the restarted client and get things in sync. However, it could happen that the playback itself is non-linear: for example, first a video is played from the beginning for 3 seconds, then another 3 seconds are played at 2X playback rate, then a seek to the middle of the video happens, then the video is paused for 3 seconds, and then playback is resumed in the reverse direction (just an example to show the possibilities).

The preroll time is generally useful when you want to ensure that things start (or change) at an exact moment. You preload everything so that all clients can start playback immediately instead of having to first load files from disk, start decoders and such. For a client that comes "late", that isn't really relevant as it will have missed that common start time anyway. In such cases it is still useful to load everything and seek to the approximate position plus some extra time and pause there, then unpause at the exact time. This avoids the client being initially late and having to catch up due to the overhead of playback start. So a general sequence could look like this:

Client starts and gets the playback information, for example: globalStartTime = 100s, globalTime = 250s, so playback should be at 150s into the media
Client loads media and seeks to desired position + some extra time and pauses there, for example it seeks to 155s instead of 150s
When playback is ready (at 152s for example due to loading overhead), schedule the unpause with the delta of 3s

In the example, this avoids starting out 2s late only to then have to increase playback speed for the following seconds to compensate. The extra time actually needed can be measured a couple of times and then set per client/application so that it most often fits without delaying playback start needlessly.
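The late-join sequence above can be sketched as follows. This is only an illustration in Python under stated assumptions: `plan_late_join` and `unpause_delay` are hypothetical helpers, the playback rate is assumed to be 1, and the seek margin is a value you would measure per client.

```python
def plan_late_join(global_time, global_start_time, seek_margin):
    """Return (seek_position, unpause_global_time) for a client joining late.

    The client seeks past the currently expected position by `seek_margin`
    seconds, pauses there, and unpauses when the global time catches up.
    Hypothetical helper; assumes playback rate 1.
    """
    expected_position = global_time - global_start_time
    seek_position = expected_position + seek_margin
    unpause_global_time = global_start_time + seek_position
    return seek_position, unpause_global_time


def unpause_delay(ready_global_time, unpause_global_time):
    """Seconds to wait once the paused media is ready (0 if already late)."""
    return max(0.0, unpause_global_time - ready_global_time)


# Numbers from the example: globalStartTime = 100s, globalTime = 250s, 5s margin.
seek_position, unpause_at = plan_late_join(250.0, 100.0, 5.0)
print(seek_position)  # 155.0
print(unpause_at)     # 255.0
# Ready at globalTime 252s, i.e. when playback should be at 152s:
print(unpause_delay(252.0, unpause_at))  # 3.0
```

If the media only becomes ready after the planned unpause time, the delay is clamped to zero and the remaining offset is left to the playback rate adjustment.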

I've tried to change the clock in the driftsync server based on the playback state, and came to the conclusion that I shouldn't touch it and should let it run monotonically - otherwise the whole thing breaks, or I need to reset the driftsync client at least.

Definitely don't mess with the globalTime. The global time and the playback time should be separated so that you can always reference any instructions you give to a client to the absolute global time. This also makes debugging a whole lot easier as you always have the same time scale.

Given all that, can you give any tips on supporting variable playback rate, for example? For seeking I guess I can introduce some relative time offset variable that is broadcast to all clients and then "add" it to the globalTime value. Pause is just telling the client to stop and then resume with a negative offset value depending on how long the pause was. Reverse is low priority for now but needs to be done in the future as well.

To do sequences as you described them above, I'd suggest taking exactly what you wrote in text form and translating it to code, i.e. make a descriptor that records all these steps and send that to the clients ahead of time. The simple globalStartTime that was previously sent to clients then becomes a more complex list of actions. With the globalTime you can time every event in the sequence exactly and ahead of time, which will make it a lot more accurate than trying to do it "live" at every turn.

Generally you could simply operate with a list of tuples in the form of when (the globalTime when this state shall become active), what (the media to be played), where (the position in the media) and how (the playback speed, possibly negative). Such events can then be sent to the clients along the way, always ensuring some comfortable delta between when the event shall happen and the current time so that the client has enough time to arm a timer or otherwise prepare what needs to be done (preloading another media, seeking).

An example sequence could then look like:

Send event at globalTime 95s: play media 1 at globalTime 100s from position 50s at playback rate 0.5
Client prepares media 1 and seeks to 50s so that it is ready to play straight away when globalTime 100 comes around
Send event at globalTime 115s: play media 1 at globalTime 120s from supposed position 60s (50s original + 20s at 0.5 rate) at playback rate 1
Client arms a timer to switch playback rate of the already playing media at exactly globalTime 120s
Send event at globalTime 125s: play media 2 at globalTime 130s from position 10s at playback rate -0.5
Client prepares media 2 and seeks to 10s so that it is ready to play at globalTime 130s
...
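The event tuples from the sequence above could be represented roughly like this. It is a sketch in Python, not a DRIFTsync data structure: `PlaybackEvent` and `expected_position` are hypothetical names, with fields named after the when/what/where/how description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybackEvent:
    when: float   # globalTime at which this state becomes active, in seconds
    what: str     # identifier of the media to play
    where: float  # media position at `when`, in seconds
    how: float    # playback rate, possibly negative


def expected_position(event: PlaybackEvent, global_time: float) -> float:
    """Media position the event implies at `global_time`, if perfectly in sync."""
    return event.where + (global_time - event.when) * event.how


sequence = [
    PlaybackEvent(when=100.0, what="media 1", where=50.0, how=0.5),
    PlaybackEvent(when=120.0, what="media 1", where=60.0, how=1.0),
    PlaybackEvent(when=130.0, what="media 2", where=10.0, how=-0.5),
]

# The sender can derive the "supposed position" of a follow-up event from the
# previous one: 50s + 20s at rate 0.5.
print(expected_position(sequence[0], 120.0))  # 60.0
```

Computing the follow-up position from the previous event, rather than from a live player, keeps every client working from the same numbers.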

If the sync works out, the calculated positions should always match when the corresponding event time comes around. Hence at any event you can calculate a new globalStartTime to reference for playback rate adjustments, and divide the media time played since the event by the playback rate conveyed in the event tuple to arrive at a playback position that can be compared against it. In the example sequence above:

globalStartTime = 100s (event time) - 50s (media position) = 50s, playback rate 0.5
globalStartTime = 120s (event time) - 60s (media position) = 60s, playback rate 1
globalStartTime = 130s (event time) - 10s (media position) = 120s, playback rate -0.5

Then in each of these segments you can use the suggestPlaybackRate API with the calculated globalStartTime and a playback position calculated from the start position and playback rate. Let's call it the virtual media position and calculate it as:

start media position + (current media position - start media position) / playback rate

Virtual media positions at some points in the example above would be calculated to:

at globalTime 110s:
50s media start position + (55s current position - 50s media start position) / 0.5 playback rate = 50s + 5s / 0.5 = virtual media position 60s
at globalTime 120s before next event:
50s media start position + (60s current position - 50s media start position) / 0.5 playback rate = 50s + 10s / 0.5 = virtual media position 70s
at globalTime 120s on next event:
60s media start position + (60s current position - 60s media start position) / 1 playback rate = 60s + 0s / 1 = virtual media position 60s
at globalTime 125s:
60s media start position + (65s current position - 60s media start position) / 1 playback rate = 60s + 5s / 1 = virtual media position 65s
at globalTime 140s for the next media after playing at -0.5 rate for a while:
10s media start position + (5s current position - 10s media start position) / -0.5 playback rate = 10s + -5s / -0.5 = virtual media position 20s
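The per-segment globalStartTime and the virtual media position can be written as two small functions. This is a sketch in Python; both function names are hypothetical, and a paused segment (rate 0) is rejected because the formula divides by the rate.

```python
def segment_global_start_time(event_time, media_start_position):
    """globalStartTime for one segment: event time minus media start position."""
    return event_time - media_start_position


def virtual_media_position(media_start_position, current_position, playback_rate):
    """Rescale the media time played in this segment to a rate of 1."""
    if playback_rate == 0:
        raise ValueError("a paused segment has no virtual media position")
    played = current_position - media_start_position
    return media_start_position + played / playback_rate


# First segment: event at globalTime 100s, media position 50s, rate 0.5.
start = segment_global_start_time(100.0, 50.0)
print(start)                                    # 50.0
print(virtual_media_position(50.0, 55.0, 0.5))  # 60.0
# When in sync, the virtual position equals globalTime - globalStartTime:
print(110.0 - start)                            # 60.0
```

That last equality is what makes the virtual position usable as the playback position argument: it advances one second per second of global time regardless of the segment's rate.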

So at any given point you can simply feed that into suggestPlaybackRate(calculated globalStartTime, virtual media position) and get a rate adjustment back. To use the suggested rate, you multiply it with your desired playback rate to arrive at an adjusted rate. So for example:

calculate a globalStartTime when every event starts, from the first example: globalStartTime = 100s - 50s = 50s
at globalTime 110s the media is supposed to be at position 55s, but it is actually at 54.75s due to some hiccup
regularly calculate the virtual media position: 50s + (54.75s - 50s) / 0.5 = 50s + 9.5s = 59.5s
feed into suggestPlaybackRate at global time 110s: suggestPlaybackRate(50s calculated global start time, 59.5s virtual media position) = 1 + (110s - 50s) - 59.5s = 1.5
adjust the suggested playback rate to the desired playback rate of 0.5: 1.5 * 0.5 = 0.75
apply the playback rate of 0.75
loop the above at a reasonable interval (for example at every frame)
the playback rate will eventually go back to 0.5 after having corrected for the lateness of 0.25s
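The steps above can be put together in one function. Note the assumption: `model_suggest_playback_rate` is a stand-in that only mirrors the arithmetic shown in the worked example (1 + expected position - actual position); it is not the library's implementation, and a real client would call the library's suggestPlaybackRate instead. `adjusted_rate` is likewise a hypothetical name.

```python
def model_suggest_playback_rate(global_time, global_start_time, playback_position):
    """Stand-in mirroring the worked example, NOT the real library call."""
    expected_position = global_time - global_start_time
    return 1.0 + expected_position - playback_position


def adjusted_rate(global_time, event_time, media_start, current_position, desired_rate):
    """Rate to apply to the player for one segment of the event sequence."""
    global_start_time = event_time - media_start
    virtual_position = media_start + (current_position - media_start) / desired_rate
    suggested = model_suggest_playback_rate(
        global_time, global_start_time, virtual_position
    )
    return suggested * desired_rate


# Event at globalTime 100s, media position 50s, desired rate 0.5.
# At globalTime 110s the media is at 54.75s instead of 55s.
print(adjusted_rate(110.0, 100.0, 50.0, 54.75, 0.5))  # 0.75
# Once the media has caught up, the desired rate comes back unchanged.
print(adjusted_rate(110.0, 100.0, 50.0, 55.0, 0.5))   # 0.5
```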

Note that suggestPlaybackRate is written so that it will correct an arbitrary offset over the course of one second. This can be too aggressive. Especially when the update interval of the playback rate is big, it will lead to swings above and below the target due to overcorrection. In such cases the suggested playback rate can be turned down by some factor:

1 + (suggested playback rate - 1) / some factor
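As a small sketch of that damping in Python (`damp_suggested_rate` is a hypothetical helper, and the factor of 2 below is only an example value to tune for your update interval):

```python
def damp_suggested_rate(suggested_rate, factor):
    """Scale the deviation of the suggested rate from 1 down by `factor`."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 + (suggested_rate - 1.0) / factor


# A suggestion of 1.5 damped by a factor of 2 becomes 1.25, so with a desired
# playback rate of 0.5 the applied rate is 0.625 instead of 0.75.
damped = damp_suggested_rate(1.5, 2.0)
print(damped)        # 1.25
print(damped * 0.5)  # 0.625
```

A factor of 1 leaves the suggestion unchanged; larger factors correct more slowly but reduce overshoot.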

leavittx commented 2 years ago

Thank you very much! I will see if I can implement the approach you've suggested and get back here.

There shouldn't be any offset between the server localTime and the globalTime, or rather there can't be, because they are the same. Are you sure that it isn't an issue with when the comparison happens, i.e. it being a frame late or similar?

I'm not entirely sure, though it's always a constant delay in my experience. I'm just a bit concerned that there is no "suggestPlaybackRate" for the server side; mixing playbackRate-based playback (on clients) with position-based playback (on the server) is probably a bad idea :-) Merging client/server would be a perfect thing to do for using it like that.

leavittx commented 2 years ago

Such events can then be sent to the clients along the way, always ensuring some comfortable delta between when the event shall happen and the current time so that the client has enough time to arm a timer or otherwise prepare what needs to be done (preloading another media, seeking).

Yeah, just saying that in the case of live media installations we unfortunately can't really rely on sending the seek time/play/pause events in advance. But I hope the approach above can be used anyway (given that seeking is quite a fast operation in my case).
