councilforeconed / interactive-activities

Council for Economic Education
http://interactives.councilforeconed.org
Mozilla Public License 2.0

Pizza production countdown clocks not in sync #141

Closed by cbujara 8 years ago

cbujara commented 9 years ago

When pizza production is played on multiple devices, the countdown clocks are not in sync. Some players' clocks start at less than 1'30", and the current round stops when the fastest countdown clock reaches zero. The lag does not appear to be related to the relative times at which each player clicks their start button. The problem is not observable when testing in multiple private windows on a single device.

jugglinmike commented 8 years ago

I spent some time trying to reproduce this bug, but I have yet to experience it.

This might seem kind of obvious, but I have to ask: are you certain the clients were connected to the same room? I'm wondering because it would be easy to confuse rooms; the name of the current room is only visible in the URL. If the different clients in your original report were actually connecting to different rooms, then it would make sense that their clocks were not synchronized.

Otherwise, to increase the odds of my reproducing this bug, we should try to minimize environmental differences. I was connecting using two desktop clients across the same local network (one of which was running the application server itself), and I maintained between 0 and 12 concurrent connections.

When you experienced this bug:

cbujara commented 8 years ago

I know, it's a tough one! The last time I experienced this, we had 4 distinct computers, a mix of Macs and PCs, no mobile devices, one connection per device. I can't say if there were other connections to the server at the same time. We were running against the production server. All were using the same virtual room - we could see the count go up as each person joined, and the simulation started after the last person joined. The next round also started at the same time for everyone.

Just to try to clarify the problem, the round starts at the same time for everyone who is participating in that round, but somehow, the round ends and some people still have time on their clocks. I'm not sure if one of the clocks is starting with less time on it, or is skipping ahead, or if some clocks are running slow or start with more time on them. I remember someone had about 20 seconds left and was just about to finish a pizza when the round ended. It's tough rounding up enough people to test this with multiple devices, but if we can come up with a list of things to try to observe during the process, I think I can wrangle some testers.

jugglinmike commented 8 years ago

My latest theory is that the clocks are somehow dependent on the local system's time. This one will be easy to falsify; I'll report back once I do.

cbujara commented 8 years ago

It's occurred to me that at least one of the participants in the last test session I arranged was likely using a laptop configured for a different time zone. Perhaps that was a factor?

jugglinmike commented 8 years ago

Possibly! Although I'm still inclined to suspect general clock drift, since time zone errors would tend to introduce larger discrepancies than what you've experienced. We'll see soon enough.

jugglinmike commented 8 years ago

Alright, I think I've gotten it. I can speak from experimental results only; I haven't dug into the relevant code yet.

This does appear to be a problem with system clock synchronization. To demonstrate, I hacked up the UI a little to display the current system's clock, and I connected via my local machine (whose clock I manually set to be a few minutes ahead of the correct time) and a remote machine:

[Screenshot: clock sync]

Time zone doesn't seem to have an effect (most likely because we're using Unix timestamps consistently throughout the application).
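As a quick illustration of why the time zone setting drops out while a wrongly set clock does not, here is a minimal sketch. The dates and the 3-minute skew are made-up values for the example, not measurements from the test above.

```typescript
// Two wall-clock readings of the same instant, written with different UTC offsets.
const fiveHoursBehindUtc = new Date("2016-03-01T09:00:00-05:00");
const atUtc = new Date("2016-03-01T14:00:00+00:00");

// getTime() returns milliseconds since the Unix epoch (UTC), so the offset drops out.
const sameInstant: boolean = fiveHoursBehindUtc.getTime() === atUtc.getTime();

// A clock that is simply set wrong (here, 3 minutes fast) does shift the timestamp.
const skewMs: number = 3 * 60 * 1000;
const fastClock = new Date(atUtc.getTime() + skewMs);
const skewSeconds: number = (fastClock.getTime() - atUtc.getTime()) / 1000;

console.log(sameInstant, skewSeconds);
```

So two machines in different time zones agree on the timestamp as long as both clocks are correct; only an incorrect clock setting produces a difference.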

Now that I can reproduce the issue consistently, fixing it should be straightforward. It's still too early to say how much time/effort this will take, but at least we know what we're dealing with now!

cbujara commented 8 years ago

Excellent sleuthing. Thx!

jugglinmike commented 8 years ago

After a little research, I've identified the cause of the problem. I was worried that we had made a rookie mistake like assuming all client clocks would be in sync with the server, but (luckily for my pride) this was not the case.

We are actually being a little sloppy with the data shared between the client and the server. Although each client calculates its own "time remaining" value based on a relative duration provided by the server, this localized value is overwritten by the server's version.

This can be fixed naively with one line of code within the Pizza Productivity business logic, but it's indicative of a more generic problem and warrants a correspondingly general fix. I'm still experimenting with how to best address this, but I consider this good progress so far.
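To make the failure mode concrete, here is a rough sketch of the difference between anchoring a relative duration to the local clock and relying on a value tied to the server's clock. This is not the project's actual code: the `RoundUpdate` shape, the field names, and the assumption that the server's version is an absolute timestamp are all hypothetical.

```typescript
// Hypothetical message shape; the real payload may look quite different.
interface RoundUpdate {
  remainingMs: number;  // relative duration, independent of any wall clock
  serverEndsAt: number; // absolute Unix timestamp (ms) on the server's clock
}

// Skew-safe: anchor the relative duration to the local clock once, on receipt.
function localDeadline(update: RoundUpdate, localNowMs: number): number {
  return localNowMs + update.remainingMs;
}

function remainingFromLocalDeadline(deadlineMs: number, localNowMs: number): number {
  return Math.max(0, deadlineMs - localNowMs);
}

// Skew-prone: compare the server's absolute timestamp with the local clock.
function remainingFromServerTimestamp(update: RoundUpdate, localNowMs: number): number {
  return update.serverEndsAt - localNowMs;
}

// Server clock reads 1,000,000 ms with 90 s left; this client's clock runs 20 s fast.
const update: RoundUpdate = { remainingMs: 90000, serverEndsAt: 1090000 };
const clientNowMs: number = 1000000 + 20000;

const skewProneMs: number = remainingFromServerTimestamp(update, clientNowMs);
const deadlineMs: number = localDeadline(update, clientNowMs);
const skewSafeMs: number = remainingFromLocalDeadline(deadlineMs, clientNowMs);

console.log(skewProneMs, skewSafeMs);
```

Under these assumptions the skew-prone path shows 70 seconds on a client whose clock is 20 seconds fast, while the locally anchored deadline shows the full 90 seconds regardless of how the client's clock is set.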

cbujara commented 8 years ago

Just did some testing on multiple devices with slightly out-of-sync clocks. The timers stayed in sync after the first few ticks, and each round ended simultaneously for each participant, as intended. Nice work!

One odd thing I noticed, now that I'm paying more attention to clocks. The timers count down from 1:30, but the round doesn't start until the 'Get ready' and 'You're sitting this one out' messages clear. How long that takes seems to vary. On my last test run, round 1 started at 1:19, round 2 started at 1:22, rounds 3 and 4 started at 1:23. I'm wondering if the original intent was to allow a full 90 seconds per round, which would suggest that the timer shouldn't start counting down until the round-opening messages clear. Do you recall any discussion regarding that?

jugglinmike commented 8 years ago

I can't remember any explicit discussion. The initial countdown does eat away at playable time, but its intended purpose is to give the player some indication as to what's going on. This is maybe most important when their "active" state changes from round to round... but beyond just saying, "okay, you are playing now", the modal dialog also serves players who were already activated: it brings explicit attention to the change in rounds.

Separate from the experience design, the discrepancy in start times sounds like another bug. I'm unable to reproduce it using the same approach I described above, though, so we may have to dig further.

One thing that may not be clear initially is that the "get ready"/"sitting out" messages are entirely local. We make no attempt to synchronize them between clients; the intention is to show them for a constant five seconds in two cases: (1) the beginning of each round, (2) the moment the player joins an in-progress game.

In your tests, are all players joining the game at the same time? I ask because it is possible for a player to enter the game late. In these cases, the round timer will have already decremented by some amount, and it will continue to do so through the "get ready"/"sitting out" messages.
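The arithmetic implied here can be sketched as follows, assuming the 1:30 round timer and the constant five-second message described above. The function and constant names are made up for illustration and are not taken from the codebase.

```typescript
const ROUND_SECONDS = 90;  // the timer counts down from 1:30
const MESSAGE_SECONDS = 5; // intended "get ready"/"sitting out" display time

// Format whole seconds as m:ss, the way the round timer is displayed.
function formatClock(totalSeconds: number): string {
  const clamped = Math.max(0, Math.floor(totalSeconds));
  const minutes = Math.floor(clamped / 60);
  const seconds = clamped % 60;
  return minutes + ":" + (seconds < 10 ? "0" : "") + seconds;
}

// Timer reading at the moment the message clears, for a player who joined
// `joinDelaySeconds` after the round began (0 for an on-time player).
function clockWhenMessageClears(joinDelaySeconds: number): string {
  return formatClock(ROUND_SECONDS - joinDelaySeconds - MESSAGE_SECONDS);
}

console.log(clockWhenMessageClears(0)); // on-time player
console.log(clockWhenMessageClears(6)); // player who joined 6 seconds late
```

Under this model an on-time player sees the workspace at 1:25, and a reading such as 1:19 would correspond to about six seconds of additional delay from some source, whether a late join or something else.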

cbujara commented 8 years ago

I'm just testing with 4 chefs, so the simulation starts simultaneously for everyone as soon as the 4th chef clicks Get Started on the intro screen. I'm going to run through it a few more times and see if what I'm observing is happening consistently. If I understand correctly, the intended behavior is for the timer to start at 1:30 for all chefs, the get ready/sitting out message displays for 5 seconds, then active chefs for that round should have access to the workspace by 1:25. Does that sound right?

jugglinmike commented 8 years ago

Yup, that sounds right

cbujara commented 8 years ago

Clocks are occasionally off by a second or so, but since we're not lobbing space probes into orbit or splitting atoms, some drift is acceptable. The issue of large discrepancies that were impacting the simulation is clearly resolved.