enjoythecode commented 2 years ago

Games should have time control: each player starts with a given amount of time and a bonus time per move (potentially 0). A player's time counts down while it is their turn to move. The game is lost when a player's time runs out.

Design document needed to scope this feature and consider implementation strategies.

enjoythecode commented 2 years ago

Design document should include:

Scope (what is in, what is out?)
Requirements
Implementation Options
Chosen Implementation

enjoythecode commented 2 years ago

Planned design is below. This will also be included in the PR with up-to-date references to the code.

Summary

This document is about implementing timed games where each player has a certain amount of time available to move, and running out of time results in a loss.

Scope

Included

Storage and data model of time control data
Ensuring that no moves can be played after time is up
Automatically concluding games that have ended on time in a timely manner
Initializing games with different time control settings

Not Included

Lag compensation

Proposal

Worker Task

There will be a backend task that atomically and idempotently terminates a game if it is still in progress and time is up for the player whose turn it is. This task will be called in a few places to ensure that:

Data state is consistent
Data state is updated in a timely manner.

As a baseline, the clock will be checked on every move attempt, and the game will be terminated if the move arrived too late.
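The termination task and the per-move check could look roughly like the sketch below. The `Game` fields and the function names are hypothetical and chosen only for illustration. The sketch is single-threaded, so it shows idempotency but not atomicity; in a real backend the check and the update would have to happen inside one transaction or lock.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Game:
    # Hypothetical model: seconds left per player, measured when the turn began.
    remaining: Dict[str, float]
    to_move: str
    turn_started_at: float  # UTC epoch seconds
    increment: float = 0.0  # bonus seconds added after each move
    status: str = "in_progress"
    loser: Optional[str] = None


def terminate_if_time_up(game: Game, now: float) -> bool:
    """End the game if the player to move has run out of time.

    Returns True only on the call that actually ends the game, so calling it
    again afterwards changes nothing (idempotent).
    """
    if game.status != "in_progress":
        return False
    if now - game.turn_started_at < game.remaining[game.to_move]:
        return False
    game.status = "lost_on_time"
    game.loser = game.to_move
    return True


def attempt_move(game: Game, player: str, opponent: str, now: float) -> bool:
    """Baseline check: reject a move that arrives after the clock ran out."""
    if terminate_if_time_up(game, now) or game.status != "in_progress":
        return False
    if player != game.to_move:
        return False
    elapsed = now - game.turn_started_at
    game.remaining[player] += game.increment - elapsed
    game.to_move = opponent
    game.turn_started_at = now
    return True
```

Because `attempt_move` calls the same function as every other code path, a late move and a background clean-up end the game in exactly the same way.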

However, we also need to run this task even without a prompt from any clients. For this, there are two options available:

Option A: Enqueue a task for every move

In this option, a task to terminate the game if a player's time is up would be enqueued after every move, delayed by that player's remaining time. With this setup, if no other move happens, time will be up exactly when the task comes due, and the task will correctly terminate the game.

It might also be possible to store a reference to the previous task enqueued with this option and cancel it on the receipt of a new move to reduce the load on the workers.
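A minimal sketch of this option follows, using a toy in-memory queue in place of a real message queue. `DelayedQueue`, `timeout_job`, and `record_move` are hypothetical names, and time is passed in explicitly so the behaviour is deterministic.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple


class DelayedQueue:
    """Toy stand-in for a message queue that supports delayed delivery."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, Callable[[float], None]]] = []
        self._ids = itertools.count()
        self._cancelled: Set[int] = set()

    def enqueue(self, run_at: float, job: Callable[[float], None]) -> int:
        job_id = next(self._ids)
        heapq.heappush(self._heap, (run_at, job_id, job))
        return job_id

    def cancel(self, job_id: int) -> None:
        self._cancelled.add(job_id)

    def run_due(self, now: float) -> int:
        """Run every non-cancelled job that is due; return how many ran."""
        ran = 0
        while self._heap and self._heap[0][0] <= now:
            _, job_id, job = heapq.heappop(self._heap)
            if job_id in self._cancelled:
                self._cancelled.discard(job_id)
                continue
            job(now)
            ran += 1
        return ran


@dataclass
class Game:
    remaining: Dict[str, float]
    to_move: str
    turn_started_at: float
    status: str = "in_progress"
    pending_job: Optional[int] = None


def timeout_job(game: Game) -> Callable[[float], None]:
    def job(now: float) -> None:
        # Re-check the clock, so a stale job that was never cancelled is a no-op.
        out_of_time = now - game.turn_started_at >= game.remaining[game.to_move]
        if game.status == "in_progress" and out_of_time:
            game.status = "lost_on_time"
    return job


def record_move(queue: DelayedQueue, game: Game, opponent: str, now: float) -> None:
    """Apply a move, cancel the previous timeout job, and schedule the next one."""
    game.remaining[game.to_move] -= now - game.turn_started_at
    game.to_move, game.turn_started_at = opponent, now
    if game.pending_job is not None:
        queue.cancel(game.pending_job)
    game.pending_job = queue.enqueue(now + game.remaining[opponent], timeout_job(game))
```

Cancelling is only an optimisation here: since the job re-checks the clock before acting, an uncancelled job from an earlier move does no harm.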

Pros:

Computationally lazy: only makes computations for moves that were actually played.
Prompt: clean-up will run almost exactly when a player's time runs out.

Cons:

The message queue cannot guarantee that jobs won’t be lost, which would leave timed-out games unterminated unless a periodic clean-up task also runs.

Option B: Periodic Clean-up Task

In this option, a task runs periodically (every 1-3 seconds), checks all the games, and terminates any that ran out of time.

Pros:

No leaked games: the sweep catches every game that has run out of time.

Cons:

Computationally expensive. For this to be cheaper than Option A, the average number of seconds per move in a game must be less than the periodicity.
May not be prompt. If a game runs out of time just after the task has run, the front-end will show a noticeable delay of up to one full period (3 seconds at the slowest setting above).
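The periodic task itself could be as small as the sketch below. The `Game` fields and the `sweep` function are hypothetical, and a real implementation would query the data store for in-progress games instead of iterating over an in-memory list.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Game:
    game_id: str
    remaining: Dict[str, float]  # seconds left per player at turn start
    to_move: str
    turn_started_at: float  # UTC epoch seconds
    status: str = "in_progress"


def sweep(games: Iterable[Game], now: float) -> List[str]:
    """Terminate every in-progress game whose player to move is out of time.

    Returns the ids of the games ended by this sweep.
    """
    ended = []
    for game in games:
        if game.status != "in_progress":
            continue
        if now - game.turn_started_at >= game.remaining[game.to_move]:
            game.status = "lost_on_time"
            ended.append(game.game_id)
    return ended
```

With a 3-second period, a game whose clock expires at t = 3.5 is missed by the sweep at t = 3 and only ended by the sweep at t = 6, which is the delay described in the last point above.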

Chosen Implementation

Option A will be implemented as the primary method of terminating games that have run out of time. Option B will be run at a longer interval (~10-15 seconds) to catch any games whose termination task was lost.

Display

Time will count down in the client using a local countdown. On every tick, this countdown will recalculate the remaining time from the timestamp delivered by the server. This prevents the error in the countdown from growing, unlike a naive implementation that subtracts the nominal interval length from a counter on every tick.
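The difference between the two approaches can be sketched as follows. The function names are hypothetical, the real client code would presumably live in the front-end language, and the sketch assumes the client clock is close enough to the server clock (or that the offset has been corrected).

```python
def remaining_from_timestamp(remaining_at_turn_start: float,
                             turn_started_at: float,
                             now: float) -> float:
    """Recompute from the server's timestamp on every tick.

    Each tick is independent of the previous ones, so timer drift does not
    accumulate.
    """
    return max(0.0, remaining_at_turn_start - (now - turn_started_at))


def naive_countdown(remaining_at_turn_start: float,
                    nominal_interval: float,
                    ticks: int) -> float:
    """Subtract the nominal interval per tick, ignoring how late each tick fired."""
    return max(0.0, remaining_at_turn_start - nominal_interval * ticks)


# Illustration: a timer meant to fire every 1.0 s actually fires every 1.25 s.
ticks = 20
actual_elapsed = 1.25 * ticks  # 25 s of real time have passed
shown_by_timestamp = remaining_from_timestamp(60.0, 0.0, actual_elapsed)
shown_by_naive = naive_countdown(60.0, 1.0, ticks)
```

In this illustration the timestamp-based display shows 35 seconds left, while the naive counter still shows 40 and keeps falling further behind with every tick.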

Technical Details

All timestamps related to this feature will be stored in the UTC timezone since we are only concerned with differences between different timestamps.
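As a small illustration, the remaining time can be computed as a difference of timezone-aware datetimes. `utc_now` and `time_left` are hypothetical helper names; rejecting naive datetimes is an extra safeguard assumed here, not something the design requires.

```python
from datetime import datetime, timedelta, timezone


def utc_now() -> datetime:
    """Current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def time_left(remaining: timedelta,
              turn_started_at: datetime,
              now: datetime) -> timedelta:
    """Time left for the player to move, based only on a timestamp difference."""
    if turn_started_at.tzinfo is None or now.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    return remaining - (now - turn_started_at)


started = datetime(2023, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
# The same clock read 90 seconds later, but expressed in a UTC+2 zone.
later = datetime(2023, 1, 1, 14, 1, 30, tzinfo=timezone(timedelta(hours=2)))
left = time_left(timedelta(minutes=5), started, later)
```

Subtracting aware datetimes gives the true elapsed time regardless of the zone each value is expressed in, so `left` is 3 minutes 30 seconds here.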

enjoythecode commented 2 years ago

Addendum:

Option C

Options A and B are "push" models, where the server handles the scheduling of the event. In contrast, a "pull" model is possible, where the client requests an execution of the "terminate-game-if-time-is-over" task when it has detected that time is up for one of the players.

This has better response time because the task is run when relevant, and not at other times. There are edge cases where both clients may fail to send a request, so this approach will be complemented with a periodic sweep (Option B). This is what AWS calls "anti-entropy sweepers". This sweep can be significantly less frequent, since it only handles edge cases where the players are not available, and therefore the termination of the game need not be prompt for UX.

It is especially important that this task be idempotent, atomic, and use proper resource locking, because both clients might fire the request in very quick succession, which could lead to a race condition. Also, to prevent a burst of requests, only the clients who are playing in the game should send requests (observers should not).
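A sketch of such a handler follows, with a per-game `threading.Lock` standing in for whatever locking the data store provides. All names are hypothetical. Two things in it are assumptions rather than part of the design above: the server re-checks the claim against its own clock, and it also ignores requests from non-players on the server side.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Game:
    remaining: Dict[str, float]  # seconds left per player at turn start
    to_move: str
    turn_started_at: float  # UTC epoch seconds
    status: str = "in_progress"
    lock: threading.Lock = field(default_factory=threading.Lock)


def handle_timeout_claim(game: Game, requester: str,
                         players: Set[str], now: float) -> bool:
    """Handle a client's claim that time is up.

    Returns True only for the single call that actually ends the game.
    """
    if requester not in players:
        return False  # observers are ignored
    with game.lock:  # check and update form one critical section
        if game.status != "in_progress":
            return False  # already ended: a repeated request is a no-op
        if now - game.turn_started_at < game.remaining[game.to_move]:
            return False  # the server's clock says time is not up yet
        game.status = "lost_on_time"
        return True


# Both players report the timeout at almost the same moment.
game = Game({"a": 5.0, "b": 5.0}, "a", 0.0)
results = []


def claim(who: str) -> None:
    results.append(handle_timeout_claim(game, who, {"a", "b"}, now=6.0))


threads = [threading.Thread(target=claim, args=(p,)) for p in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever request acquires the lock first ends the game; the other sees the updated status and returns without doing anything.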

Pros:

Just-enough execution of the task (less load on server)
Just-in-time execution of the task (UX is fast)

I will be using this approach for the implementation.

enjoythecode / strate.gg

Add time controls #35
