gschup / ggrs

GGRS is a reimagination of GGPO, enabling P2P rollback networking in Rust. Rollback to the future!

Add option for comparing checksums in `P2PSession`s #46

Closed johanhelsing closed 1 year ago

johanhelsing commented 1 year ago

Is your feature request related to a problem? Please describe.

  1. Sometimes, de-syncs will only trigger due to platform-specific differences. These issues will not be detected with a regular synctest session.
  2. In the unfortunate case that de-syncs are extremely rare and we can't find the cause of the bug, it would be good to have some mechanism to alert the user and stop the ggrs session, ideally saving the game, syncing state from one of the peers, and starting a new session.

Describe the solution you'd like

As a starting point, it would be good if there was an option on the p2p session, similar to the one on the synctest session, that takes checksums of the state and sends them to the other peers so they can compare.

Describe alternatives you've considered

Workaround: Implement a checksum in game code and include it as part of the input struct.

Or: Just hope desyncs won't happen.
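For illustration, the input-struct workaround could look roughly like the sketch below. Everything in it is an assumption made for the example: the `Input` layout, the `fnv1a` helper (FNV-1a is just a stand-in for whatever checksum the game prefers), and the convention that both peers attach the checksum of the same earlier, already-confirmed frame.

```rust
/// 32-bit FNV-1a hash, used here only as a stand-in checksum function.
fn fnv1a(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5; // FNV offset basis
    for &b in bytes {
        hash ^= b as u32;
        hash = hash.wrapping_mul(0x0100_0193); // FNV prime
    }
    hash
}

/// Hypothetical game input that piggybacks a state checksum.
#[derive(Copy, Clone, Debug, PartialEq, Eq, Default)]
struct Input {
    buttons: u8,
    /// Frame the checksum was computed for (assumed to be a confirmed frame).
    checksum_frame: i32,
    checksum: u32,
}

/// True if both inputs report a checksum for the same frame and the values differ.
fn is_desync(local: &Input, remote: &Input) -> bool {
    local.checksum_frame == remote.checksum_frame && local.checksum != remote.checksum
}
```

The downside is that every input grows by a few bytes and the game code has to do the bookkeeping itself, which is why an option on the session would be nicer.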

Additional context

Suggested by @Vrixyz :)

gschup commented 1 year ago

Sending a checksum as part of an existing packet (or a new one) and tracking/comparing received checksums with our own checksums should neither be a big burden on network traffic nor require a lot of code changes. The biggest question for me is how to go about sending them, as messages are unreliable. Would it be enough to just get sporadic comparisons, or do we want to compare every frame?

johanhelsing commented 1 year ago

Sporadic checksums would probably be fine for my use-case. It would probably go a long way to just print out the first known bad and the last known good frames.
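To make the "last known good / first known bad" idea concrete, here is a minimal sketch of how the results of sporadic comparisons could be narrowed down to a frame range. `DesyncTracker` and its methods are hypothetical names for this example, not part of ggrs.

```rust
type Frame = i32;

/// Hypothetical tracker fed with the outcome of each sporadic comparison.
#[derive(Debug, Default)]
struct DesyncTracker {
    last_good: Option<Frame>,
    first_bad: Option<Frame>,
}

impl DesyncTracker {
    /// Record whether the local and remote checksums matched for `frame`.
    /// Reports may arrive out of order, so only the tightest bounds are kept.
    fn observe(&mut self, frame: Frame, matched: bool) {
        let before_bad = self.first_bad.map_or(true, |bad| frame < bad);
        if matched {
            // Only frames before the first mismatch count as known good.
            if before_bad && self.last_good.map_or(true, |good| frame > good) {
                self.last_good = Some(frame);
            }
        } else if before_bad {
            self.first_bad = Some(frame);
            // A "good" frame at or after the new first bad frame is no longer useful.
            if self.last_good.map_or(false, |good| good >= frame) {
                self.last_good = None;
            }
        }
    }

    /// Human-readable summary, or `None` while no mismatch has been seen.
    fn report(&self) -> Option<String> {
        let bad = self.first_bad?;
        Some(match self.last_good {
            Some(good) => format!(
                "desync between frame {} (last known good) and frame {} (first known bad)",
                good, bad
            ),
            None => format!("desync at or before frame {} (no known good frame)", bad),
        })
    }
}
```

With sporadic reports the range will be coarse, but it still tells you which stretch of a replay to look at.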

We're adding options for configuring reliable channels in matchbox as well, so it would be possible to run it in reliable mode for debugging desync issues.

gschup commented 1 year ago

A first idea could look something like this:

```rust
use serde::{Deserialize, Serialize}; // needed for the derives below

#[derive(Copy, Clone, Debug, PartialEq, Eq, Serialize, Deserialize, Default)]
pub(crate) struct ChecksumReport {
    pub checksum: u32, // or whatever datatype we deem suitable
    pub frame: Frame,  // this is just an i32
}
```

Each endpoint could keep a history of recent checksums (like a `HashMap<Frame, u32>` or some queue) and then compare received checksums (sporadic) against its own history (complete). In case of a detected desync, we send a `GGRSEvent` out to the user, who can handle it however they wish.

This way the clients are not guaranteed to agree on a detected mismatch, it might just be one client that detects it. But since this feature is probably just for debugging, this could be "good enough" already. The clients don't have to agree on the mismatch, as long as they properly disconnect from the session if they stop participating.
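A rough, self-contained sketch of the history-and-compare step described above follows. The names `ChecksumHistory`, `CompareResult`, and `max_frames` are made up for this example and are not existing ggrs API; it also assumes checksums are only recorded for frames whose state will no longer be rolled back.

```rust
use std::collections::HashMap;

/// Frame numbers are plain `i32`s.
type Frame = i32;

/// Hypothetical outcome of checking one received report against local history.
#[derive(Debug, PartialEq, Eq)]
enum CompareResult {
    Match,
    Mismatch { frame: Frame, local: u32, remote: u32 },
    /// The frame is not (or no longer) in the local history.
    Unknown,
}

/// Hypothetical bounded history of locally computed checksums.
struct ChecksumHistory {
    checksums: HashMap<Frame, u32>,
    max_frames: i32,
}

impl ChecksumHistory {
    fn new(max_frames: i32) -> Self {
        Self { checksums: HashMap::new(), max_frames }
    }

    /// Store the local checksum for `frame` and drop entries that are too old.
    fn record(&mut self, frame: Frame, checksum: u32) {
        self.checksums.insert(frame, checksum);
        let oldest_kept = frame - self.max_frames;
        self.checksums.retain(|&f, _| f > oldest_kept);
    }

    /// Compare a (sporadically) received remote checksum with the local one.
    fn compare(&self, frame: Frame, remote: u32) -> CompareResult {
        match self.checksums.get(&frame) {
            Some(&local) if local == remote => CompareResult::Match,
            Some(&local) => CompareResult::Mismatch { frame, local, remote },
            None => CompareResult::Unknown,
        }
    }
}
```

A `Mismatch` result is the point where the session would emit the desync event to the user. `Unknown` simply means the report arrived too late to be checked, which is acceptable when comparisons are sporadic anyway.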