heroiclabs / nakama

Distributed server for social and realtime games and apps.
https://heroiclabs.com
Apache License 2.0
8.58k stars 1.06k forks source link

Fix event loop termination in statusRegistry #1201

Closed thesprockee closed 2 months ago

thesprockee commented 2 months ago

Change return to continue in error handling to prevent the statusRegistry event loop from terminating.

CLAassistant commented 2 months ago

CLA assistant check
All committers have signed the CLA.

thesprockee commented 2 months ago

This switches from return to continue in statusRegistry's event loop error handling to prevent statusRegistry.eventsCh queue from overflowing.

This adjustment maintains the tracker event loop operation, which will terminate when the statusRegistry event queue overflows.

If the goroutine for the tracker event queue loop terminates, joinattempt markers will not be cleared causing match participants to be ejected from a match in ~15 seconds.

zyro commented 2 months ago

@thesprockee I think strictly speaking this change would be correct, but I'd expect that error to only actually manifest in extreme scenarios like running out of memory etc - have you actually observed this error?

thesprockee commented 2 months ago

Yes, I only noticed it after I integrated a proprietary WebSocket protocol--specific to a VR game--into Nakama. Initially, I incorrectly used a "return" statement instead of "continue" upon detecting sessions using this proprietary protocol. Below is the original code segment, which helps illustrate why I initially overlooked the error. The repercussions became apparent only when the statusRegistry.eventsCh buffer exceeded 1024 events.

status_registry.go:

var err error
switch session.Format() {
case SessionFormatEvr:
    // Evr does not support status events.
    return
case SessionFormatProtobuf:
    if payloadProtobuf == nil {
        // Marshal the payload now that we know this format is needed.
        payloadProtobuf, err = proto.Marshal(envelope)
        if err != nil {
            s.logger.Error("Could not marshal status event", zap.Error(err))
            return
        }
    }
    err = session.SendBytes(payloadProtobuf, true)
case SessionFormatJson:
    fallthrough