gubernator-io / gubernator

High Performance Rate Limiting MicroService and Library - Developed at Mailgun
Apache License 2.0
93 stars 7 forks source link

After v2.4.0 upgrade getting occasional nil pointer panics #14

Closed ihqtim closed 4 months ago

ihqtim commented 4 months ago

Hi - Last week I upgraded to latest gubernator version v2.4.0 from the old repo (https://github.com/mailgun/gubernator) and since then some servers have occasionally hit a nil pointer (panic). Previously we were running v2.2.1, never saw this issue.

Starting investigation, adding here in case core team can spot an obvious problem / cause, stack below. It looks like it is possible to hit gubernator.go:334 with either resp or err being nil?

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1d4bc8e] github.com/mailgun/gubernator/v2.(*V1Instance).asyncRequest(0xc00034cf00, {0x3402690?, 0xc00a770510?}, 0xc0080d8f40)
github.com/mailgun/gubernator/v2.(*V1Instance).asyncRequest(0xc00034cf00, {0x3402690?, 0xc00a770510?}, 0xc0080d8f40)
[redacted-go-project-name]/vendor/github.com/mailgun/gubernator/v2/gubernator.go:334 +0x104e
github.com/mailgun/gubernator/v2.(*V1Instance).GetRateLimits in goroutine 49872128
[redacted-go-project-name]/vendor/github.com/mailgun/gubernator/v2/gubernator.go:275 +0xa25
Main process exited, code=exited, status=2/INVALIDARGUMENT
ihqtim commented 4 months ago

Taking this hastily assembled go playground example:

package main

import (
    "fmt"
    "strconv"
)

func main() {

    var err error
    var parsedValue int
    for i := 0; i < 3; i++ {
        fmt.Printf("Error value at start of loop %v %v\n", i, err)
        if i == 0 {
            parsedValue, err = strconv.Atoi("1")
            fmt.Printf("-- parsed value at loop %v: %v, err=%v\n", i, parsedValue, err)
        }
        if i == 1 {
            val2, err := strconv.Atoi("y") // The colon here!
            fmt.Printf("-- parsed value at loop %v: %v, err=%v\n", i, val2, err)
            if err != nil {
                err = fmt.Errorf("Error while running loop %v, %v", i, err)
            }

        }

    }

    fmt.Printf("the value & error after loop %v, %v\n", parsedValue, err)
}

The output is:

Error value at start of loop 0 <nil>
-- parsed value at loop 0: 1, err=<nil>
Error value at start of loop 1 <nil>
-- parsed value at loop 1: 0, err=strconv.Atoi: parsing "y": invalid syntax
Error value at start of loop 2 <nil>
the value & error after loop 1, <nil>

Program exited.

Wondering if gubernator.go:355 could be the cause of the nil pointer - pseudocode

    var attempts int
    var err error // err declared in outer block
    resp := AsyncResp{
        Idx: req.Idx,
    }
    for {
        if attempts > 5 {
            // *** nil pointer panic below if err is nil
            resp.Resp = &RateLimitResp{Error: err.Error()}
            break
        }

        // err here is not outer err!
        r, err := req.Peer.GetPeerRateLimit(ctx, req.Req)
        if err != nil {
            attempts++
            continue

        }

    }
thrawn01 commented 4 months ago

Thank you for reporting this issue! I think I found the issue, please review and approve #16 if you agree.