golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.7k stars 17.62k forks source link

x/time/rate: limiter frequency does not correspond to expected frequency #37058

Open lebovski opened 4 years ago

lebovski commented 4 years ago

What version of Go are you using (go version)?

$ go version
go version go1.13.7 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/alx/.cache/go-build"
GOENV="/home/alx/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/alx/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build377469486=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I use rate limiter from standard library in order to limit the number of cycles of a some operation. The actual RPS is printed every second.

package main

import (
    "context"
    "log"
    "sync/atomic"
    "time"

    "golang.org/x/time/rate"
)

func main() {
    var (
        rps      = 100000
        duration = 10 * time.Second
    )

    requests := rps * int(duration.Seconds())

    ctx, endCancel := context.WithTimeout(context.Background(), duration)
    defer endCancel()

    count := newCounter(ctx)
    count.logState()

    limiter := rate.NewLimiter(rate.Every(time.Second/time.Duration(rps)), 1)
    for i := 0; i < requests; i++ {
        count.inc()

        err := limiter.Wait(ctx)
        if err != nil {
            log.Println("Limiter error", err)
            break
        }
    }
}

type counter struct {
    ctx context.Context
    i   int64
}

func newCounter(context context.Context) *counter {
    return &counter{
        ctx: context,
        i:   0,
    }
}

func (c *counter) inc() {
    atomic.AddInt64(&c.i, 1)
}

func (c *counter) get() int64 {
    return atomic.LoadInt64(&c.i)
}

func (c *counter) logState() {
    go func() {
        for {
            select {
            case <-c.ctx.Done():
                return
            default:
            }

            before := c.get()
            time.Sleep(time.Second)
            after := c.get()
            log.Println(after - before)
        }
    }()
}

What did you expect to see?

Expected that the actual RPS would be about 100000, but in fact it's only about 26000. Also the CPU is underloaded (CPU usage is around 60-70% on my hardware, so it doesn't even consume a single core).

What did you see instead?

2020/02/05 16:59:06 26208
2020/02/05 16:59:07 26139
2020/02/05 16:59:08 25750
2020/02/05 16:59:09 25942
2020/02/05 16:59:10 25252
2020/02/05 16:59:11 25847
2020/02/05 16:59:12 25949
2020/02/05 16:59:13 25353
2020/02/05 16:59:14 26985
2020/02/05 16:59:15 Limiter error rate: Wait(n=1) would exceed context deadline
randall77 commented 4 years ago

I can reproduce on Linux. Both 1.13.7 and tip.

I can not reproduce on Darwin - that can do ~96K/sec.

Increasing the burst size fixes this on Linux. Using 10 instead of 1 fixes the problem. At a rate of 100K/sec, you should probably allow bursts >1.

Not sure what the underlying problem is. Timer quantization, perhaps?

zhouzhenling commented 4 years ago

limiter.Wait(ctx) will call time.NewTimer() if need to wait. I make a test for time.NewTimer(d) using similar parameters with @lebovski. And I find total call time.NewTimer(d) does not correspond to expected. And I find the smaller the duration d , the greater the deviation.

env: go version go1.14 linux/amd64

// test code

package main

import (
    "context"
    "fmt"
    "time"
)

func testTimer(d time.Duration, ctx context.Context) (count int) {
    for {
        t := time.NewTimer(d)
        select {
        case <-ctx.Done():
            return
        case <-t.C:
            count++
        }
    }
    return
}

func main() {
    d := 10 * time.Microsecond
    timeout := 10 * time.Second

    ctx, _ := context.WithTimeout(context.Background(), timeout)

    fmt.Printf("want %d, got %d.\n", int64(timeout/d), testTimer(d, ctx))
}

// output:
// want 1000000, got 60830.

As @randall77 said: increasing the burst size can fixes the problem. That's because increasing the burst size will reduce to call time.NewTimer().