golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.99k stars 17.67k forks source link

proposal: x/time/rate: WaitAndReport to also return wait time #68719

Open mitar opened 3 months ago

mitar commented 3 months ago

Proposal Details

Proposal Details

Currently, golang.org/x/time/rate.Limiter provides handy Wait and WaitN methods. But when using them, it is not really possible to know later on if they have waited at all (so if rate limiting was applied) or not. Just observing from outside there might be regular overhead executing the method which added delay. But knowing if rate limiting was applied or not is useful so that one can log that, compute stats, know how often rate limiting is necessary and if maybe rate limits have to be increased.

I am interested in the use case where I use Limiter to limit my own calls towards a 3rd party API endpoint. I want to collect stats for our services how often and how much rate limiting we had to do on our side. If that rate limiting is too high, we can consider buying a higher class service from a 3rd party API provider. We collect stats how long each API call took, but without knowing how long were we waiting on our side for rate limiting, it is not enough to know why some calls took longer than other. We can measure from outside how long did call to Wait and WaitN took, but that then also measures overhead of calling them.

So I propose that Wait and WaitN methods of Limiter return wait time, literally, the value of delay variable the method wait computes. wait is already internal implementation so if we want to keep backwards compatibility of Wait and WaitN methods we can introduce some new methods like WaitAndReport which would call same internal wait, just return the delay value.

gabyhelp commented 3 months ago

Related Issues and Documentation

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

mitar commented 3 months ago

cc @Sajmani

ianlancetaylor commented 3 months ago

Changing those methods would be a breaking change for people using the current package. We aren't going to do that.

mitar commented 3 months ago

This is why I am proposing additional methods at the end of my proposal, like WaitAndReport. Internally it can call into the same internal wait method.

mitar commented 3 months ago

I currently re-implemented Wait myself:

func wait(ctx context.Context, limiter *rate.Limiter, n int) (time.Duration, errors.E) {
    now := time.Now()

    // Check if ctx is already cancelled.
    select {
    case <-ctx.Done():
        return 0, errors.WithStack(ctx.Err())
    default:
    }

    r := limiter.ReserveN(now, n)
    if !r.OK() {
        return 0, errors.Errorf("rate: Wait(n=%d) exceeds limiter's burst", n)
    }

    // Wait if necessary.
    delay := r.DelayFrom(now)
    if delay == 0 {
        return 0, nil
    }

    // Determine wait limit.
    if deadline, ok := ctx.Deadline(); ok && deadline.Before(now.Add(delay)) {
        // We cancel the reservation because we will not be using it.
        r.CancelAt(now)
        return delay, errors.Errorf("rate: Wait(n=%d) would exceed context deadline", n)
    }

    timer := time.NewTimer(delay)
    defer timer.Stop()

    select {
    case <-timer.C:
        // We can proceed.
        return delay, nil
    case <-ctx.Done():
        // Context was canceled before we could proceed. Cancel the
        // reservation, which may permit other events to proceed sooner.
        r.Cancel()
        return time.Since(now), errors.WithStack(ctx.Err())
    }
}

But because reserveN is not public (with its maxFutureReserve argument), one had to first do a reservation and then cancel it if it over context timeout, so not exactly the same.

icholy commented 3 months ago

Perhaps Limiter can have a callback which is invoked when the limit is hit.

mitar commented 3 months ago

@icholy Not sure how callback would solve this issue? Wait function already returns when the limit is hit.

icholy commented 3 months ago

The usecase you described was stats collection. The callback could be used to capture/publish those.

mitar commented 3 months ago

I am interested in exposing the internal state (delay value) of the Limiter, not sure if callback is the way to expose that, but sure.