uber-go / goleak

Goroutine leak detector
MIT License
4.46k stars 147 forks source link

goleak finds goroutine leak in not enabled test case #83

Open szuecs opened 1 year ago

szuecs commented 1 year ago

If you have a module with a leak in func leak():

package foo

import (
    "sync"
    "time"
)

type O struct {
    once *sync.Once
    quit chan struct{}
}

func NewO() *O {
    o := &O{
        once: &sync.Once{},
        quit: make(chan struct{}),
    }
    return o
}

func (o *O) leak() {
    go func() {
        d := 100 * time.Millisecond
        for {
            select {
            case <-o.quit:
                return
            case <-time.After(2 * d):
                time.Sleep(d)
            }

        }

    }()
}

func (o *O) close() {
    o.once.Do(func() {
        close(o.quit)
    })
}

func (*O) run(d time.Duration) {
    time.Sleep(d)
}

and tests checking for leaks by running defer goleak.VerifyNone(t), that don't run leak(), it will show a leak in TestC, even if the leak is in TestB without defer goleak.VerifyNone(t):

package foo

import (
    "testing"
    "time"

    "go.uber.org/goleak"
)

func TestA(t *testing.T) {
    defer goleak.VerifyNone(t)

    o := NewO()
    defer o.close()
    o.run(time.Second)
}

func TestB(t *testing.T) {
    o := NewO()
    o.leak()
    o.run(time.Second)
}

func TestC(t *testing.T) {
    defer goleak.VerifyNone(t)

    o := NewO()
    defer o.close()
    o.run(time.Second)
}

It depends on order of execution of tests.

Here a test run:

% go test .
--- FAIL: TestC (1.45s)
    foo_test.go:30: found unexpected goroutines:
        [Goroutine 6 in state select, with foo.(*O).leak.func1 on top of the stack:
        goroutine 6 [select]:
        foo.(*O).leak.func1()
                /tmp/go/goleak/foo.go:25 +0x85
        created by foo.(*O).leak
                /tmp/go/goleak/foo.go:22 +0x56
        ]
FAIL
FAIL    foo     3.466s
FAIL
JacobOaks commented 1 year ago

Hey,

So I've been looking into this a bit, and at its core, this issue is very similar to supporting t.Parallel (#16). We basically want to be able to filter goroutines to only the ones whose ancestry can be linked to the specific test that VerifyNone is being called from.

This is a bit of a non-trivial problem. One idea is to use tracebackancestors, one of the runtime GODEBUG variables, but as pointed out in #70, this doesn't soundly catch all cases because goroutines whose parents are killed won't be able to be traced back to their original test function.

Another idea is to keep some sort of internal ignore list for goroutines to filter by that gets augmented at the end of every test. This is the equivalent of calling IgnoreCurrent after every test function and building up a large ignore list, but doing it internally. But this (a) won't work with parallel and (b) will require making some API call at the end of every test function, so that's not very satisfying either.

To solve this issue, we'd want some sort of way to attach baggage to created goroutines so that we can make them aware of the test function they originated from, but unfortunately this is not something Go supports.

Liuhaai commented 1 year ago

same bug happens to me. The goleak lib needs to be removed in the impacted tests to make CI work

appleezhang commented 10 months ago

same bug happens to me. The goleak lib needs to be removed in the impacted tests to make CI work

abhinav commented 10 months ago

Just echoing what @JacobOaks said: this isn't easily solvable. There was some prior discussion here: https://github.com/uber-go/goleak/issues/16#issuecomment-575020250

Basically, there's no easy way to associate goroutines to the test that spawn it when used with t.Parallel. The recommended workaround is to use the VerifyTestMain variant of detecting leaks.

@JacobOaks medium term, we should probably document this in VerifyNone docs: if the test uses t.Parallel, then there's only so much goleak can do, and you should probably use VerifyTestMain in that case.

Long-term, if the runtime gives us the appropriate information to correlate the goroutine to the test that spawned it, we should definitely try to use that.