jaswdr / faker

:rocket: Ultimate fake data generator for Go with zero dependencies
https://pkg.go.dev/github.com/jaswdr/faker
MIT License
566 stars 59 forks source link

[BUG] v2.1.0 IntBetween still panic()'s... #169

Open sean- opened 8 months ago

sean- commented 8 months ago

Describe the bug This looks like a continuation of #164

To Reproduce This may still need to be entirely fixed. I can reproduce this about once every 50-100 runs or so. Using https://github.com/sean-/bench-go-histograms as the reproducer:

$ go test -test.bench=PromDuration -benchmem -count 200
goos: darwin
goarch: arm64
pkg: github.com/sean-/bench-go-histograms
Benchmark_PromDuration-10        5184980           229.5 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-10        5846218           217.5 ns/op         0 B/op          0 allocs/op
[snip]
Benchmark_PromDuration-10        5678696           234.6 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-10        5885458           226.5 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-10        5580679           225.2 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-10       panic: runtime error: index out of range [-1]

goroutine 635 [running]:
math/rand.(*rngSource).Uint64(...)
    /opt/homebrew/Cellar/go/1.22.0/libexec/src/math/rand/rng.go:249
math/rand.(*rngSource).Int63(0x9?)
    /opt/homebrew/Cellar/go/1.22.0/libexec/src/math/rand/rng.go:234 +0x8c
math/rand.(*Rand).Int63(...)
    /opt/homebrew/Cellar/go/1.22.0/libexec/src/math/rand/rand.go:96
math/rand.(*Rand).Int63n(0x140000a8c30, 0x2540be400)
    /opt/homebrew/Cellar/go/1.22.0/libexec/src/math/rand/rand.go:128 +0x68
math/rand.(*Rand).Intn(0xb?, 0x14000130000?)
    /opt/homebrew/Cellar/go/1.22.0/libexec/src/math/rand/rand.go:185 +0x44
github.com/jaswdr/faker/v2.Faker.IntBetween({{0x102946538?, 0x140000a8c30?}}, 0x1?, 0x2540be400?)
    /Users/seanc/go/pkg/mod/github.com/jaswdr/faker/v2@v2.1.0/faker.go:182 +0xb0
github.com/sean-/bench-go-histograms.randDuration(...)
    /Users/seanc/go/src/github.com/sean-/bench-go-histograms/bench_test.go:22
github.com/sean-/bench-go-histograms.Benchmark_PromDuration.func1(0x1400007e000)
    /Users/seanc/go/src/github.com/sean-/bench-go-histograms/bench_test.go:151 +0x64
testing.(*B).RunParallel.func1()
    /opt/homebrew/Cellar/go/1.22.0/libexec/src/testing/benchmark.go:797 +0xbc
created by testing.(*B).RunParallel in goroutine 610
    /opt/homebrew/Cellar/go/1.22.0/libexec/src/testing/benchmark.go:790 +0x104
exit status 2
FAIL    github.com/sean-/bench-go-histograms    15.354s

JFYI. The rate of this happening is now quite small, but it's not zero, either. I don't know if you want to open a new issue, or

jaswdr commented 8 months ago

Ack, I'm working to fix it.

jaswdr commented 8 months ago

@sean- can you confirm which version of Go are you using? I'm not able to reproduce it with 1.22.0.

mirobertod commented 5 months ago

@jaswdr I'm able to reproduce the error with go version go1.22.4 darwin/arm64

➜  bench-go-histograms git:(main) go test -test.bench=PromDuration -benchmem -count 200
goos: darwin
goarch: arm64
pkg: github.com/sean-/bench-go-histograms
Benchmark_PromDuration-8     3353565           334.2 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-8     3422002           354.4 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-8     3417480           338.4 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-8     3419114           342.6 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-8     3435717           304.7 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-8     4107938           303.9 ns/op         0 B/op          0 allocs/op
Benchmark_PromDuration-8    panic: runtime error: index out of range [-1]

goroutine 347 [running]:
math/rand.(*rngSource).Uint64(...)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rng.go:249
math/rand.(*rngSource).Int63(0x9?)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rng.go:234 +0x8c
math/rand.(*Rand).Int63(...)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rand.go:96
math/rand.(*Rand).Int63n(0x14000132ba0, 0x2540be400)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rand.go:128 +0x68
math/rand.(*Rand).Intn(0xb?, 0x140000fc000?)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rand.go:185 +0x44
github.com/jaswdr/faker/v2.Faker.IntBetween({{0x102c72538?, 0x14000132ba0?}}, 0x1?, 0x2540be400?)
    /Users/mirobertod/go/pkg/mod/github.com/jaswdr/faker/v2@v2.1.0/faker.go:182 +0xb0
github.com/sean-/bench-go-histograms.randDuration(...)
    /tmp/bench-go-histograms/bench_test.go:22
github.com/sean-/bench-go-histograms.Benchmark_PromDuration.func1(0x14000200000)
    /tmp/bench-go-histograms/bench_test.go:151 +0x64
testing.(*B).RunParallel.func1()
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/testing/benchmark.go:797 +0xbc
created by testing.(*B).RunParallel in goroutine 324
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/testing/benchmark.go:790 +0x104
exit status 2
FAIL    github.com/sean-/bench-go-histograms    9.958s

I'm also able to reproduce the error with the latest version of faker v2.3.0

goroutine 6142 [running]:
math/rand.(*rngSource).Uint64(...)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rng.go:249
math/rand.(*rngSource).Int63(0x9?)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rng.go:234 +0x8c
math/rand.(*Rand).Int63(...)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rand.go:96
math/rand.(*Rand).Int63n(0x14000122810, 0x2540be400)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rand.go:128 +0x68
math/rand.(*Rand).Intn(0xb?, 0x140000b6000?)
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/math/rand/rand.go:185 +0x44
github.com/jaswdr/faker/v2.Faker.IntBetween({{0x10085cf78?, 0x14000122810?}}, 0x1?, 0x2540be400?)
    /Users/mirobertod/go/pkg/mod/github.com/jaswdr/faker/v2@v2.3.0/faker.go:182 +0xb0
github.com/sean-/bench-go-histograms.randDuration(...)
    /tmp/bench-go-histograms/bench_test.go:22
github.com/sean-/bench-go-histograms.Benchmark_PromDuration.func1(0x1400020e360)
    /tmp/bench-go-histograms/bench_test.go:151 +0x64
testing.(*B).RunParallel.func1()
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/testing/benchmark.go:797 +0xbc
created by testing.(*B).RunParallel in goroutine 6155
    /opt/homebrew/Cellar/go/1.22.4/libexec/src/testing/benchmark.go:790 +0x104
exit status 2

EDIT: here's a smaller func to get the same result

func main() {
    f := faker.New()
    for i := 0; i < 2; i++ {
        go func() {
            for {
                f.IntBetween(0, 100)
            }
        }()
    }
    time.Sleep(time.Minute)
}
twocs commented 1 month ago

I was Hacktoberfest but am not seeing how to resolve the issue reported, as there is a dilemma in this bug report. The goal of a seeded random generator is for reproducibility of results. But if you have concurrent processes, how would you ensure that the exact sequence of "random" numbers would repeat every time? Over a decade ago, the rand package got code comments because of the exact problem that seeded random number generators aren't safe for concurrent use. https://github.com/golang/go/issues/3611

A Source represents a source of uniformly-distributed pseudo-random int64 values in the range [0, 1<<63). A Source is not safe for concurrent use by multiple goroutines. https://github.com/golang/go/blob/release-branch.go1.23/src/math/rand/rand.go#L11

Faker is using rand.NewSource(time.Now().Unix()). This is clearly indicated in the rand package as not safe for concurrent use. If you don't mind losing the ability to have a seed, you could simply change f.IntBetween to use rand.Intn instead of f.Generator.Intn, which will use the global rand which is threadsafe. But this will mean you cannot use a seed. In other words, is it really up to faker to provide a concurrent-safe random generator? The end user should be sure to use one new faker.New() per instance and it works without problem.

func main() {
    for i := 0; i < 2; i++ {
                 f := faker.New()
        go func() {
            for {
                f.IntBetween(0, 100)
            }
        }()
    }
    time.Sleep(time.Minute)
}

Another formulation of the benchmark, note that f := New() is inside the parallel run, not global.

func BenchmarkIntBetweenThreadsafeParallel(b *testing.B) {
    b.RunParallel(func(pb *testing.PB) {
        f := New()
        for pb.Next() {
            f.IntBetween(1, 100)
        }
    })
}

Assuming that the solution is to have one faker per concurrent thread, I might then suggest updating the seed to ensure that multiple faker.New created around the same time don't happen to have the same seed:

// New returns a new instance of Faker instance with a random seed
func New() (f Faker) {
-       seed := rand.NewSource(time.Now().Unix())
+   seed := rand.NewSource(rand.Int63())
    f = NewWithSeed(seed)
    return
}

So finally, if you don't need seeded random numbers, you might simply implement your own threadsafe generator that can be used by multiple goroutines.

// ThreadsafeGenerator is safe for concurrent use by multiple goroutines.
type ThreadsafeGenerator struct{}

// Intn returns, as an int, a non-negative pseudo-random number in [0,n).
// It panics if n <= 0.
func (m ThreadsafeGenerator) Intn(n int) int {
    return rand.Intn(n)
}

// Int returns a non-negative pseudo-random int.
// It is safe for concurrent use by multiple goroutines.
func (m ThreadsafeGenerator) Int() int {
    return rand.Int()
}

// Now that we use a threadsafe generator, we can use `f.IntBetween(1, 100)` concurrently.
func BenchmarkIntBetweenThreadsafeParallel(b *testing.B) {
    f := New()
    f.Generator = ThreadsafeGenerator{}

    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            f.IntBetween(1, 100)
        }
    })
}