go-gl / mathgl

A pure Go 3D math library.

Random data generation may significantly bias current benchmarks #21

Open UserAB1236872 opened 10 years ago

UserAB1236872 commented 10 years ago

I took out the random data from one benchmark function and the speed drastically improved (~40 ns/op faster), despite the random data being behind StopTimer calls.

I put the random data in to keep the compiler from optimizing with constant data and biasing the results (since these are really simple functions). However, I made the following simple change to BenchmarkVec4Sub:

// Package-level operands, generated from a run-time seed so the
// compiler cannot treat them as constants.
var v1, v2 Vec4

func init() {
    // Generate the operands inside init, after seeding; package-level
    // variable initializers would otherwise run before the seed is set.
    rand.Seed(time.Now().UnixNano())
    v1 = Vec4{rand.Float32(), rand.Float32(), rand.Float32(), rand.Float32()}
    v2 = Vec4{rand.Float32(), rand.Float32(), rand.Float32(), rand.Float32()}
}

func BenchmarkVec4Sub(b *testing.B) {
    for i := 0; i < b.N; i++ {
        v1.Sub(v2)
    }
}

The compiler shouldn't be able to do any constant optimization for that, since the values are only known at run time. With this change I got ~12.6 ns/op, which is about as good as a SIMD package I benchmarked (~11.9 ns/op). This is down from the previous benchmarks, which had the form:

func BenchmarkVec4Sub(b *testing.B) {
    b.StopTimer()
    r := rand.New(rand.NewSource(int64(time.Now().Nanosecond())))

    for i := 0; i < b.N; i++ {
        // Fresh operands are generated every iteration with the timer
        // stopped, then the timer is restarted for the single Sub call.
        b.StopTimer()
        v1 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
        v2 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
        b.StartTimer()

        v1.Sub(v2)
    }
}

Those were clocking ~50.9 ns/op, and also took an age to run. Further, introducing random data in the same form as our current benchmarks into the SIMD package's benchmarks catapulted it up to the ~40-50 ns/op range -- implying there's almost certainly no magical Go compiler trickery going on that biases our benchmarks to an unfairly low number.

This is actually very good news. I'm filing this because I don't have time for it in the next little bit, in case someone else wants to do it. I think all of the Matrix, Vector, and Quaternion benchmarks use random data this way now. It's not a horrifically difficult thing to change, just tedious.
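As a rough sketch of what that conversion could look like for one of the matrix benchmarks (the names BenchmarkMat4Mul, m1, and m2 are illustrative; this assumes Mat4 is a flat float32 array with a Mul4 method, with seeding handled as in the Vec4 example above):

var m1, m2 Mat4

func init() {
    // Fill both operands with run-time-generated values so the
    // compiler can't fold anything; assumes the seed is already set.
    for i := range m1 {
        m1[i] = rand.Float32()
        m2[i] = rand.Float32()
    }
}

func BenchmarkMat4Mul(b *testing.B) {
    for i := 0; i < b.N; i++ {
        m1.Mul4(m2)
    }
}

The same package-level pattern should carry over to the vector and quaternion benchmarks with only the types and method calls swapped out.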

On the other hand, I'm not sure why b.StopTimer is being so finicky, but it may just be meant for benchmarks whose per-op times aren't this low (~40 ns/op is nothing compared to something like a graph search).

dmitshur commented 10 years ago
    for i := 0; i < b.N; i++ {
        b.StopTimer()
        v1 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
        v2 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
        b.StartTimer()

        v1.Sub(v2)
    }

Yeah, I'm not surprised that doesn't work as intended. StopTimer() and StartTimer() are likely too coarse-grained for that, and are meant to be used before/after the tight b.N loop, not inside it.
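
For reference, a sketch of that pattern (hypothetical benchmark name, not taken from the package's test files): do all the setup once, call b.ResetTimer(), and keep the measured loop free of timer calls.

func BenchmarkVec4SubResetTimer(b *testing.B) {
    // All setup happens once, outside the measured region.
    r := rand.New(rand.NewSource(time.Now().UnixNano()))
    v1 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}
    v2 := Vec4{r.Float32(), r.Float32(), r.Float32(), r.Float32()}

    // Discard the setup cost, then time only the operation itself.
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        v1.Sub(v2)
    }
}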

Nice find and good news though!