Closed cpl closed 3 years ago
Oh ya of course it's a good improvement, I was just curious if you had run any benchmarks to see just how much of an improvement it actual is.
No benchmarks, sorry :( .
If there are no benchmarks then how is it an optimisation? Just saying "x is slower than y" without proving it is cargo cultism.
@Asday Fundamentals of Computation? O(1) < O(N) [the worst case complexity of append)
If you know the size of the thing you're allocating ... then allocate it all at once instead of allocating in small increments. Anyhow, free code, take it or leave it.
Without benchmarks, how do you know you're hitting the worst case complexity of append?
Secondarily, without benchmarks, how is a reviewer meant to know whether the work is good or not? You can lord your superiority over them and say "SURELY you know the big O complexity of everything in the language" as much as you like, but it doesn't make it a good change.
What's your aversion to writing benchmarks?
I'd write gs.uf = make([]glhf.Attr, 0, len(gs.uniforms))
instead of gs.uf = nil
to avoid unnecessary allocations.
Insert Thanos 'fine I'll do it myself' ;)
I broke the questioned code into its own function so I could easily write a test for it and to not have to bother with OpenGL stuff, we're just looking at slices today.
func (gs *GLShader) updateUF(old bool) {
if old {
gs.uf = nil
for _, u := range gs.uniforms {
gs.uf = append(gs.uf, glhf.Attr{
Name: u.Name,
Type: u.Type,
})
}
} else {
gs.uf = make([]glhf.Attr, len(gs.uniforms))
for i := range gs.uniforms {
gs.uf[i] = glhf.Attr{
Name: gs.uniforms[i].Name,
Type: gs.uniforms[i].Type,
}
}
}
}
And my test file.
func TestGLShader_updateUF_old(t *testing.T) {
tests := []struct {
name string
gs *GLShader
}{
// {
// name: "zero",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 0),
// },
// },
{
name: "100",
gs: &GLShader{
uniforms: make([]gsUniformAttr, 100),
},
},
// {
// name: "1000",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 1000),
// },
// },
// {
// name: "10000",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 10000),
// },
// },
// {
// name: "100000",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 100000),
// },
// },
// {
// name: "1000000",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 1000000),
// },
// },
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tt.gs.updateUF(true)
})
}
}
func TestGLShader_updateUF_new(t *testing.T) {
tests := []struct {
name string
gs *GLShader
}{
// {
// name: "zero",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 0),
// },
// },
{
name: "100",
gs: &GLShader{
uniforms: make([]gsUniformAttr, 100),
},
},
// {
// name: "1000",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 1000),
// },
// },
// {
// name: "10000",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 10000),
// },
// },
// {
// name: "100000",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 100000),
// },
// },
// {
// name: "1000000",
// gs: &GLShader{
// uniforms: make([]gsUniformAttr, 1000000),
// },
// },
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
tt.gs.updateUF(false)
})
}
}
For 100 uniforms: old: 0.120s new: 0.116s
For 1000000 uniforms old: 0.249s new: 0.147s
Now it's probably likely that even 100 uniforms in practice is a bit much, but there is a measurable performance difference so I'm comfortable merging this change in.
append(x, y) makes a copy of x, a temp slice for y. Then it has to put them together, but if x doesn't have enough capacity, it will have to reassign memory for a new slice. So it does lots more memory allocations (based on the number of times you append, so the length of uniforms) than my example doing 1 allocation. In any case, obvious improvement/s.