Closed by mhr3 2 weeks ago
Nice, thanks.
Awesome find!
func getMint64(b []byte) int64 {
	return int64(binary.BigEndian.Uint64(b[1:9]))
}
... is a shorter version with only one check (spits out same code on amd64) - which is also inlined.
No need to change
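For comparison, here is a minimal sketch of the two shapes being discussed. getMint64Shifts is a hypothetical hand-written variant (its name and the `_ = b[8]` hint are illustrative and not taken from msgp); getMint64 is the shorter version from the comment above. Both decode the same big-endian value, but whether they compile to identical code depends on the architecture and Go version.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// getMint64Shifts is a hypothetical hand-written decoder. The "_ = b[8]"
// line is a bounds-check hint: once it succeeds, the compiler can prove
// that indexes 1..8 below are in range.
func getMint64Shifts(b []byte) int64 {
	_ = b[8]
	return int64(b[1])<<56 | int64(b[2])<<48 |
		int64(b[3])<<40 | int64(b[4])<<32 |
		int64(b[5])<<24 | int64(b[6])<<16 |
		int64(b[7])<<8 | int64(b[8])
}

// getMint64 is the shorter version: the slice expression b[1:9] carries
// the range check, and the binary package does the byte assembly.
func getMint64(b []byte) int64 {
	return int64(binary.BigEndian.Uint64(b[1:9]))
}

func main() {
	// 0xd3 is the MessagePack int64 prefix, followed by 8 big-endian bytes.
	b := []byte{0xd3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe}
	fmt.Println(getMint64Shifts(b), getMint64(b)) // prints "-2 -2"
}
```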
If you can get m.R.Next(...) and m.R.Peek(...) to inline the "happy path", that would probably give a similar speedup.
Tried it briefly, but even...
func (r *Reader) Next(n int) ([]byte, error) {
	// Have happy path be short and possible to inline.
	if len(r.data)-r.n >= n && r.state == nil {
		// Return exactly n bytes and advance the read offset.
		out := r.data[r.n : r.n+n]
		r.n += n
		return out, nil
	}
	return r.next(n)
}

// next is a helper function for Next to be called when the buffer does not contain n entries.
func (r *Reader) next(n int) ([]byte, error) {
	...
...is too costly.
Also, m.R.Peek(1) is so common that it could have a specialized call, func (r *Reader) PeekByte() (byte, error), but again I couldn't get it to inline with some basic attempts.
It was slightly faster by having the "happy path" first, but nothing major. Maybe like 5%.
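The fast-path/slow-path split described above can be sketched as follows. The reader type here is a hypothetical, simplified stand-in (it is not the real Reader, and peekByteSlow is an invented helper name). Whether the fast path actually inlines depends on the compiler's inlining budget; `go build -gcflags=-m` reports the decision for each function.

```go
package main

import (
	"fmt"
	"io"
)

// reader is a hypothetical stand-in for a buffered reader. The field
// names only mirror the snippet above.
type reader struct {
	data  []byte // buffered bytes
	n     int    // read offset into data
	state error  // sticky error from the underlying source
}

// PeekByte returns the next byte without advancing. The fast path is one
// comparison and one load; everything rare is pushed into a helper so
// this function stays small.
func (r *reader) PeekByte() (byte, error) {
	if r.n < len(r.data) && r.state == nil {
		return r.data[r.n], nil
	}
	return r.peekByteSlow()
}

// peekByteSlow handles the uncommon cases. A real reader would try to
// refill its buffer here before giving up; this sketch only reports errors.
func (r *reader) peekByteSlow() (byte, error) {
	if r.state != nil {
		return 0, r.state
	}
	return 0, io.ErrUnexpectedEOF
}

func main() {
	r := &reader{data: []byte{0xd3, 0x01}}
	b, err := r.PeekByte()
	fmt.Println(b, err) // prints "211 <nil>"

	r.n = len(r.data) // pretend the buffer has been consumed
	_, err = r.PeekByte()
	fmt.Println(err) // prints "unexpected EOF"
}
```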
> Awesome find!
> func getMint64(b []byte) int64 { return int64(binary.BigEndian.Uint64(b[1:9])) }
> ... is a shorter version with only one check (spits out same code on amd64) - which is also inlined.
> No need to change

I did try this, but in my benchmarks it was a teeny bit slower (on arm64). I did not check the generated assembly, though.
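A comparison like this can be reproduced with a small standalone benchmark. This is only a sketch: the function names are hypothetical, the hand-written variant is an assumed shape rather than msgp's code, and micro-benchmark results are noisy and vary by architecture and Go version.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"testing"
)

// sink keeps results alive so the compiler cannot discard the calls.
var sink int64

// getMint64Shifts is an assumed hand-written variant with a bounds-check hint.
func getMint64Shifts(b []byte) int64 {
	_ = b[8]
	return int64(b[1])<<56 | int64(b[2])<<48 |
		int64(b[3])<<40 | int64(b[4])<<32 |
		int64(b[5])<<24 | int64(b[6])<<16 |
		int64(b[7])<<8 | int64(b[8])
}

// getMint64Binary is the variant built on the binary package.
func getMint64Binary(b []byte) int64 {
	return int64(binary.BigEndian.Uint64(b[1:9]))
}

func benchShifts(b *testing.B) {
	buf := []byte{0xd3, 1, 2, 3, 4, 5, 6, 7, 8}
	for i := 0; i < b.N; i++ {
		sink = getMint64Shifts(buf)
	}
}

func benchBinary(b *testing.B) {
	buf := []byte{0xd3, 1, 2, 3, 4, 5, 6, 7, 8}
	for i := 0; i < b.N; i++ {
		sink = getMint64Binary(buf)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	fmt.Println("shifts:", testing.Benchmark(benchShifts))
	fmt.Println("binary:", testing.Benchmark(benchBinary))
}
```

Each function is called directly inside its own loop, rather than through a function value, so that the compiler remains free to inline it.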
I've noticed msgp's integers.go isn't using compiler tricks to remove bounds checks; after adding those, benchmarks show a pretty nice speedup.

The most significant speedup can be seen in the unix put/get functions: for some reason the compiler wasn't inlining them, but after using functions from the binary pkg it now is. (Doing the same for all the other functions showed a slight slowdown, so I didn't change those.)
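As an illustration of the unix put/get change, here is a minimal sketch built on the binary package. It assumes a 12-byte layout of an 8-byte seconds field followed by a 4-byte nanoseconds field, both big-endian; the signatures are assumptions for illustration and may not match integers.go. Remaining bounds checks can be listed with `go build -gcflags=-d=ssa/check_bce/debug=1`, and inlining decisions with `go build -gcflags=-m`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putUnix writes sec and nsec into the first 12 bytes of b, big-endian.
// Each binary call does a single length check on its slice argument.
func putUnix(b []byte, sec int64, nsec int32) {
	binary.BigEndian.PutUint64(b, uint64(sec))
	binary.BigEndian.PutUint32(b[8:], uint32(nsec))
}

// getUnix reads back the values written by putUnix.
func getUnix(b []byte) (sec int64, nsec int32) {
	sec = int64(binary.BigEndian.Uint64(b))
	nsec = int32(binary.BigEndian.Uint32(b[8:]))
	return sec, nsec
}

func main() {
	var buf [12]byte
	putUnix(buf[:], 1700000000, 123456789)
	sec, nsec := getUnix(buf[:])
	fmt.Println(sec, nsec) // prints "1700000000 123456789"
}
```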