speps / go-hashids

Go (golang) implementation of http://www.hashids.org
MIT License
1.32k stars 109 forks

WIP: Pre-allocate buffers, fix go 1.10 compile issues #44

Closed: cmaloney closed this pull request 6 years ago

cmaloney commented 6 years ago

Memory usage of `go-hashids.test`:

```
go test -memprofile mem.out -test.memprofilerate=1
go tool pprof --alloc_space go-hashids.test mem.out
```

before: 1314.59kB

after: 592.42kB
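For anyone reproducing the measurement, here is a minimal benchmark sketch (not part of this PR; it is written against the library's current v2 public API, which differs slightly from the v1 API of the time) that surfaces the same allocation behavior via `go test -bench BenchmarkEncode -benchmem`:

```go
package hashids_test

import (
	"testing"

	hashids "github.com/speps/go-hashids/v2"
)

// BenchmarkEncode reports per-call allocations for Encode. Assumes the v2
// API where NewWithData returns an error.
func BenchmarkEncode(b *testing.B) {
	hd := hashids.NewData()
	hd.Salt = "benchmark salt" // arbitrary, just for illustration
	h, err := hashids.NewWithData(hd)
	if err != nil {
		b.Fatal(err)
	}
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if _, err := h.Encode([]int{1, 2, 3}); err != nil {
			b.Fatal(err)
		}
	}
}
```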

cmaloney commented 6 years ago

In some internal tests we ran into some issues with this; I'm debugging exactly what and where.

cmaloney commented 6 years ago

So the problem here is that the codebase allocated one hashids encoder globally, then used it from many goroutines, resulting in data races: multiple goroutines simultaneously editing the buffers.
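To make the failure mode concrete, here is a hypothetical repro (not the actual internal code): a single encoder shared across goroutines, which `go run -race` would flag once Encode reuses buffers stored on the HashID value. The v2 import path and error-returning constructor are assumptions here.

```go
package main

import (
	"fmt"
	"sync"

	hashids "github.com/speps/go-hashids/v2"
)

func main() {
	hd := hashids.NewData()
	hd.Salt = "shared salt"
	h, err := hashids.NewWithData(hd) // one global encoder, shared below
	if err != nil {
		panic(err)
	}

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			// With buffers pre-allocated on the HashID value, these
			// concurrent calls would all write to the same scratch space.
			s, _ := h.Encode([]int{n})
			fmt.Println(s)
		}(i)
	}
	wg.Wait()
}
```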

I need some guidance on how you'd like to proceed with upstreaming this. My thoughts on a couple of possible routes:

  1. Re-use the buffers, as this PR does. The API is technically still sound, but it breaks some current usage patterns.
  2. Add a new API which uses the internal buffers, and leave the current APIs alone.
  3. Introduce an "encode/decode context" struct which holds all the storage; users who want the highest efficiency use it to encode/decode directly, otherwise one is constructed for them automatically.

cmaloney commented 6 years ago

Planning to do the third option here: introduce some new structs for holding the buffers. The existing API will allocate those struct objects internally, or callers can invoke encode/decode on the struct objects directly, which is more efficient for people who refactor their apps. Every case should see fewer allocations.
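A self-contained sketch of that shape, using only the standard library (the `workspace` name and the dash-joined output are illustrative stand-ins, not the API from this PR):

```go
package main

import (
	"fmt"
	"strconv"
)

// workspace owns reusable scratch space; a caller that keeps one per
// goroutine pays no per-call allocation once capacity has grown to fit
// (beyond the final string copy).
type workspace struct {
	buf []byte
}

func (w *workspace) encode(numbers []int) string {
	w.buf = w.buf[:0] // keep capacity, discard previous contents
	for i, n := range numbers {
		if i > 0 {
			w.buf = append(w.buf, '-')
		}
		w.buf = strconv.AppendInt(w.buf, int64(n), 10)
	}
	return string(w.buf)
}

// encode mirrors the existing API: it constructs a workspace per call,
// so current callers keep working and stay race-free.
func encode(numbers []int) string {
	var w workspace
	return w.encode(numbers)
}

func main() {
	w := &workspace{}
	fmt.Println(w.encode([]int{1, 2, 3})) // buffer-reusing path: "1-2-3"
	fmt.Println(encode([]int{4, 5}))      // convenience path: "4-5"
}
```

The design point is that the fast path and the compatible path share one implementation; the old API simply pays for a fresh workspace each call.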

speps commented 6 years ago

Third option is fine as it doesn't seem like it would change the API.

cmaloney commented 6 years ago

Added a "HashWorkspace" which holds the pre-allocated buffers, plus a Hasher interface for convenience (both HashID and HashWorkspace implement it, which makes migration easier).
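Roughly, the interface part could look like the following reconstruction (the PR was closed unmerged, so the exact method set is an assumption based on the library's public Encode signature):

```go
package hashsketch

// Hasher is a reconstruction of the interface described above; since the
// PR never landed, the method set here is an assumption.
type Hasher interface {
	Encode(numbers []int) (string, error)
}

// EncodeUserID shows the migration benefit: call sites written against
// the interface accept either a plain HashID or a buffer-owning
// HashWorkspace without changes.
func EncodeUserID(h Hasher, id int) (string, error) {
	return h.Encode([]int{id})
}
```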

cmaloney commented 6 years ago

Rolled this out at the company I work at and got a 3% performance improvement across the board, with very large improvements in cases where we were encoding a lot of ids (30k+).

Will rebase and pick up the new Hex API soon.
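For context, the "Hex API" refers to encoding hex strings (e.g. MongoDB ObjectIds) directly rather than slices of ints. A sketch against the current v2 public API:

```go
package main

import (
	"fmt"

	hashids "github.com/speps/go-hashids/v2"
)

func main() {
	h, err := hashids.New() // default salt/alphabet
	if err != nil {
		panic(err)
	}
	s, err := h.EncodeHex("deadbeef")
	if err != nil {
		panic(err)
	}
	back, err := h.DecodeHex(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(s, back) // round-trips the original hex string
}
```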

dolmen commented 6 years ago

Too many unrelated commits in this pull request.

speps commented 6 years ago

Good job on reducing the allocations! It seems worth it, but I really don't like hard-coded numbers for allocations here and there. I don't know if it's at all possible, but ideally all of the values would be determined by how many integers are being encoded and their range. What I mean is that a more thorough test suite should be used to determine the allocations, instead of just the unit tests, which are quite basic.
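As a sketch of that kind of derivation (illustrative only: it assumes a plain base-b digit encoding with one separator per value, while real hashids output also adds a lottery character and guards, so an exact bound would need a small constant on top):

```go
package main

import (
	"fmt"
	"math/bits"
)

// upperBoundLen sizes an output buffer from how many integers are being
// encoded and how large they are, instead of a hard-coded constant.
// Inputs are non-negative, as with hashids.
func upperBoundLen(numbers []uint64, base int) int {
	// floor(log2(base)) underestimates the bits each digit carries,
	// so dividing by it overestimates the digit count: a safe bound.
	perDigit := bits.Len(uint(base)) - 1
	total := 0
	for _, n := range numbers {
		bitlen := bits.Len64(n)
		if bitlen == 0 {
			bitlen = 1 // zero still occupies one digit
		}
		total += (bitlen + perDigit - 1) / perDigit
	}
	return total + len(numbers) // one separator per value
}

func main() {
	nums := []uint64{42, 100000, 7}
	buf := make([]byte, 0, upperBoundLen(nums, 48))
	fmt.Println(cap(buf)) // capacity derived from the inputs, not a magic number
}
```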

cmaloney commented 6 years ago

I'm no longer working on this at all (I'm working on completely different things), so I won't be following up anytime soon with lots of changes. I'll make a new PR if I ever get back to it; otherwise, others are welcome to make similar changes with the review feedback incorporated.