Closed mishak87 closed 7 months ago
hey, thanks!
I was able to repro 5x improvement in encoding.
however, I still don't get it how you speed up decoding with arrays. Can you share code/PR/Gist/repo how you do it?
so, far I don't see good way to switch to arrays based encoding/decoding without: 1) requiring users to specify values contigiously and starting from 0; and 2) adding reading numeric value from var/const declaration AST statement. So looks like this is no go.
Feel free to commander or add your PR like this, or leave comments in PR by 2024-04-09.
I was thinking about how I use enums. And all the time I use them to optimize two things: memory footprint and speed of comparisons.
In my code I needed only uint8 and fast encoding/decoding to database/json and text. So I am thinking about extending stringer into an enumer that adds that functionality and wraps in type Enum struct { v uint8 }
.
Source code for stringer is interesting. Author decided arbitrary cutoff of more than 10 contiguous intervals when it uses map else it is comparing integer and using appropriate slice of string per interval. One string per interval optimization is used to cut down number of compiled strings to one
however, I still don't get it how you speed up decoding with arrays. Can you share code/PR/Gist/repo how you do it?
# pseudo code var A, B, C = enum{}, enum{1}, enum{2} var enumStrings = []string{"a", "b", "c"} var enumAll = []enum{A, B, C} // stringer uses single string and indexes; performance might be slightly slower due to double lookup var enumStringerString = "ABC" var enumStringerIndexes = []int{0, 1, 2, 3}
func (e enum) UnmarshalText(value []byte) error { // TODO optimize len(value) == 0 => set Undefined if enabled // TODO fast path len(value) > longest enum value => ErrBadValue // TODO optimize skip Undefined value and start from 1 // NOTE s := string(value); and for (...) { if allValues[i] == s ... } is slower than type casting on every iteration for (i := 0; i < len(allValues); i++) { // if enumStringerString[enumStringerIndexes[i], enumStringerIndexes[i+1]] == string(value) { if allValues[i] == string(value) { e = enumAll[i] return nil } } return ErrBadValue }
~~I have not tested how contiguous integers would perform against the map but there will be cutoff where hashing the key and getting it will be faster than N range checks and one for loop for that interval.~~
EDIT: There is no benefit of checking multiple ranges and would be better to always generate array of all values and array of all strings.
EDIT: I am not sure if it could be made significantly faster than `for` loop for small number of enum values. Maybe writing tree of if statements or switches checking specific bytes before comparing to a most likely value.
I could not reproduce this. in my benchmarks this solution (loop over array) leads to slightly faster encoding, but slower decoding.
name old time/op new time/op delta
MarshalText_Color-16 10.3ns ± 0% 7.5ns ± 0% ~ (p=1.000 n=1+1)
UnmarshalText_Color-16 11.5ns ± 0% 13.8ns ± 0% ~ (p=1.000 n=1+1)
name old alloc/op new alloc/op delta
MarshalText_Color-16 0.00B 0.00B ~ (all equal)
UnmarshalText_Color-16 0.00B 0.00B ~ (all equal)
name old allocs/op new allocs/op delta
MarshalText_Color-16 0.00 0.00 ~ (all equal)
UnmarshalText_Color-16 0.00 0.00 ~ (all equal)
from my previous experience dealing with encoding/decoding enums, for large enum sets (256 values) map is significantly better than loop. it literally becomes O(1) vs O(N).
let's keep it map for now. but of course feel free to open PR that beats benchmarks. ideally keep in mind that code should be minimal.
Just to clarify my point.
I could not reproduce this. in my benchmarks this solution (loop over array) leads to slightly faster encoding, but slower decoding.
The fastest by a 8-9ns margin for 5 valid values without other optimizations is BenchmarkUnmarshalTextLookupString-32
. Using array []string
not [][]byte
and checking in for (x = 0; i < len(allStrings); i++) { if string(value) == allStrings[i] { return allEnums[i] } }
.
Using for range
compared to for (;;)
is slow because it allocates a variable and copies value into it on every iteration.
Internal map hash function can have few loops to compute hash so for small data it makes sense single array lookup on strings (constant type unlike []byte) is faster.
This is my follow up from r/golang Reddit I did some benchmarking.
Each unmarshal benchmark iteration parses all 5 values (4x 4-6 characters long and one
<empty>
) and one bad value (4 characters long) ie.abcd
,bcdef
,cdefg
,defg
,<empty>
andzabc
. Similar for each marshal benchmark iteration parses all 5 values.Map lookup is significantly slower than loop for 5 items. Converting bytes to string and comparing is faster than comparing bytes using bytes.Equal because bytes.Equal converts both arguments to string anyway.
Array index is 30x faster than map lookup. Map lookup in MarshalText() and String() functions are even 40+% slower than UnmarshalText.