The existing findField operation was called for every element in the
schema and iterated over every field in the struct each time. It
effectively performed m*n operations, where m (schema fields) and n
(struct fields) are typically close in number; in other words, O(n^2).
Optimize findField by caching the field information the first time we
see a new reflect.Type. Not only does this make the operation run in
O(n) time, there are also fewer string comparisons now.
The results were as follows (note this is benchmarking an entire SpecificDatumReader.Read, not the findField operation itself, just to show how much findField was overwhelming the decoder time):
Something to note: even though hugeval is a much bigger struct than complex, the newFindField variant decoded hugeval in roughly the same time as complex, because the time is now linear in the size of the schema and largely independent of the size of the target struct.
Also, I was able to confirm in another test, where I added fields one at a time, that the growth of the original function is indeed quadratic. Even with small structs, this change results in around a 4x speedup.
This is actually only step 1 of the optimizations I would like to do; storing the interpretation of a struct paves the way for:
- Optimizing the SpecificDatumReader to use a specific execution plan based on the schema and struct type.
- Setting up a similar concept for the SpecificDatumWriter.
Hi @crast, thanks for submitting this PR! Let me know if you need to discuss some further optimizations or need some help with them please. I'll be willing to merge PRs like that. Thanks!
I made a benchmark branch where I left both functions side by side and swapped them out for each other, so that I could test various permutations and measure the growth rate.