Open billyquith opened 4 years ago
Personal opinion: the grow factor should be configurable.
But the default values could be better?
There is no such thing as better default values. It only depends on usage, measurements, and allocation strategies... The value you want to change only shows that the default allocation growth doesn't fit your usage. In the real world you will want to reserve enough to avoid growth.
@billyquith Default values can be better, but you have to take performance as well as memory consumption into account. Your value is better for a simple reason: we have to allocate WAY less often, especially in big programs. However, the memory cost is way higher. You could even set it to 20 and you'd only have to reallocate once for most programs, making it very "efficient". But the memory impact would be severe, especially for an embedded language where there might already be constraints.
Tuning these probably means benchmarking different values against A LOT of code. I've seen a couple of projects where the best growth factor was 1.75, and they had taken into account the realloc algorithm, average map load and optimal memory consumption.
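To make the trade-off above concrete, here is a minimal sketch (not Wren's actual code) that counts how often a geometrically growing table resizes and how large it ends up. The function name `simulate_growth`, the minimum capacity of 16, and the 75% load limit are assumptions chosen for illustration.

```python
def simulate_growth(num_inserts, grow_factor, load_percent=75, min_capacity=16):
    """Simulate inserting num_inserts entries into a table that grows geometrically.

    Returns (resizes, copied_entries, final_capacity). All parameters are
    illustrative assumptions, not values read from any real implementation.
    """
    capacity = 0
    resizes = 0
    copied = 0
    for count in range(num_inserts):
        # Grow when adding one more entry would exceed the load limit.
        if (count + 1) * 100 > capacity * load_percent:
            capacity = max(min_capacity, int(capacity * grow_factor))
            copied += count  # every existing entry is reinserted on resize
            resizes += 1
    return resizes, copied, capacity


if __name__ == "__main__":
    for factor in (1.75, 2, 3, 20):
        resizes, copied, capacity = simulate_growth(1_000_000, factor)
        print(f"factor {factor}: {resizes} resizes, "
              f"{copied} entries copied, final capacity {capacity}")
```

A larger factor means fewer resizes and fewer copied entries, but the final capacity can overshoot what the program needs by a wide margin. This only models counts; it says nothing about allocator behaviour or cache effects, which is why real benchmarking is still needed.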
The cost of resizing (larger and smaller) and the associated copies and insertions of the values is significant. If no one has investigated what the best performance/memory trade-off is, then perhaps the first guess at the values is not the best. There is a comment in the code to indicate this.
3 might be high, but the cost of resizing appears to be quite high in Wren. Other languages, like Lua, seem to have much more efficient table implementations.
map_numeric - wren .......... 1.73s 0.0912 120.43% relative to baseline - growth factor 3
map_numeric - lua .......... 0.22s 0.0057 12.96% - much faster
map_numeric - python .......... 1.72s 0.0743 99.46%
There is no answer, because there is no ideal generic situation to test against. In my program 3 might be a waste of memory. But in yours 20 might not be enough. If you really see/need an improvement in your program, just tune it in your program. Like I said, the best growth is no growth. So searching for the ideal value is futile and a waste of time.
@billyquith Beware of a comparison to Lua here: map_numeric in the standard Lua C implementation is much faster because the Lua "map" (table) is implemented as a hybrid array/map. Since the keys are added sequentially from 1, the implementation only uses/grows the array portion.
Note the actual hash map portion only grows/rehashes when there are no other slots remaining. Lua tables do not have a load factor.
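As a rough illustration of the hybrid idea described above, here is a toy sketch. It is a simplification and not Lua's actual algorithm (Lua decides the array size when it rehashes). The class name `HybridTable` and its methods are hypothetical.

```python
class HybridTable:
    """Toy table with an array part for keys 1..n and a hash part for the rest."""

    def __init__(self):
        self.array = []  # values for integer keys 1..len(self.array)
        self.hash = {}   # every other key

    def set(self, key, value):
        is_int = type(key) is int
        if is_int and 1 <= key <= len(self.array):
            # Key already lives in the array part: plain indexed store.
            self.array[key - 1] = value
        elif is_int and key == len(self.array) + 1:
            # Next sequential key: append, no hashing involved.
            self.array.append(value)
            # Pull in any following keys that were parked in the hash part.
            nxt = key + 1
            while nxt in self.hash:
                self.array.append(self.hash.pop(nxt))
                nxt += 1
        else:
            self.hash[key] = value

    def get(self, key):
        if type(key) is int and 1 <= key <= len(self.array):
            return self.array[key - 1]
        return self.hash.get(key)
```

With keys inserted as 1, 2, 3, ... the hash part is never touched, which is why a benchmark like map_numeric mostly measures array appends in this kind of design rather than hash table performance.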
@benpop, Yes, thanks, I was aware of the dual array/hash table implementation in Lua, and you're right that the map_string benchmark would be a fairer comparison for the hash table. Wren works better with a higher growth factor, Lua has a faster hash table, and Python 3.8 is slightly slower than Wren.
map_string - wren .......... 0.18s 0.0047 113.74% relative to baseline
map_string - lua .......... 0.14s 0.0027 76.37%
map_string - python .......... 0.17s 0.0063 94.44%
However, the map_numeric benchmark gives an idea for how much better performance can be with a different perspective or specific tuning. Perhaps Wren could benefit from dynamic arrays or a dual hashtable implementation.
@benpop
> Note the actual hash map portion only grows/rehashes when there are no other slots remaining. Lua tables do not have a load factor.
That is an interesting observation, but the Lua hash table keeps track of free slots, so its insertions are more efficient. Wren has lots of collisions (duplicate indices) whilst inserting, accessing, and resizing, especially in smaller tables. This is likely why aggressively growing the hash table is effective. This also assumes that the start index (hash modulo capacity) has good distribution.
Lua also does more to try and avoid collisions by moving colliding nodes that are in inappropriate slots to available slots. Wren, on the other hand, has to traverse a large part of the entries when there are a lot of collisions, so its behaviour tends towards O(n) (a linear search), rather than O(1), which a hash table should be (ideally).
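The effect of collisions on lookup cost can be sketched with a small experiment. This assumes an open-addressing table with linear probing and uniformly random hashes; the function name `average_probes` and the fixed seed are illustrative choices, not taken from Wren.

```python
import random


def average_probes(num_keys, capacity, seed=42):
    """Insert num_keys random hashes with linear probing.

    Returns the mean number of slots examined per insert. Assumes hashes are
    uniformly distributed, which is the best case for the start index.
    """
    if num_keys > capacity:
        raise ValueError("table cannot hold more keys than slots")
    rng = random.Random(seed)
    occupied = [False] * capacity
    total_probes = 0
    for _ in range(num_keys):
        index = rng.getrandbits(32) % capacity  # start index: hash modulo capacity
        total_probes += 1
        while occupied[index]:
            # Collision: step to the next slot, wrapping around at the end.
            index = (index + 1) % capacity
            total_probes += 1
        occupied[index] = True
    return total_probes / num_keys


if __name__ == "__main__":
    for load in (0.25, 0.5, 0.75, 0.9):
        n = int(1024 * load)
        print(f"load {load:.2f}: {average_probes(n, 1024):.2f} probes per insert")
```

As the table fills, runs of occupied slots merge and each insert walks further, which is consistent with the drift towards linear-search behaviour described above. Growing the table more aggressively keeps the load lower for longer, at the cost of memory.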
I experimented with some numbers and changing the GROW_FACTOR to 3 gives a 20% speed up. Other numbers seem to be less optimal. I assume higher numbers become a lot more inefficient when the table is very large, due to initialising and copying to a huge table. MAP_LOAD_PERCENT seems to be good at 75%.
https://github.com/billyquith/wren/commit/5b75dc301a39228d48373ef4e760e088325e743c
I ran the full benchmarks twice.
I wondered if anyone else had experimented with this.