Closed funny-falcon closed 5 years ago
Alternative to #103
I haven't fixed the comments yet, which is why it is marked as [WIP].
Looks like it is ready.
The master branch is now passing.
@funny-falcon: Can you please update yours so we can have a green mark on this branch too?
Not indefinitely: the cost is bounded by O(n), where n is the number of (unique) keys. The constant is usually 2, but it could be a bit larger (4). I don't think it is a real problem: dealing with the returned values is usually more expensive than skipping deleted entries. CPython doesn't "fix the problem" either, as far as I can read and understand https://github.com/python/cpython/blob/master/Objects/dictobject.c
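To make the cost of skipping deleted entries concrete, here is a minimal sketch. The `entry` type and `liveKeys` function are hypothetical names for illustration, not the code in this PR. Iteration visits every slot once, so the work is proportional to the number of slots (live plus deleted), not only to the number of live keys.

```go
package main

import "fmt"

// entry is a hypothetical slot in an insertion-ordered entries array.
// deleted marks a tombstone left behind by a removal.
type entry struct {
	key     string
	value   int
	deleted bool
}

// liveKeys walks the entries in insertion order and skips tombstones.
// It returns the live keys and the number of slots it had to visit.
func liveKeys(entries []entry) (keys []string, visited int) {
	for _, e := range entries {
		visited++
		if e.deleted {
			continue // skipping a tombstone costs one comparison
		}
		keys = append(keys, e.key)
	}
	return keys, visited
}

func main() {
	entries := []entry{
		{key: "a", value: 1, deleted: true},
		{key: "b", value: 2},
		{key: "c", value: 3, deleted: true},
		{key: "d", value: 4},
	}
	keys, visited := liveKeys(entries)
	fmt.Println(keys, visited) // prints: [b d] 4
}
```

Here two live keys cost four slot visits, which matches the "constant is usually 2" case described above.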
Ruby's hash also has a tracker for the first entry, because Ruby's hash has a `shift` method that returns and removes the first element (while Python's `dict.popitem()` returns the last element). This way Ruby's hash can already be used as an LRU structure.
`shift` could easily be emulated for Python's dict, so if desired, the first element could also be tracked. But it doesn't noticeably speed up iteration in general; it just speeds up fetching the first element in some patterns, like LRU.
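A rough sketch of what tracking the first entry buys for a `shift`-like operation. The `orderedDict` type, its `first` field and the `shift` method are assumptions made for illustration; the hash index is left out entirely.

```go
package main

import "fmt"

type entry struct {
	key     string
	value   int
	deleted bool
}

// orderedDict is a hypothetical, simplified stand-in: only the ordered
// entries array and a "first" cursor are modelled, not the hash index.
type orderedDict struct {
	entries []entry
	first   int // index of the first slot that may still be live
}

// shift removes and returns the oldest live entry, like Ruby's Hash#shift.
// The first cursor only moves forward, so repeated calls never rescan
// the tombstones at the front.
func (d *orderedDict) shift() (string, int, bool) {
	for d.first < len(d.entries) {
		e := &d.entries[d.first]
		d.first++
		if !e.deleted {
			e.deleted = true
			return e.key, e.value, true
		}
	}
	return "", 0, false
}

func main() {
	d := &orderedDict{entries: []entry{
		{key: "a", value: 1, deleted: true},
		{key: "b", value: 2},
		{key: "c", value: 3},
	}}
	k, v, ok := d.shift()
	fmt.Println(k, v, ok) // prints: b 2 true
}
```

Without the cursor, every `shift` in an LRU-style pattern would have to skip all tombstones at the front again; with it, each slot is passed at most once between rebuilds.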
tests passed.
Note that iteration over an unordered dictionary still suffers from skipping empty and removed elements, so there is not much difference. Certainly, an unordered dictionary needs less frequent rebuilds than an ordered one in this pattern (delete-insert-delete-insert), but an occasional rebuild is still needed if the key set is not stable (due to tombstones).
There could be a heuristic on delete: `if fill>capa/8*5 and used<capa/8 { growTable }`. I think such a heuristic will fix most pathological cases.
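The heuristic can be written as a small predicate. This is only a sketch: `shouldRebuild` is a hypothetical name, and it assumes `fill` counts live plus deleted slots, `used` counts live entries, and `capa` is the table capacity.

```go
package main

import "fmt"

// shouldRebuild reports whether a delete should trigger a table rebuild:
// more than 5/8 of the capacity is occupied (live or deleted), while
// fewer than 1/8 of the capacity is actually live.
// Integer division is kept exactly as in the proposed condition.
func shouldRebuild(fill, used, capa int) bool {
	return fill > capa/8*5 && used < capa/8
}

func main() {
	fmt.Println(shouldRebuild(41, 7, 64)) // prints: true
	fmt.Println(shouldRebuild(41, 8, 64)) // prints: false
}
```

With a capacity of 64, a rebuild is triggered only when more than 40 slots are occupied and fewer than 8 of them are live, that is, when the table is mostly tombstones.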
Is iteration important? I have a patch that improves the start of iteration at the expense of a 2-4% slower putItem.
I'd prefer a faster putItem, which is why I didn't push it.
If there are no complaints, let's merge it on Monday, ok?
Implements an order-preserving dictionary à la Python 3.7.
Changes `[]*dictEntry` to `[]dictEntry`, stores entries in order, and uses an index hash table for hash lookup. Memory ordering now relies on:
- `table` is stored last after a resize;
- `index` is written after the new entry is written, and `fill` is incremented after that.
While DictGetItem and DictGetItemBig show improvement, I think there is no real difference, because it is `benchcmp -best` of 3 runs, and the results vary a lot from run to run. Iteration is much faster because dictTable is not allocated for an empty dict, and the StopIteration exception is created with the dict. Even without this optimization it is still a bit faster than master, but not by that much.
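The layout from the description (entries stored by value in insertion order, plus an index hash table used only for lookup) can be sketched roughly as follows. This is not the PR's code: the field names, the FNV-1a hash, linear probing and the fixed capacity are all assumptions, and the sketch is single-threaded, so it does not model the memory-ordering guarantees or resizing.

```go
package main

import "fmt"

// dictEntry is stored by value in one slice ([]dictEntry, not []*dictEntry).
type dictEntry struct {
	hash    uint32
	key     string
	value   int
	deleted bool
}

type dict struct {
	entries []dictEntry // insertion order
	index   []int       // slot -> position in entries, -1 if empty
}

func newDict(capa int) *dict { // capa must be a power of two
	d := &dict{index: make([]int, capa)}
	for i := range d.index {
		d.index[i] = -1
	}
	return d
}

// hashKey is FNV-1a, used only to keep the sketch self-contained.
func hashKey(s string) uint32 {
	h := uint32(2166136261)
	for i := 0; i < len(s); i++ {
		h ^= uint32(s[i])
		h *= 16777619
	}
	return h
}

// find probes linearly. It returns the position of the live entry for key,
// or -1 together with the first empty index slot.
func (d *dict) find(key string, h uint32) (pos, slot int) {
	mask := len(d.index) - 1
	for slot = int(h) & mask; ; slot = (slot + 1) & mask {
		pos = d.index[slot]
		if pos == -1 {
			return -1, slot
		}
		e := &d.entries[pos]
		if !e.deleted && e.hash == h && e.key == key {
			return pos, slot
		}
	}
}

func (d *dict) set(key string, value int) {
	h := hashKey(key)
	pos, slot := d.find(key, h)
	if pos >= 0 {
		d.entries[pos].value = value
		return
	}
	if len(d.entries) >= len(d.index)-1 {
		panic("sketch: index full, resize omitted")
	}
	// Append the entry first, then record its position in the index.
	d.entries = append(d.entries, dictEntry{hash: h, key: key, value: value})
	d.index[slot] = len(d.entries) - 1
}

func (d *dict) get(key string) (int, bool) {
	pos, _ := d.find(key, hashKey(key))
	if pos < 0 {
		return 0, false
	}
	return d.entries[pos].value, true
}

func (d *dict) del(key string) {
	if pos, _ := d.find(key, hashKey(key)); pos >= 0 {
		d.entries[pos].deleted = true // tombstone; the index slot stays occupied
	}
}

// keys iterates the entries slice directly, so order is insertion order.
func (d *dict) keys() []string {
	var out []string
	for _, e := range d.entries {
		if !e.deleted {
			out = append(out, e.key)
		}
	}
	return out
}

func main() {
	d := newDict(8)
	d.set("x", 1)
	d.set("y", 2)
	d.set("z", 3)
	d.del("y")
	d.set("y", 4)         // a re-inserted key goes to the end
	fmt.Println(d.keys()) // prints: [x z y]
}
```

Iteration never touches the index, which is why it only has to skip tombstones in `entries`; the index exists purely to turn a hash into a position.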