Closed superbobry closed 9 years ago
Hi @superbobry,
I'm fine with adding __eq__
and __ne__
methods to Trie and other classes; pull requests are welcome :)
As for the hash, it looks like a consequence of a free list-based allocation indeed; two objects that are in memory at the same time are getting different hashes:
In [7]: from marisa_trie import Trie
In [8]: trie1 = Trie()
In [9]: trie2 = Trie()
In [10]: hash(trie1)
Out[10]: 281853959
In [11]: hash(trie2)
Out[11]: 281853967
I'll try to submit a PR during the weekend.
two objects that are in memory at the same time are getting different hashes
I can add __hash__
as well, if it's ok with you.
Thanks!
I think adding __hash__
is not necessary - what problem adding it solves? Current implementation doesn't e.g. affect dict collisions because hash is the same only for objects which are not alive at the same time.
Good point. Having a trie as a dictionary key doesn't make much sense anyway :)
There's one more edge-case: Trie
implements __getitem__
and doesn't override __iter__
. This (unfortunately) makes it iterable in CPython:
>>> t = Trie()
>>> iter(t)
<iterator object at 0x102939f28>
>>> for key in t:
... pass
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Argument 'key' has incorrect type (expected str, got int)
Is it possible to make the Trie
yield keys in __iter__
just like dict
does?
Ouch. Didn't know about that Python feature. Overriding __iter__
to return keys makes sense.
A trivial example:
There's one more interesting property: different tries seem to hash to the same value:
This might be due to a free list-based allocation, but anyway the behaviour is confusing.