Closed crowell closed 7 years ago
Usually insert will never fail by definition unless the hashtable is full or cant allocate memory. The add operation, is the one that will fail in thatcase. This is how the sdb api and many other hashtable implementations work too. I think its a known thing and i try to check if the value exists before setting it.
But i agree that this behavious must be different to avoid confussions or common problems
On 18 Jun 2016, at 01:15, Jeffrey Crowell notifications@github.com wrote:
https://github.com/radare/radare2-regressions/blob/master/unit/test_hashtable.c#L59
should inserts fail with a collision? how should they be handled, because the key is just hash, not an actual object, handling collisions in this way doesn't seem possible?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.
well, the fact of the matter is that djb2 isn't a fantastic hashing function for strings.
printf ("%p\n",r_str_hash ("cb"));
printf ("%p\n",r_str_hash ("bC"));
0x540d64
0x540d64
provides a collision, it's super easy to find collisions. colliding keys will break sdb, and the more we rely on sdb, the worse it is.
Perhaps, key, hash, value should all be stored? This will take up more space but will allow for actual collision resolution.
nice catch. but %p is the wrong format string to use here.
this algorithm comes from java hash string algorithm, we can switch to another one that have other collisions. but we will always find collisions
On 19 Jun 2016, at 17:08, Jeffrey Crowell notifications@github.com wrote:
well, the fact of the matter is that djb2 isn't a fantastic hashing function for strings.
printf("%p\n",r_str_hash ("cb")); printf("%p\n",r_str_hash ("bC")); provides a collision, it's super easy to find collisions. colliding keys will break sdb, and the more we rely on sdb, the worse it is.
Perhaps, key, hash, value should all be stored? This will take up more space but will allow for actual collision resolution.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/radare/radare2/issues/5168#issuecomment-227002237, or mute the thread https://github.com/notifications/unsubscribe/AA3-loZ9R3tXDg-fIgnXtMylObNDX-8zks5qNVuBgaJpZM4I4zSO.
will be nice to have this fixed before 0.10.4. thanks for the effort :)
Can be closed ?
afaik @radare is working on this in the sdb branch, cause the collision is not longer an issue but others have arised
Nope dont close it
On 3 Nov 2016, at 00:03, Álvaro Felipe Melchor notifications@github.com wrote:
afaik @radare is working on this in the sdb branch, cause the collision is not longer an issue but others have arised
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
im afraid of merging this before having the update-sdb branch merged. so moving to 1.1
Now that the sdb branch is merged this issue should be "gone" as soon as we replace the raw usage of util/ht.c to use the new one. but this may introduce new regressions.
util/ht is still used in some places, we should use sdb's one
we have already issues for this
https://github.com/radare/radare2-regressions/blob/master/unit/test_hashtable.c#L59
should inserts fail with a collision? how should they be handled, because the key is just hash, not an actual object, handling collisions in this way doesn't seem possible?