Open winterland1989 opened 8 years ago
As documented, I consider this package to be deprecated (and not to be developed further).
The problem is that it uses the same module names as unordered-containers
. If you are interested in using the algorithms from hashmap
package, probably the best way would be to create a new package with new module names (not to clash with unordered-containers
).
But maybe I see it wrongly -- please do not hesitate to discuss your reasons here.
I would say keep same name and same API is actually a nice thing, user can easily switch package and do benchmark without changing code, even if someone really need use both, ghc aupport package qualification. What do you think?
Currently I am inclining on revamping the hashmap
(well, letting @winterland1989 do it as I do not have time nor motivation to do it myself).
@tibbe Do you have any thoughts on this? It seems people are using hashmap
, so it would make sense to work directly on hashmap
; but I am a bit worried about the same module names and possible confusions with unordered-containers
(but as people are using hashmap
, all the problems did not stop them using it).
OK, I'll try create a new package and do more benchmark to see if it worth sharing.
One more question, why use Some
not Map
directly as leaf? Is there some speed advantage?
If people want the different speed tradeoff of hashmap
I don't mind if the names collide (it might actually be helpful so they can switch back and forth).
@winterland1989 Ok, so if you are willing to do it, please go ahead. However, I have some remarks:
please try disrupting the existing API as little as possible (ideally not changing it at all).
Notably, please follow the containers
way of Strict/Lazy
split (using the same datatype, etc.), and then please export the Data.HashMap.Lazy
in Data.HashMap
(so that people can still use the Data.HashMap
; and if there are some functions in Data.HashMap
which are not in Data.HashMap.Lazy
, keep then in Data.HashMap
[this is done in Data.Map
, but I did not see such a function at a first glance])
the Some
datatype is there to save memory and increase speed. If there are no collisions, you only allocate one Only
node; if Map
is used, the Bin
is larger by three elements (Size, left Tip, right Tip) and you have to pattern patch the left and right child to see it is a Tip.
If you want, try benchmarking hashmap
without it, but my guess is that it will be better with Some
node.
An alternative would be to add the third node One
to Data.Map/Data.Set
representing a set of one element (I have benchmarks showing that it would increase performance [quite a lot for fold
, for example, where only circa half the nodes have to be traversed as the Tip
s on the leaves do not have to be traversed] and decrease memory complexity, but I never had time to implement it) -- if this happened, the Some
datatype could go away
OK, thanks for your advice! I'll come back to you when it's ready.
Apologies if this is the wrong place to ask this question: What were the reasons behind the deprecation in favor of unordered-containers
?
@sjakobi The unordered-containers
seemed to be a more complete and more managed alternative, with quite similar performance characteristics (compared to other containers), so I decided to let it have the Data.Hash(Map|Set)
name.
From
unordered-containers
's benchmark and other places, it seems hashmap is faster at inserting but slower at reading, maybe more suitable under some situations?I'd like to revmap this package with more benchmark and add
Strict/Lazy
modules, does that make sense? @foxik