MichaelOstermann closed this issue 8 years ago.
I agree. I removed them in V1 to be conservative and to better support some potential internal optimizations (such as #10, where you need to be able to recover the hash from just a key), but I think having these low-level methods is one of the main appeals of HAMT over alternative implementations.
One other point of interest is that the Immutable library memoizes its hash computations to avoid having to perform them over and over again.
On track for the V1.1 update.
I would be cautious about memoizing things; this would be a sample implementation:
```js
Function.prototype.memoize = function () {
    var self = this, cache = {};
    return function (arg) {
        if (arg in cache) {
            return cache[arg];
        } else {
            return cache[arg] = self(arg);
        }
    };
};
```
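Hypothetical usage in this context, assuming `hamt.hash` takes a single string key:

```js
// Hypothetical: wrap the hash function so every computed
// key hash is cached in one global object.
hamt.hash = hamt.hash.memoize();
```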
The main problem here is that there is one single cache object. One of the advantages of HAMT is that it scales well with big datasets, since the trie is split up into smaller buckets instead of one large array. In comparison, lookups in the cache object get drastically slower the bigger it gets, especially since every call performs two extra lookups here: `arg in cache` and `cache[arg]`.
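For what it's worth, the double lookup itself is avoidable. A minimal sketch, assuming the memoized function never returns `undefined`:

```js
Function.prototype.memoize = function () {
    var self = this;
    var cache = Object.create(null); // no prototype chain to fall through
    return function (arg) {
        var hit = cache[arg];
        // Single property read per hit; recompute only on a miss.
        return hit !== undefined ? hit : (cache[arg] = self(arg));
    };
};
```

This does not fix the scaling concern above, though: the cache is still one flat object.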
Restored in V1.1.
This is mainly performance-oriented. The main reasoning behind it is that through the use of `setHash` and `removeHash` you avoid unnecessary `hamt.hash` recalculations. The idea of `getNode` is the same as `get`, but it returns the actual node instead, so you can access its hash value and use it for subsequent `setHash` or `removeHash` calls (see the sketch at the end of this comment).

In my particular use case, I'm building an abstraction on top of HAMT which converts raw objects to a trie and provides additional pure update methods such as `merge`, `deepMerge`, `filter`, `concat`, etc.

In the case of `deepMerge`, for example, I have to recursively iterate through all of the target node's descendants and either update their values or remove them altogether, so a `deepMerge` call potentially ends up making a lot of `hamt.set` or `hamt.remove` calls, each one recalculating the hash of its key. I've made a small modification to HAMT (v0.1.9) where `hamt.get` returns the actual nodes with their already-calculated hashes, and I'm seeing a visible performance increase from this (in the context of millions of calls).

What do you think?
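To illustrate, a minimal sketch of the intended flow. This is hypothetical: `getNode` is the proposed addition, and the `node.hash`/`node.value` fields and the `setHash(hash, key, value, map)` signature shown here are assumptions:

```js
var hamt = require('hamt');

// Look the node up once and reuse its cached hash for the update,
// instead of letting set() call hamt.hash(key) a second time.
function incrementCounter(map, key) {
    var node = hamt.getNode(key, map); // hypothetical: returns the node, not the value
    if (!node) return hamt.set(key, 1, map);
    return hamt.setHash(node.hash, key, node.value + 1, map);
}
```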