Open dsisnero opened 8 years ago
Thanks for the link!
I just looked at the page, and the "compact" representation which they show is actually very similar to what JVM Clojure uses for nodes which are less than half full. The idea of separate buckets for key/value pairs and subnodes is similar to what Hamster uses. I need to read the paper for details, though.
Hmm. One issue with using a packed representation for nodes, is that Ruby doesn't have a fast, built-in popcount (count number of 1 bits in a number) operator. JVM Clojure uses popcount to convert logical indices to physical indices in a packed node, and this new CHAMP representation would need the same.
I have created a gem called bit-twiddle
which provides a fast popcount operation for Ruby; we could use it, but it would need to also have a pure-Ruby version for JRuby/RBX.
The pseudocode for deletion in Steindorfer and Vinju's paper is faulty; still, the idea is clear. And it is a good idea.
An alternative to using bit-twiddle
would be for me to code up a C extension for Hamster, which replaces Trie
. I was planning to do this for the immutable-ruby
gem, but it doesn't seem to have much traction. Perhaps people like "Hamster" better?
I didn't know about immutable-ruby. What is the difference between that gem and hamster?
On Wed, Jan 13, 2016 at 5:49 AM, Alex Dowad notifications@github.com wrote:
An alternative to using bit-twiddle would be for me to code up a C extension for Hamster, which replaces Trie. I was planning to do this for the immutable-ruby gem, but it doesn't seem to have much traction. Perhaps people like "Hamster" better?
— Reply to this email directly or view it on GitHub https://github.com/hamstergem/hamster/issues/220#issuecomment-171281221.
What is the difference between that gem and hamster?
It's a fork of Hamster which uses a different namespace. Structures are Immutable::Hash
, Immutable::Set
, and so on. Otherwise everything is the same.
Just saw this and wanted to put a word in for pure Ruby options.
The Ruby RDF core is moving to Hamster
for its in-memory data storage in its 2.0 release (see the release notes), and is strongly invested in remaining pure Ruby.
Obviously, if the advantages promised in the paper materialize, we are :+1: to an optional C extension and/or bit-twiddle
with a Ruby-only option.
@alexdowad, with regard to naming, if you want to move, please put a big note at the top of the readme. Otherwise, people don't know what the difference is. I tried to figure it out on my own, and the only clue was that Hamster had more recent commits, so I figured this was the one to use.
@dreammaker, any discussion of immutable-ruby
can be done from its GH page. This location is for discussion of hamster
.
A new paper on HAMTs by Michael J. Steindorfer and Jurgen J. Vinju shows how to make HAMT implementations “lean”, meaning they consume less memory, and “efficient”, meaning speedier performance. The paper claims to make iteration and equality checks about 80 – 100% faster.
See http://bendyworks.com/leveling-clojures-hash-maps/ which links the paper.
Design the HAMT using this paper.