uzh / triplerush

A distributed in-memory graph store.
Apache License 2.0
31 stars 11 forks source link

Implement Hu-Tucker Front-Coding dictionary #22

Open pstutz opened 9 years ago

pstutz commented 9 years ago

This paper suggests that it works very well for URIs/URLs: http://www.dcc.uchile.cl/~gnavarro/ps/sea11.1.pdf

pstutz commented 9 years ago

Created a placeholder at https://github.com/uzh/triplerush/blob/da0c296fa8eac8625572842f55c46bc4f69428c7/src/main/scala/com/signalcollect/triplerush/dictionary/FrontCodingDictionary.scala

pstutz commented 9 years ago

We should also look at Re-Pair from the paper above. If we're willing to sacrifice a bit of speed it's in a very good spot on the speed/memory tradeoff curve.

pstutz commented 9 years ago

Re-Pair reminds me of Sequitur (https://en.wikipedia.org/wiki/Sequitur_algorithm), which is very elegant.