Open pstutz opened 9 years ago
We should also look at Re-Pair from the paper above. If we're willing to sacrifice a bit of speed, it's in a very good spot on the speed/memory tradeoff curve.
Re-Pair reminds me of Sequitur (https://en.wikipedia.org/wiki/Sequitur_algorithm), which is very elegant.
This paper suggests that it works very well for URIs/URLs: http://www.dcc.uchile.cl/~gnavarro/ps/sea11.1.pdf
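
For anyone not familiar with it, here is a rough sketch of the core Re-Pair idea: repeatedly replace the most frequent adjacent pair of symbols with a fresh symbol until no pair occurs twice. This is a naive quadratic version for illustration only, not the linear-time algorithm from the literature and not what the paper's dictionary structure does end to end. The names `repair_compress` and `repair_expand` and the `("R", n)` rule symbols are made up for this example.

```python
from collections import Counter


def repair_compress(text):
    """Naive Re-Pair sketch: returns (compressed sequence, grammar rules).

    Assumes the input is a string, so terminals are single characters and
    cannot collide with the ("R", n) tuples used for new symbols.
    """
    seq = list(text)
    rules = {}  # hypothetical rule table: new symbol -> (left, right)
    next_id = 0
    while True:
        # Count adjacent pairs. Overlapping occurrences (e.g. "aaa") are
        # counted naively, which can cost compression but not correctness.
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        symbol = ("R", next_id)
        next_id += 1
        rules[symbol] = pair
        # Rewrite the sequence left to right, replacing each occurrence.
        out = []
        i = 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def repair_expand(seq, rules):
    """Expand a compressed sequence back into the original string."""
    out = []
    stack = list(reversed(seq))
    while stack:
        s = stack.pop()
        if s in rules:
            left, right = rules[s]
            # Push right first so that left is expanded first.
            stack.append(right)
            stack.append(left)
        else:
            out.append(s)
    return "".join(out)
```

With URL-like input such as `"http://a.com/x http://a.com/y"`, the shared prefix ends up behind a handful of rules, which is presumably why it does well on URIs. A real implementation would also need a compact encoding of the rule table, which this sketch ignores.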