reific / braid

Library for Transparent Compression of Java Strings
GNU General Public License v3.0
1 stars 1 forks source link

Investigate "PPP Predictor Compression Protocol" as a Possible KnotStorage #2

Open jdscriven opened 9 years ago

jdscriven commented 9 years ago

This algorithm uses a 256x256 (or 512x256 or 768x256 or 1024x256) byte grid that is precomputed based on a corpus (such as english texts) to predict the next byte given the previous few bytes.

http://stackoverflow.com/a/1138856/651794 http://en.wikibooks.org/wiki/Data_Compression/Markov_models

jdscriven commented 9 years ago

There is theoretical max compression to 1/8 of original size, since compression is done byte-by-byte. That is, in the best case scenario, every byte will compress to a single bit. What if the next two or three bytes were predicted instead?