google-deepmind / language_modeling_is_compression

Apache License 2.0

how to compress a text with a language model? #3

Closed ArlanCooper closed 1 year ago

ArlanCooper commented 1 year ago

Thanks for your great work! I have a small question: could you provide a demo that shows how to use a language model to compress a text and then decompress it? Thanks.

anianruoss commented 1 year ago

Note that, due to numerical issues, when trying to compress and decompress losslessly, one needs to compute the tokens' pdfs separately for every prefix of the input sequence, rather than once for the whole sequence. This has a time complexity of O(n^2) (whereas computing the pdfs in a single go is O(n)). See compressors/language_model.py for the implementation details.
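To make the idea concrete, here is a minimal, self-contained sketch of the compress/decompress round trip. It is not the code from compressors/language_model.py. It assumes a toy character-level model (`predict_next`, a hypothetical Laplace-smoothed count model standing in for a real language model) and an exact arithmetic coder built on Python's `Fraction` instead of a finite-precision one. The names `ALPHABET`, `predict_next`, `compress`, and `decompress` are all illustrative. The point it shows is that the decoder can only query the model on the prefix it has decoded so far, so the encoder queries the model prefix by prefix as well, which is where the quadratic cost comes from.

```python
import math
from collections import Counter
from fractions import Fraction

# Hypothetical alphabet for this sketch: lowercase letters plus space.
ALPHABET = sorted("abcdefghijklmnopqrstuvwxyz ")


def predict_next(prefix):
    """Toy stand-in for a language model: Laplace-smoothed character counts.

    Returns an exact pdf (as Fractions) over ALPHABET given the prefix.
    """
    counts = Counter(prefix)
    total = len(prefix) + len(ALPHABET)
    return {s: Fraction(counts[s] + 1, total) for s in ALPHABET}


def compress(text):
    """Arithmetic-codes text into a string of '0'/'1' characters."""
    low, width = Fraction(0), Fraction(1)
    for i, symbol in enumerate(text):
        pdf = predict_next(text[:i])  # one model call per prefix
        # Skip past the sub-intervals of all symbols ordered before `symbol`.
        for s in ALPHABET:
            if s == symbol:
                break
            low += width * pdf[s]
        width *= pdf[symbol]
    # Emit the shortest binary fraction that lies inside [low, low + width).
    num_bits = 0
    while True:
        scale = 2**num_bits
        code = math.ceil(low * scale)
        if Fraction(code, scale) < low + width:
            return format(code, "b").zfill(num_bits) if num_bits else ""
        num_bits += 1


def decompress(bits, length):
    """Inverts compress; the text length must be transmitted separately."""
    value = Fraction(int(bits, 2), 2 ** len(bits)) if bits else Fraction(0)
    low, width = Fraction(0), Fraction(1)
    text = ""
    for _ in range(length):
        pdf = predict_next(text)  # the decoder only ever sees a prefix
        for s in ALPHABET:
            span = width * pdf[s]
            if value < low + span:  # value falls in this symbol's interval
                text += s
                width = span
                break
            low += span
    return text


text = "language modeling is compression"
bits = compress(text)
assert decompress(bits, len(text)) == text
print(f"{8 * len(text)} bits raw -> {len(bits)} bits compressed")
```

Because the toy model is exact, calling it once per prefix or once over the whole sequence would give identical pdfs here. With a neural network, the two ways of computing the pdfs can differ slightly in floating point, and even a tiny mismatch between encoder and decoder breaks lossless decoding, hence the slower per-prefix evaluation described above.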

ArlanCooper commented 11 months ago

thank you