Closed wll8 closed 1 year ago
Hello. I am not sure what your question is or what you are trying to achieve. Could you explain a bit more?
It seems this is a function that converts a string so that the result contains only the specified characters.
Example: Pseudocode
lz.compress(`hello`, {
map: `0123456789ABCDEF`,
});
The output of the above code is 68656C6C6F; it contains no characters outside the map.
Can lz-string provide a way to alleviate the size increase by using the lz algorithm during the conversion process?
Here are some more examples:
pseudocode: looks like morse code
lz.compress(`hello`, {
map: `. -`,
});
// .... . .-.. .-.. ---
pseudocode: looks like Base16
lz.compress(`hello`, {
map: `0123456789ABCDEF`,
});
// 68656C6C6F
pseudocode: Array
lz.compress(`hello`, {
map: ["公正", "爱国", "平等", "诚信", "文明"],
});
// 公正爱国公正平等公正诚信文明公正诚信文明公正诚信平等
pseudocode: looks like emoji
lz.compress(`hello`, {
map: `👨👩👧`,
});
// 👨👨👧👨👨👧👨👧👩👨👧👨
Can lz-string provide a way to alleviate the size increase by using the lz algorithm during the conversion process?
This is exactly what LZ-String does. It uses the LZW algorithm to compress the input and uses the symbols in the map to store the bits the algorithm produces. So in effect, the output of lz.compress('hello', {map: '0123456789ABCDEF'}); does not merely look like base16: it is a base16 representation of the compressed stream.
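To make "using the symbols to store the bits" concrete, here is a minimal sketch of packing a bit stream into characters from a fixed alphabet. The helper name packBits is hypothetical and this is not lz-string's exact bit layout; it only illustrates that an alphabet of 2^n symbols carries n bits per output character.

```javascript
// Hypothetical helper, not part of lz-string: stores a bit array using only
// the symbols of `alphabet`.
function packBits(bits, alphabet) {
  const symbols = Array.from(alphabet);
  // Largest n such that 2^n <= number of symbols.
  let bitsPerChar = 1;
  while ((1 << (bitsPerChar + 1)) <= symbols.length) bitsPerChar++;

  let out = "";
  for (let i = 0; i < bits.length; i += bitsPerChar) {
    let value = 0;
    for (let j = 0; j < bitsPerChar; j++) {
      // Missing bits in the final chunk are padded with zeros.
      value = (value << 1) | (bits[i + j] || 0);
    }
    out += symbols[value];
  }
  return out;
}

console.log(packBits([0, 1, 1, 0, 1, 0, 0, 0], "0123456789ABCDEF")); // "68"
console.log(packBits([1, 1, 0], "12AB")); // "B1"
```

The smaller the alphabet, the fewer bits each character carries, so the same compressed stream needs more output characters.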
You have provided the following functions, but none of them meet the requirements, because in my use case only specified characters are allowed to appear in the compression result.
Transmission of characters other than 12AB is not allowed in my use case.
Suppose I need to compress BBC. Without lz-string, I might do this:
var userDict = `12AB`
var split = userDict.slice(0, 1)
var sysDict = userDict.slice(1)
var table = {
A: sysDict.repeat(1), // 2AB
B: sysDict.repeat(2), // 2AB2AB
C: sysDict.repeat(3), // 2AB2AB2AB
}
var table2 = Object.entries(table).reduce((acc, [key, val]) => (acc[val]= key, acc), {})
// the string to be compressed
var str = `BBC`
// ['2AB2AB', '2AB2AB', '2AB2AB2AB'].join('1')
var compressed = str.split(``).map(str => table[str]).join(split)
// compressed = '2AB2AB12AB2AB12AB2AB2AB'
var decompressed = compressed.split(split).map(str => table2[str]).join(``)
With this I managed to keep the encoded result within the characters 12AB, but it does not compress the data. How can lz-string help here?
Right, sorry about that, I thought you already looked into the code since you described exactly the way LZString works. If you look at the implementation of compressToBase64, you see this line:
LZString._compress(input, 6, function(a){return keyStrBase64.charAt(a);})
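The _compress function takes three arguments: the input string, the number of bits to store per output character, and a function that maps each integer value to an output character.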
So in your case you would call:
LZString._compress(input, 2, function(a){return "12AB".charAt(a);})
To do it in hexadecimal:
LZString._compress(input, 4, function(a){return "0123456789ABCDEF".charAt(a);})
Edit: Ah, and there is a corresponding _decompress function, obviously. Let me know if you need any more assistance with that one.
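A round trip through both functions might look like the sketch below. The wrapper makeCodec and its parameter names are hypothetical; lz is assumed to be the LZString object (for example the result of require("lz-string")). The middle argument to _decompress is assumed to be 2^(bitsPerChar - 1), which matches what the library's own Base64 wrapper passes in the 1.4.x source (32 for 6 bits).

```javascript
// Hypothetical wrapper around lz-string's internal functions.
// `lz` is assumed to expose _compress and _decompress, as LZString does.
function makeCodec(lz, alphabet, bitsPerChar) {
  if (alphabet.length < (1 << bitsPerChar)) {
    throw new RangeError("alphabet needs at least 2^bitsPerChar symbols");
  }
  // Assumption: the value of the highest bit in one output character.
  const resetValue = 1 << (bitsPerChar - 1);

  return {
    compress(input) {
      // Maps each integer produced by the compressor to an allowed character.
      return lz._compress(input, bitsPerChar, (value) => alphabet.charAt(value));
    },
    decompress(compressed) {
      // Maps the character at `index` back to its integer value.
      return lz._decompress(compressed.length, resetValue, (index) =>
        alphabet.indexOf(compressed.charAt(index))
      );
    },
  };
}

// Usage, assuming lz-string is installed:
//   const codec = makeCodec(require("lz-string"), "12AB", 2);
//   const packed = codec.compress("BBC");    // contains only 1, 2, A, B
//   const restored = codec.decompress(packed);
```

This sketch does no handling of empty or invalid input, which the library's public wrappers guard against.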
I saw the _compress method here a few days ago. Since the author raised some issues with transferring data, I thought _compress was not the method I wanted. Now it looks like I was wrong.
Also, I think _compress is like a treasure trove. It is worth showing in the README or on the homepage.
I don't yet understand why a few extra parameters need to be passed (would a preset value be possible?), or why _decompress needs charAt or charCodeAt, but I will try to understand it first.
Thanks for reminding me to use _compress again.
I am the author. It's compress that has issues, because it generates invalid UTF-16 characters that some JS engines fail to store and retrieve. _compress is the one doing the real job.
I don't yet understand why a few extra parameters need to be passed (would a preset value be possible?), or why _decompress needs charAt or charCodeAt, but I will try to understand it first.
The function you pass as an argument to _decompress has one job: providing the meaningful bits in the input string. If it's hexadecimal, it needs to give out 0 for '0', 1 for '1', etc., up to 15 for 'F'. That's its job, so it has to read the input string, hence the charAt. After that, you can do a switch, a bunch of ifs, a dictionary lookup (as in the decompressFromBase64 function), etc. It's really up to you.
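The dictionary-lookup variant can be sketched as follows. makeReader is a hypothetical helper name; it builds a reverse table once and returns a function of the shape described above: given an index into the input string, it returns the numeric value of the character at that position.

```javascript
// Hypothetical helper: builds the per-index reader described above.
function makeReader(alphabet, input) {
  // Reverse table: character -> its position in the alphabet.
  const valueOf = new Map();
  for (let i = 0; i < alphabet.length; i++) {
    valueOf.set(alphabet.charAt(i), i);
  }
  // Reads the input string at `index` and returns the character's value.
  return (index) => valueOf.get(input.charAt(index));
}

const readHex = makeReader("0123456789ABCDEF", "0F1");
console.log(readHex(0)); // 0
console.log(readHex(1)); // 15
console.log(readHex(2)); // 1
```

A Map lookup avoids scanning the alphabet for every character, which matters more as the alphabet grows.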
Hey, I am trying to add a custom compress/decompress function. I don't understand the number that _decompress needs.
For compress, the number of bits can be calculated as Math.ceil(Math.log2(dict.length))
But how do I calculate the resetValue? Math.ceil(dict.length/2)?
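One possible answer, as a sketch rather than a documented API: in the 1.4.x source the library's own wrappers pass 32 as resetValue for the 6-bit Base64 alphabet and 32768 for the 16-bit plain compress, which suggests resetValue = 2^(bitsPerChar - 1). Note also that rounding the bit count up would ask the dict for positions it does not have when its length is not a power of two, so this sketch rounds down. The helper name deriveParams is hypothetical.

```javascript
// Hypothetical helper, not part of lz-string.
function deriveParams(dictLength) {
  if (dictLength < 2) throw new RangeError("dict needs at least 2 symbols");
  // Largest n with 2^n <= dictLength, capped at 16 because one UTF-16
  // code unit cannot hold more bits than that.
  let bitsPerChar = 1;
  while (bitsPerChar < 16 && (1 << (bitsPerChar + 1)) <= dictLength) {
    bitsPerChar++;
  }
  // Assumption read off the library's wrappers: the highest bit's value.
  return { bitsPerChar, resetValue: 1 << (bitsPerChar - 1) };
}

console.log(deriveParams(16)); // { bitsPerChar: 4, resetValue: 8 }
console.log(deriveParams(64)); // { bitsPerChar: 6, resetValue: 32 }
console.log(deriveParams(91)); // { bitsPerChar: 6, resetValue: 32 }
```

With a 91-character dict only the first 64 characters would ever appear in the output, since 7 bits would need 128 symbols.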
dict should be a string, e.g. "0123456789ABCDEF".
For example, add a parameter dict. When the value is
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$%&()*+,./:;<=>?@[]^_`{|}~"
it means: use base91 compression. See: #90
It looks like base91, but it is not the standard base91 algorithm; it only uses the characters of this set as output.
When the value is "hello lz", it means that only some combinations of these characters will be included in the compressed text.
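A dict such as "hello lz" contains a repeated character and a symbol count that is not a power of two, so some normalization would be needed before it could drive a bit-packing encoder. The sketch below shows one possible rule, which is an assumption and not something the thread specifies: drop duplicates, then keep the largest power-of-two prefix. The helper name usableSymbols is hypothetical.

```javascript
// Hypothetical helper: reduces a dict to symbols a bit-packing encoder can use.
function usableSymbols(dict) {
  // A Set built from a string iterates by code point, so emoji stay whole.
  const unique = Array.from(new Set(dict));
  if (unique.length < 2) throw new RangeError("dict needs at least 2 distinct symbols");
  // Keep the largest power-of-two number of symbols.
  let count = 2;
  while (count * 2 <= unique.length) count *= 2;
  return unique.slice(0, count);
}

console.log(usableSymbols("hello lz")); // [ 'h', 'e', 'l', 'o' ]
```

Under this rule the space and the z in "hello lz" would never appear in the output, which is one reading of "only some combinations of these characters".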