SerenityOS / serenity

The Serenity Operating System 🐞
https://serenityos.org
BSD 2-Clause "Simplified" License
29.64k stars 3.15k forks source link

LibGfx/WebPWriter: Implement run-length encoding #24633

Closed nico closed 5 days ago

nico commented 5 days ago

This implements the start of lossless webp's compression scheme, which is almost, but not quite, entirely unlike deflate.

The green channel is now green-or-length, and has up to 280 entries, instead of up to 256 before. We now use the 40-entry distance code (even though it only ever stores 1s now).

Due to this, a few places change to taking spans instead of Array<256>s.

The spec only has the transform from prefix or distance code to value. The inverse prefix_decompose() in this patch is my own invention. I checked with a python script that it's a true inverse (see PR for the script).

We now look for back-references with a distance of 1, which is equivalent to run-length encoding. It's fairly fast to compute, but leaves compression on the table. Full window-based back references will be in a future PR.

We also still don't do color cache entries yet, but that should be fairly straightforward to add. (It will make the green channel larger than 280 entries.)

We still use a single global huffman table for the entire image. Doing one per tile should be doable with the organization we now have, and might also be in a future PR.

File sizes, and perf numbers on HEAD before this patch series (see previous commit for perf comparison to previous commit):

sunset-retro.png (876K): 
    1.7M -> 1.6M,
    25.3 ms ± 0.5 ms -> 27.5 ms ± 0.8 ms 

(helps little; from 1.94x as input to 1.83x as large. About 5% smaller, for about a 10% slowdown.)

wow.gif (nee giphy.gif) (184k): 
    3.9M -> 1.4M
    105.7 ms ± 1.7 ms -> 74.0 ms ± 1.1 ms

(from 21.2x as big as the gif input to 7.6x as big. About 64% smaller, for a 28% speed up.)

7z7c.gif (11K): 
    40K -> 8.4K
    13.9 ms ± 0.6 ms -> 12.9 ms ± 0.5 ms

(from 3.6x as big as the gif input to 0.76x as big :^) About 79% smaller, for a 7% speed up.)

nico commented 5 days ago

The python script:

% cat table.py
#!/usr/bin/env python3

def v(i):
    if i <= 3:
        return i + 1, 0
    extra_bits = (i - 2) >> 1
    offset = (2 + (i & 1)) << extra_bits
    return offset + 1, extra_bits

def inv_v(v):
    assert v > 0
    v -= 1
    if v <= 3:
        return v
    i = 2 * (v.bit_length() - 1)
    extra_bits = (i - 2) >> 1
    if v >= (3 << extra_bits):
        i += 1
    return i

for i in range(40):
    start, nbits = v(i)
    if nbits == 0:
        print(f'{i:2}  {0} {start}')
    else:
        print(f'{i:2} {nbits:2} {start}..{start + (1 << nbits) - 1}')

    for t in range(start, start + (1 << nbits)):
        if i != inv_v(t):
            print('wrong: ', i, t, inv_v(t))
% python3 table.py
 0  0 1
 1  0 2
 2  0 3
 3  0 4
 4  1 5..6
 5  1 7..8
 6  2 9..12
 7  2 13..16
 8  3 17..24
 9  3 25..32
10  4 33..48
11  4 49..64
12  5 65..96
13  5 97..128
14  6 129..192
15  6 193..256
16  7 257..384
17  7 385..512
18  8 513..768
19  8 769..1024
20  9 1025..1536
21  9 1537..2048
22 10 2049..3072
23 10 3073..4096
24 11 4097..6144
25 11 6145..8192
26 12 8193..12288
27 12 12289..16384
28 13 16385..24576
29 13 24577..32768
30 14 32769..49152
31 14 49153..65536
32 15 65537..98304
33 15 98305..131072
34 16 131073..196608
35 16 196609..262144
36 17 262145..393216
37 17 393217..524288
38 18 524289..786432
39 18 786433..1048576