google / brotli

Brotli compression format
MIT License

Add library and cli flags for file format with embedded dictionary #1167

Open pmeenan opened 2 months ago

pmeenan commented 2 months ago

This is still in flight but I wanted to get some feedback from the tooling side before we go too far on the IETF spec for dictionary-compressed responses.

We are considering a new file/stream format that places a 35-byte header before the compressed stream: a 3-byte magic signature (DCB) followed by the 32-byte SHA-256 hash of the dictionary that was used to compress the resource.
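To make the proposed layout concrete, here is a minimal Python sketch of wrapping and unwrapping such a stream. It assumes the 35 bytes break down as a 3-byte ASCII `DCB` signature plus a 32-byte SHA-256 digest, and it treats the compressed payload as opaque bytes. The function names are hypothetical and are not part of the brotli API or CLI.

```python
import hashlib

# Assumed layout: 3-byte magic + 32-byte SHA-256 digest = 35-byte header.
MAGIC = b"DCB"
HASH_LEN = hashlib.sha256().digest_size  # 32
HEADER_LEN = len(MAGIC) + HASH_LEN       # 35


def add_dictionary_header(dictionary: bytes, compressed: bytes) -> bytes:
    """Prefix an already-compressed stream with the magic and dictionary hash.

    Hypothetical helper: `compressed` is whatever the encoder produced
    using `dictionary`; it is not inspected here.
    """
    return MAGIC + hashlib.sha256(dictionary).digest() + compressed


def strip_dictionary_header(stream: bytes, dictionary: bytes) -> bytes:
    """Validate the header against `dictionary` and return the raw payload.

    Raises ValueError if the stream is too short, the magic is wrong,
    or the embedded hash does not match the supplied dictionary.
    """
    if len(stream) < HEADER_LEN:
        raise ValueError("stream is shorter than the 35-byte header")
    if stream[:len(MAGIC)] != MAGIC:
        raise ValueError("missing DCB magic signature")
    embedded_hash = stream[len(MAGIC):HEADER_LEN]
    if embedded_hash != hashlib.sha256(dictionary).digest():
        raise ValueError("dictionary hash does not match the stream header")
    return stream[HEADER_LEN:]
```

With this shape, a decoder can reject a mismatched dictionary before attempting to decompress anything, which is the main thing the embedded hash buys over a separate header.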

Currently, the dictionary hash is sent in a separate header, but there may be value in putting the hash in the file itself and removing the need for the extra header.

Ideally, if we go down this path, the brotli CLI and APIs would support generating and decompressing these streams directly, rather than requiring their output to be wrapped by additional tooling.

On compression:

On decompression:

Does this sound reasonable, and does it make sense to add if we do go down the route of specifying a stream prefix for dictionary-compressed streams? Are there any concerns or suggestions on the plan itself?