proposal: net/http: add support for RFC8878 `Content-Encoding: zstd` to `http.DefaultTransport`

Jorropo commented 1 year ago

Currently the default http transport will magically send Accept-Encoding: gzip header and handle decompression if the consumer is not handling that themselves. Chrome recently added opt-in support for Content-Encoding: zstd https://chromestatus.com/feature/6186023867908096.

I propose that the default transport sends Accept-Encoding: gzip, zstd and handles the decompression following the same logic that gzip have.

Implementation wise internal/zstd's performance sounds good enough, for a rough estimation I copied the BenchmarkLarge from internal/zstd and made a compress/gzip version, the compression ratio were 3.21 vs 3.24 respectively (very comparable) and on my Ryzen 3600 CPU the output's throughput are really similar:

name      gzip time/op    zstd time/op    delta
Large-12    3.08ms ± 4%    3.15ms ± 2%    +2.02%  (p=0.008 n=15+13)

name      gzip speed      zstd speed      delta
Large-12   184MB/s ± 3%   194MB/s ± 2%    +5.41%  (p=0.000 n=15+13)

name      gzip alloc/op   zstd alloc/op   delta
Large-12    7.54kB ± 0%    5.52kB ± 3%   -26.79%  (p=0.000 n=15+13)

name      gzip allocs/op  zstd allocs/op  delta
Large-12      40.0 ± 0%       0.0       -100.00%  (p=0.000 n=15+15)

Neither is a particularly high performance implementation, but if compress/gzip is good enough zstd should also be (except maybe for memory usage ?).

Adding internal/zstd to a binary adds 469KiB with 2186909d8651728cb71a4e02e0fe7b2c3d55e9b4. I guess this could be optimized by sharing some of the lz77 routines between gzip and zstd if this were a concern. This is true but missleading, most of it is reflect. Tested with (GOOS=linux GOARCH=amd64 GOAMD64=v3 and without stripping debug information):

earthboundkid commented 1 year ago

It would be weird to do this before internal/zstd is exposed publicly. Probably it should be moved to compress/zstd first.

Jorropo commented 1 year ago

I don't think so, see how the std handles http2 for example.

ianlancetaylor commented 1 year ago

CC @neild @bradfitz @dsnet

ianlancetaylor commented 1 year ago

I guess this could be optimized by sharing some of the lz77 routines between gzip and zstd if this were a concern.

They are really quite different, I don't think this would be reasonably possible. I also don't think it's a concern.

dsnet commented 1 year ago

Probably it should be moved to compress/zstd first.

Agreed. Ideally we would implement support for both compressing and decompressing in compress/zstd.

dsnet commented 1 year ago

I filed #62513 to add compress/zstd.

Jorropo commented 1 year ago

While zstd is a noble goal, I woul like to highlight that it still has zero support in browsers as of today. So for anyone looking to use a more efficient Content-Encoding for HTTP than gzip, brotli is still the go-to:

https://caniuse.com/zstd

https://caniuse.com/brotli

@silverwind

Brotli use more CPU than Zstd, Zstd also supports adaptative compression to efficiently use both the CPU and IO resources. Zstd is superior replacement to gzip such as for real-time compression which brotli does not efficiently do, brotli is still better in some other cases for now. In other words I belive we are heading in a world with applications that use zstd or zstd + brotli instead of gzip or gzip + brotli.

dsnet commented 1 year ago

adaptative compression to efficiently use both the CPU and IO resources.

That's technically an artifact of the implementation, not the compression format. Theoretically, gzip and brotli could do the same thing.

A major problem with Brotli is the use of a 120KiB static dictionary. This bars its use from any embedded application. It also forces any Go implementation to bloat by 120KiB.

seankhliao commented 1 year ago

note that since this proposal is for net/http client side support, what matters is server implementations of zstd (eg envoy, nginx ) , rather than browser support.

jimmyfrasche commented 1 year ago

could there be some kind of hook for adding content encodings so you could opt in to compression formats with large static dictionaries, like http.RegisterContentEncoding(brotli.Http)?

Jorropo commented 1 year ago

A major problem with Brotli is the use of a 120KiB static dictionary. This bars its use from any embedded application. It also forces any Go implementation to bloat by 120KiB.

I am extremely not convinced that 120KiB is a problem, see above internal/zstd adds 469KiB to the binary. It could be stored as a string literal so it would use pointers to a read only part of the binary which is often an mmaped disk region.

Even if you strip the binaries internal/zstd adds ~300KiB. So a dictionary wouldn't be a major contribution to the used storage or runtime memory.

dsnet commented 1 year ago

see above internal/zstd adds 469KiB to the binary.

A good chunk of that is due to transitive dependencies on "fmt" and "encoding/binary", both of which pull in "reflect". Modifying the code to remove those dependencies drops the linked in size to 100KiB.

renthraysk commented 1 year ago

adaptative compression to efficiently use both the CPU and IO resources.

That's technically an artifact of the implementation, not the compression format. Theoretically, gzip and brotli could do the same thing.

A major problem with Brotli is the use of a 120KiB static dictionary. This bars its use from any embedded application. It also forces any Go implementation to bloat by 120KiB.

Shared dictionary support for brotli is trialing in Chrome 117. And shared dictionaries for zstd trialing in 118.

https://github.com/WICG/compression-dictionary-transport https://chromestatus.com/feature/5124977788977152

silverwind commented 1 year ago

Personally I'm happy if any of the two "better than gzip" mechanisms gets support in the standard library.

By compressing go:embed content with zstd/brotli instead of gzip, I imagine the size of moderately large web app binary with compressed embeds will go down, assuming big enough assets. Now of course it would be awesome if go:embed would also support compression out of the box, but that's another topic.

ghost commented 1 year ago

Personally I'm happy if any of the two "better than gzip" mechanisms gets support in the standard library.

By compressing go:embed content with zstd/brotli instead of gzip, I imagine the size of moderately large web app binary with compressed embeds will go down, assuming big enough assets.

compress/zlib and compress/lzw are already "better than gzip" in some cases, whats wrong with those?

dsnet commented 1 year ago

I think @silverwind is referring to Brotli (RFC 7932) and Zstandard (RFC 8878) with regard to the two "better than gzip" algorithms.

compress/zlib is just a thin wrapper over compress/flate, which is the same underlying algorithm that backs compress/gzip. compress/lzw is a generally weaker algorithm than compress/flate.

seankhliao commented 7 months ago

Chrome has turned this on by default https://chromestatus.com/feature/6186023867908096

golang / go

proposal: net/http: add support for RFC8878 `Content-Encoding: zstd` to `http.DefaultTransport` #62492