google / rpmpack

rpmpack (tar2rpm) - package rpms in pure golang or cli
Apache License 2.0
116 stars 32 forks source link

Support zstd compression (and optional compression levels) #66

Closed erikgeiser closed 2 years ago

erikgeiser commented 3 years ago

I'd like to submit a PR to add Zstandard compression support. I already added it on a local branch an tested it with https://github.com/goreleaser/nfpm and Fedora 34 (see https://fedoraproject.org/wiki/Changes/Switch_RPMs_to_zstd_compression). Is this something you'd consider merging?

I'd use the zstd implementation from https://github.com/klauspost/compress/tree/master/zstd which is provided "using a Go standard license". While the algorithm stands for itself, the Go implementation is way faster than the Go implementation of xz which makes it a great choice going forward. If you want you can check out the performance benchmarks I made over here: https://github.com/erikgeiser/benchpress

Before submitting the PR, I'd like to ask if you would be fine with changing the Compressor setting to optionally include a level such as gzip:9, zstd:fastest or zstd:best. For gzip the level can be parsed with strconv.Atoi and zstd already provides an EncoderLevelFromString function. I would not support levels for xz and lzma as the settings are too complex. This way, users can better prioritize install speed vs. download speed or even build speed. If you do not like this, I'll submit a PR with a hard coded level for zstd (probably "best"/zstd.SpeedBestCompression as it's the closest to level 19 which is used by Fedora).

jarondl commented 2 years ago

When you say optionally, what exactly do you mean? And are you going to parse the string yourself? How?

Basically, yeah, as long as this won't make life complex for most users, we will be open to accept that.

erikgeiser commented 2 years ago

With optionally, I mean that the compression level is an optional addition to to the compression type string. Currently, for example, gzip assumes the hardcoded level 9. Similar to that, zstd would also have a default compression level. If a user specifies only the compression type (like simply gzip or zstd) these default levels would be used. If they want to customize the level they can use the optional level suffix (gzip:7 or zstd:fastest).

I'd parse the level inside rpmpack as it is really trivial (strings.Split() and then strconv.Atoi() for gzip or zstd.EncoderLevelFromString() for zstd).

Dumb question: I have never used bazel before. Would it be enough to add the following lines to deps.bzl?

    go_repository(
        name = "com_github_klauspost_compress",
        importpath = "github.com/klauspost/compress",
        sum = "h1:P76CopJELS0TiO2mebmnzgWaajssP/EszplttgQxcgc=",
        version = "v1.13.6",
    )
jarondl commented 2 years ago

I've replied in the pr. As for your question, it is definitely not dumb, most people in the world have not used bazel before :). The deps only pulls the other workspace, you also need to add it explicitly to the BUILD file. Take a look at how xz is included.