rpm-software-management / rpm

The RPM package manager
http://rpm.org

RFE: watermark short-circuit'ed binaries #3091

Open pmatilai opened 4 months ago

pmatilai commented 4 months ago

Split from a side-track of #3042:

Only with --short-circuit we "poison" the produced packages to prevent people from distributing them (accidentally or otherwise).

It is a misfeature. It means that the produced packages cannot be compared and tested properly. In particular, --short-circuit is very often used to tweak and test details of Obsoletes and Requires and such. Not being able to produce the package that looks exactly like the normal output makes the build not useful.

The whole idea of "prevent people from distributing them" doesn't make much sense. You cannot build a package with --short-circuit "accidentally". It's a very long option that you need to insert in the right place. And I guess "otherwise" means "maliciously" here, and that's even less useful, because the person doing the build has full control over what is built, so they don't need to use --short-circuit to achieve malicious goals.

Instead of using --short-circuit, people are forced to either wait for full package builds (which can take hours), or resort to dirty tricks like commenting out parts of the spec file. Those solutions are much worse (and much more likely to go wrong) than the problem being solved, i.e. people forgetting that they used --short-circuit and distributing those packages.

Please drop this whole protection and let people use --short-circuit without any limitations.

Originally posted by @keszybz in https://github.com/rpm-software-management/rpm/issues/3042#issuecomment-2074531384

Edit: turns out there are other ways to short-circuit a build besides just --short-circuit: --noprep and --build-in-place, both of which similarly violate the first rpm principle of binaries produced in a single uninterrupted run.

pmatilai commented 4 months ago

The whole idea of "prevent people from distributing them" doesn't make much sense. You cannot build a package with --short-circuit "accidentally". It's a very long option that you need to insert in the right place. And I guess "otherwise" means "maliciously" here

Obviously you can't use --short-circuit accidentally; the accident refers to distributing a binary built that way. Think of a lone developer uploading a binary built on their own system to the net for others to use. That's not as common these days as it once was; thankfully, most people use actual build systems nowadays.

The "otherwise" doesn't refer to malice, but ignorance. There have been people wanting to distribute packages built with short-circuit, just to shorten their build times basically.

But 14 years later (7583fcc3416e5e4accf1c52bc8903149b1314145) and hopefully a bit wiser too: a gentler version would be simply to "watermark" short-circuited builds somehow. It doesn't have to be an install-breaking dependency, just something that you can check.
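To make "something that you can check" concrete, here is a minimal sketch of such a check. It assumes the marker shows up as a dependency-like entry whose name starts with `rpmlib(ShortCircuited)`; that name, and the idea that a future watermark would be visible in the Requires list at all, are assumptions for illustration and not something this thread settles. The helper names are hypothetical too.

```python
import subprocess

# Assumption: the watermark is visible as an entry starting with this name.
WATERMARK_PREFIX = "rpmlib(ShortCircuited)"


def is_watermarked(requires_lines):
    """Return True if any dependency line starts with the assumed marker."""
    return any(
        line.strip().startswith(WATERMARK_PREFIX) for line in requires_lines
    )


def package_requires(path):
    """List the Requires of a package file using `rpm -qp --requires`."""
    result = subprocess.run(
        ["rpm", "-qp", "--requires", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()
```

A release script could then refuse to upload a package when `is_watermarked(package_requires(path))` is true, without the package itself being uninstallable.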

keszybz commented 4 months ago

Just a watermark would be much better than status quo.

There have been people wanting to distribute packages built with short-circuit, just to shorten their build times basically.

Actually, I don't think this would be so bad. There are countless ways in which somebody can mess up a package build. In particular, they can just put the wrong files or badly compiled files in the package, and there isn't much that the build system can do against that. If somebody is savvy enough to successfully set up a build system that uses some form of caching and short-circuit, why would this be a problem? I think trying to prevent this is similar to trying to prevent somebody from using inappropriate build flags, i.e. not possible to actually implement and not actually useful.

pmatilai commented 4 months ago

The bad part is that it disagrees with the rpm design philosophy where the package goes from a source to a binary in one uninterrupted, reproducible (in a sense) go. It's of course possible to circumvent that in any number of ways, but encouraging it by making it easy is a whole can of worms.

keszybz commented 4 months ago

I think we just see this a bit differently… I don't think it's "encouraging" to allow something to be done via an explicit option. The reason why I'd prefer to have no marking at all is that, personally, I most commonly use short-circuit to do repeat builds while tweaking either the %install or %files sections or the Provides/Obsoletes/Conflicts sections, and compare the results using rpmdiff and diffoscope. Injection of the marking is going to show up in those listings. Obviously it can be filtered out or ignored, but it's always an additional step to take, and it'd just be more convenient to not have to do that. (Obviously, just a "watermark" is much better than the previous state where the rpms were not installable without --nodeps, making them unusable for many tests.)
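The "additional step" of filtering the marking out of a comparison could look roughly like the sketch below. It is not how rpmdiff or diffoscope work internally; it only compares two dependency listings, and the marker name `rpmlib(ShortCircuited)` is an assumption made for the example.

```python
import difflib

# Assumption: the marking appears as a dependency line starting with this name.
MARKER = "rpmlib(ShortCircuited)"


def strip_marker(lines):
    """Drop the assumed marker line so it does not show up as a difference."""
    return [line for line in lines if not line.strip().startswith(MARKER)]


def diff_requires(full_build, short_circuit_build):
    """Unified diff of two dependency listings, ignoring the marker."""
    old = sorted(strip_marker(full_build))
    new = sorted(strip_marker(short_circuit_build))
    return list(
        difflib.unified_diff(old, new, "full", "short-circuit", lineterm="")
    )
```

An empty result means the two builds differ only by the marker, which is exactly the case where the extra filtering step is pure overhead.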

pmatilai commented 4 months ago

Came up in #3120/#3121: apparently mock uses --noprep quite liberally as an optimization, but it should let the final -ba that produces the "production binaries" run from a fresh set of sources. (@praiskup @hroncok) We should also watermark binaries built this way in the future. Ditto for --build-in-place, which is handy for testing but nothing like the pristine build that rpm promises.

hroncok commented 4 months ago

Mock guarantees the "production readiness" and "reproducibility" of the result. Running the final -ba without noprep would gain no benefit to mock.

pmatilai commented 4 months ago

This is about rpm needing to guarantee its own operations, really. Like I noted in #3121 rpmbuild needs to be in charge of its own builds, otherwise things get very weird and screwy.

Also, those dependency generator scripts can alter the build directory in arbitrary ways, and mock cannot guarantee that their side-effects don't affect the build. So it really should let the last -ba run from fresh sources (and I should file this as a mock ticket of course).

hroncok commented 4 months ago

If they can affect the build, they can do it even in -ba. If rpm wants to guarantee its own operations, it should provide an API for the caller to handle %generate_buildrequires installations (e.g. via file sockets or pipes or whatever).

pmatilai commented 4 months ago

They can affect the build in a different way depending on whether it's run multiple times; there's no guarantee of idempotence there. And, I agree, rpm should provide a means to perform a continuous run on its own terms, but in the meantime --noprep let the cat out of the bag uncontrollably and prematurely.
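A toy model of the idempotence concern, under stated assumptions: the "build directory" is a plain dict, and `leaky_generator` is a hypothetical script invented for this example, not a real rpm generator. Its reported dependencies look identical on every run, yet the state it leaves behind depends on how many times it ran.

```python
def leaky_generator(build_dir):
    """Hypothetical generator: reports dependencies, but also mutates the build dir."""
    # Side effect: every run appends to a file that later build steps may read.
    build_dir.setdefault("extra-requires.txt", []).append("python3-setuptools")
    # The reported result is deduplicated, so it looks stable across runs.
    return sorted(set(build_dir["extra-requires.txt"]))


def build_dir_after(runs):
    """Return the state of a fresh build directory after `runs` generator runs."""
    build_dir = {}
    for _ in range(runs):
        leaky_generator(build_dir)
    return build_dir
```

Comparing `build_dir_after(1)` with `build_dir_after(3)` shows different directory contents even though each call reported the same dependency list, which is the kind of difference a single fresh run would avoid.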

praiskup commented 4 months ago

They can affect the build in a different way depending on whether it's run multiple times; there's no guarantee of idempotence there.

It would be nice if we started a separate Mock issue, not to steal the topic here. Maybe the related https://github.com/rpm-software-management/mock/issues/1358 (edit: fixed the link to a Mock, not an RPM issue). There are these premises:

If we repeat %prep, Mock is not going to install anything new (the buildroot is unaffected), but we may trigger %generate_buildrequires misbehavior, and thus we'll bring a "new variable" into the build process rather than stability/idempotence.

let the cat out of the bag uncontrollably and prematurely.

Normal evolution? :-) The %generate_buildrequires support has been happening for several years already.

Conan-Kudo commented 4 months ago

Note that Yocto already guts our poison logic from short-circuit builds, since the entire distribution is built short-circuited.

https://github.com/openembedded/openembedded-core/blob/master/meta/recipes-devtools/rpm/files/0001-Do-not-add-an-unsatisfiable-dependency-when-building.patch

pmatilai commented 2 months ago

Yeah, of course somebody could do just that, and inevitably somebody does :disappointed:

One possibility may be simply recording the entire rpmbuild cli line into the built package. That would be more of a generic forensics tool than a condemnation of --short-circuit, but it has its own problems wrt reproducibility: whether a macro is passed on the cli or in a configuration does not currently affect the binary outcome, but it would if the cli is stored.
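The reproducibility problem can be illustrated with a toy model. Nothing here reflects rpm's actual header format: `header_digest`, the dict layout, and the example package and macro values are all hypothetical. The sketch only shows that two invocations with the same effective macros hash identically until the command line itself is stored.

```python
import hashlib
import json

# Two hypothetical invocations that end up with the same effective macros.
CLI_WITH_DEFINE = ["rpmbuild", "-ba", "--define", "dist .el9", "pkg.spec"]
CLI_FROM_CONFIG = ["rpmbuild", "-ba", "pkg.spec"]  # dist set in a macro file


def header_digest(name, version, effective_macros, recorded_cli=None):
    """Digest of a toy package header; optionally includes the recorded cli."""
    header = {
        "name": name,
        "version": version,
        "macros": effective_macros,
    }
    if recorded_cli is not None:
        header["cli"] = recorded_cli
    payload = json.dumps(header, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

With `recorded_cli` left out, both invocations produce the same digest for the same macros; passing `CLI_WITH_DEFINE` to one and `CLI_FROM_CONFIG` to the other makes the digests differ even though the build inputs are otherwise equivalent.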

Conan-Kudo commented 2 months ago

Mock guarantees the "production readiness" and "reproducibility" of the result. Running the final -ba without noprep would gain no benefit to mock.

No, it does not. Mock just runs dnf and rpmbuild in repeating sequences. That doesn't guarantee any reproducibility. What guarantees reproducibility is a stable build environment definition. For Fedora, Koji provides that. Other distributions use their own infra (OpenMandriva's ABF, as an example) to do that.