rpm-software-management / mock

Mock is a tool for a reproducible build of RPM packages.
GNU General Public License v2.0

[RFE] Allow injecting BuildRequires during %prep #160

Closed nim-nim closed 5 years ago

nim-nim commented 6 years ago

git made it very easy to create/modify/fork projects, and to depend on other projects.

Some software ecosystems such as Go (golang) built their software management tooling as a very thin layer over git, and have build requires that can reach the hundreds and change a lot over time.

They no longer maintain static descriptions of their build deps; those deps are computed dynamically at the last minute from the code state, and the Go native "package" manager deduces which git projects need importing from this state.

rpm and mock adapt easily to this model for Depends and Requires, but do not provide entry points to do the same for BuildRequires. Deducing BuildRequires outside rpm is an exercise in frustration, as the list gets obsolete fast: you can't do it once and forget about it, it really needs redoing for every update.

That makes it hard to package projects written in Go, such as Docker or Kubernetes, as rpms.

To manage this kind of software reliably, it would be necessary to allow BuildRequires injection at the end of %prep, when the project code state is known and available, and it is possible to run a custom macro (that may call the native solver) to generate BuildRequires matching the code state expectations.

That would not be a security risk, as we'd not be downloading from the internet but requesting deps in audited package form, already present in trusted repositories.

That would not make package construction unreproducible, as the computed BuildRequires depend on the code state, which is fixed since the project archive is fixed.

That would make build root construction a three stage process:

  1. default build root
  2. default build root + static BuildRequires documented in the spec (sufficient for the native solver needs)
  3. default build root + static BuildRequires documented in the spec + dynamic BuildRequires computed at the end of %prep
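
As a rough illustration of the three stages listed above, each stage can be seen as the previous package set plus one more group of requirements. The helper name `build_root_stages` and all package names below are hypothetical; this is not how mock is implemented, only a sketch of the idea.

```python
def build_root_stages(default_root, static_brs, dynamic_brs):
    """Return the package set expected in the build root at each stage.

    All three arguments are plain collections of package or capability
    names; nothing here talks to rpm, dnf or mock.
    """
    stage1 = set(default_root)            # 1. default build root
    stage2 = stage1 | set(static_brs)     # 2. + static BuildRequires from the spec
    stage3 = stage2 | set(dynamic_brs)    # 3. + BuildRequires computed after %prep
    return [stage1, stage2, stage3]


# Hypothetical example values, chosen only for illustration.
stages = build_root_stages(
    default_root={"bash", "rpm-build"},
    static_brs={"golang"},
    dynamic_brs={"golang(github.com/example/foo)"},
)
for number, packages in enumerate(stages, start=1):
    print(number, sorted(packages))
```

Each stage only ever adds packages to the previous one, which is why the static BuildRequires must already be sufficient for the native solver to run.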

Conan-Kudo commented 6 years ago

This is related to https://github.com/rpm-software-management/rpm/issues/104

praiskup commented 6 years ago

This should really be implemented in RPM.

edit: Ah, yeah, I see, this is a complement to the RPM issue. So please let me rephrase: once this is in RPM, we need to add some functionality to reflect the RPM changes in mock.

xsuchy commented 6 years ago

Several issues here:

What I can do is to call dnf with --setopt=strict=0 which will result in:

# LC_ALL=C dnf install --setopt=strict=0 'go(foo)' zsh
Last metadata expiration check: 0:43:32 ago on Mon Feb 12 08:32:15 2018.
No match for argument: go(foo)
Dependencies resolved.
======================================================================================================================
 Package                 Arch                       Version                          Repository                  Size
======================================================================================================================
Installing:
 zsh                     x86_64                     5.4.1-1.fc27                     fedora                     2.8 M

Transaction Summary
======================================================================================================================
Install  1 Package

Total download size: 2.8 M
Installed size: 6.6 M

I can parse the "No match for argument: go(foo)" line and then try to execute go install foo. The same would work for perl, rubygem, ...
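
A minimal sketch of that parsing step, assuming the message wording shown in the output above (it may differ between dnf versions). The function names `unmatched_arguments` and `split_capability` are made up for this example; what to do with the result (for instance calling a language-specific installer) is left out.

```python
import re

# Wording taken from the dnf output quoted above; treat it as an assumption.
NO_MATCH = re.compile(r"^No match for argument: (?P<spec>\S.*)$")
# Matches capability-style names such as "go(foo)" or "perl(Foo::Bar)".
CAPABILITY = re.compile(r"^(?P<namespace>[^()\s]+)\((?P<name>[^()]+)\)$")


def unmatched_arguments(dnf_output):
    """Collect the arguments that dnf reported as having no match."""
    found = []
    for line in dnf_output.splitlines():
        match = NO_MATCH.match(line.strip())
        if match:
            found.append(match.group("spec").strip())
    return found


def split_capability(spec):
    """Split "go(foo)" into ("go", "foo"); return None for plain names."""
    match = CAPABILITY.match(spec)
    if match is None:
        return None
    return match.group("namespace"), match.group("name")


sample = """Last metadata expiration check: 0:43:32 ago on Mon Feb 12 08:32:15 2018.
No match for argument: go(foo)
Dependencies resolved.
"""
for spec in unmatched_arguments(sample):
    print(spec, split_capability(spec))
```

Parsing human-readable output like this is fragile, so a real implementation would probably prefer a structured interface if one is available.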

praiskup commented 6 years ago

> This will never (likely) be allowed in Fedora.

Why do you think so? Except the following point which I agree with.

> I think that it is really bad to generate the list of dependencies. This way you really stop caring about them and bad things will happen. So I am opposed to parsing this somehow from the %prep section.

nim-nim commented 6 years ago

> Why do you think so? Except the following point which I agree with.

> I think that it is really bad to generate the list of dependencies. This way you really stop caring about them and bad things will happen. So I am opposed to parsing this somehow from the %prep section.

Well, this is pretty much forced by the way software languages are evolving. If you think any packager will think deeply about a BR list longer than a couple dozen lines, regardless of how it's constructed, you're dreaming (and even if they think about it once, they won't do so later; see also all the complaints that packagers are not removing lines that are no longer necessary from their specs).

At least permitting the automation of BR construction can also permit insertion of sanity checks, which are direly missing today.

xsuchy commented 6 years ago

@nim-nim We (engineers in Red Hat) care about every dependency and we periodically check the status of those dependencies. Just saying. I highly recommend this funny article https://hackernoon.com/im-harvesting-credit-card-numbers-and-passwords-from-your-site-here-s-how-9a8cb347c5b5

@praiskup just my guess. And to be honest "never" is something like "within 2 years" :)

nim-nim commented 6 years ago

@xsuchy What I meant to write is that the content of a package is supposed to be checked by the packager of this package, not by the packagers of every package that uses it. So a packager who packages something that depends on a couple of other packages will take a look at those other packages, but if it depends on many dozens of other packages, he will have to trust the packagers of those other packages, or no new packaging would ever be done (everyone checking everyone else would not scale).

That's why there is a difference between automation that grabs anything from the internet and automation that builds BR lists composed of packages already present in the repository.

xsuchy commented 6 years ago

@nim-nim But that is how BR works and has always worked. The packager never specifies the full list of BRs, only what he directly requires. Every packaging tool I know (including pip, gem) downloads and installs transitive dependencies automatically.

nim-nim commented 6 years ago

@xsuchy the difference for Go is that the dependency tree is mostly flat, and the first level of deps (the one tools currently force specifying manually) can get huge

xsuchy commented 6 years ago

@nim-nim What a lot of people actually do is create a foo.spec.in and then have a Makefile which fills in foo.spec.in and turns it into foo.spec, with the generated parts put in place.
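
The foo.spec.in approach can be sketched as a simple placeholder substitution. The marker `@GENERATED_BUILDREQUIRES@` and the function `render_spec` are assumptions made for this example; real projects use whatever convention their Makefile defines.

```python
# Hypothetical marker that foo.spec.in would contain on a line of its own.
PLACEHOLDER = "@GENERATED_BUILDREQUIRES@"


def render_spec(template_text, build_requires):
    """Return the spec text with the placeholder replaced by BuildRequires lines."""
    if PLACEHOLDER not in template_text:
        raise ValueError(f"template does not contain {PLACEHOLDER}")
    # De-duplicate and sort so that the generated spec is stable between runs.
    lines = "\n".join(f"BuildRequires: {dep}" for dep in sorted(set(build_requires)))
    return template_text.replace(PLACEHOLDER, lines)


template = "Name: foo\n@GENERATED_BUILDREQUIRES@\n%description\n"
print(render_spec(template, ["golang(b)", "golang(a)", "golang(a)"]))
```

In a Makefile-driven workflow, the template would be read from foo.spec.in and the result written to foo.spec before the source rpm is built.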

nim-nim commented 6 years ago

@xsuchy That supposes a specific workflow, where a static dep list exists upstream. There is no such list upstream in Go projects; most of them wouldn't know what a spec file is (the few that bother with Linux as an OS think Linux = Ubuntu docker image), and as I wrote before, the list is supposed to be generated dynamically at the last mile by Go tools from the state of the imports in the source code (and additional constraints).

Of course some projects do store the result of their dynamic generation in git but that's akin to delegating part of the build to someone else instead of going the full free software rebuild in Fedora.

xsuchy commented 6 years ago

BTW how do you generate that list of deps with go?

nim-nim commented 6 years ago

There is an option in the Go compiler to tell it to print all the bits a source code directory depends on, and another to filter out the elements provided by the standard Go library of the system. The currently proposed autorequires in Fedora just transforms the output into golang(foo) rpm requires. A simple autogoBR macro would do pretty much the same thing, with some optional manual filtering of the output by the packager in the spec file.

https://src.fedoraproject.org/fork/nim/rpms/go-compilers/blob/more-go-packaging/f/go.req#_149
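
To make the transformation concrete, here is a small sketch that turns a list of Go import paths into golang(foo) BuildRequires lines. It assumes the import paths were already obtained from the Go tooling, and it filters the standard library with a common heuristic (no dot in the first path element) rather than by asking the Go toolchain. The function names are hypothetical and this is not the go.req script linked above.

```python
def is_standard_library(import_path):
    """Heuristic only: standard library paths have no dot in their first element."""
    first_element = import_path.split("/", 1)[0]
    return "." not in first_element


def go_build_requires(import_paths):
    """Map non-standard-library import paths to sorted, unique BuildRequires lines."""
    external = {
        path for path in import_paths
        if path and not is_standard_library(path)
    }
    return [f"BuildRequires: golang({path})" for path in sorted(external)]


# Example import paths, as they might be reported for a source directory.
imports = [
    "fmt",
    "net/http",
    "github.com/pkg/errors",
    "golang.org/x/net/context",
    "github.com/pkg/errors",
]
for line in go_build_requires(imports):
    print(line)
```

A packager-facing macro would likely add an exclusion list on top of this, which is the optional manual filtering mentioned above.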

People are working @google and @fedora to create something a bit more convenient (IIRC SUSE wrote its own utility already), but all systems just add filtering layers over the basic go compiler dep listing.

For example, the next-gen system Google is working on right now adds a constraints layer, so you'll have a project config file that will basically say 'if code analysis detects you need foo, constrain foo to be in this version/commit/git branch range'. So this file will list deps the project may or may not need, and it won't list unconstrained deps, and you'll still need last-mile code analysis to list the actual deps a project code state needs (it will also probably evolve to output different dep lists depending on the analog of configure --with/--without flags).

clime commented 6 years ago

@nim-nim: if you know about some specific way to generate build dependencies for Go packages, you may want to have a look at https://pagure.io/rpkg-util. It enables you to inject anything into a spec file before the actual build happens by using a macro. That's done by providing your own rpkg.macros file with a shell function that outputs part of the spec file to stdout. The utility is already being used for the SCM Source Type in COPR if you pick rpkg as "srpm build method". You may want to try it out there.

nim-nim commented 6 years ago

@clime Thanks for the pointer, but the use case is completely different: rpkg creates and requires a parallel repo per project and a parallel prebuild infra. The objective of this issue is to integrate the missing parts in rpm and mock themselves, to make it possible to build from a plain spec file with rpm and mock and no other infra requirement.

msimacek commented 6 years ago

I didn't read the whole thread, but have you considered the pm_request plugin [1]? It can make mock install packages during the build.

[1] https://fedoraproject.org/wiki/Projects/Mock/Plugin/PMRequest

nim-nim commented 6 years ago

@msimacek thanks for the pointer it looks interesting but unless I'm mistaken that requires writing a mock-specific client to talk to the plugin?

I'm really looking for something any packager can use with just a few lines in his spec file: a CLI-oriented solution that allows wrapping language-specific tooling, not something that requires integration in the language-specific tooling itself. Language-specific tooling is too much in flux for the ecosystems I target, and they care too little about Linux distros for deep integration to succeed in the next few years. A wrapper, OTOH, can be easily adapted or rewritten.

msimacek commented 6 years ago

Yes. Is that a problem? You can make your wrapper talk to the plugin.

praiskup commented 6 years ago

@msimacek thanks for referencing the plugin; IMO that's just it (I have small security concerns... but that's about fixing the socket protocol and is OT for this discussion).

Do you have some example "clients" for the plugin? I mean, I'd like to experiment with python a bit but I don't want to "research" what's already done.

msimacek commented 6 years ago

We use it in javapackages-tools (it's in python, don't worry), here's the relevant part: https://github.com/fedora-java/javapackages/blob/5c392d84c0bac2ab4d141bf316e910daafa734b3/python/javapackages/common/mock.py#L53

nim-nim commented 6 years ago

Anyway to get things rolling I wrote this:

https://github.com/nim-nim/mock-install https://copr.fedorainfracloud.org/coprs/nim/mock-install/ https://bugzilla.redhat.com/show_bug.cgi?id=1629371

Unlike javapackages-tools, I took care to extract the generic function so it can be used as-is by any packager and integrated in any language-specific packaging macro.

nim-nim commented 6 years ago

So pm request is nice and works and thanks to everyone that suggested it.

However, getting it deployed is proving difficult. The socket it uses can be blocked in nspawn environments. People are unhappy the BuildRequires list is not materialized anywhere. It requires a specific client in the build root to talk to the socket.

So how about making mock use this simpler approach (from an integration POV):

That's basically what I've just suggested in: https://github.com/rpm-software-management/rpm/issues/104

Except the rpm maintainers have the choice to use a Tag, not a variable, and they can do all sorts of rpmbuild -bs magic mock can not do without them.

Removing the ability of builds to poke mock at random points, and moving to a system where mock reads the BuildRequires when it wants to, is also probably better from a security point of view.

xsuchy commented 5 years ago

Closing in favour of https://github.com/rpm-software-management/mock/issues/245