pulp / pulp_rpm

RPM support for Pulp Platform
https://pulpproject.org/pulp_rpm/
GNU General Public License v2.0
48 stars 124 forks source link

Support for linting RPM Packages on upload #3641

Open simenon opened 3 months ago

simenon commented 3 months ago

Is your feature request related to a problem? Please describe. In some cases it could be beneficial to check the uploaded RPM with a linter like rpmlint. This can enhance the quality of the RPM's being created and uploaded. Therefore it would be nice if we could provide a rpm linter to provide additional checks on upload and send the information back to the uploader

Describe the solution you'd like On upload of an rpm add an option to lint the package, optionally disallowing the upload of the RPM.

Describe alternatives you've considered RPM's can also be linted before upload, but there is no guarantee that the uploader does it in the workflow

Additional context rpmlint is a tool for checking common errors in rpm packages. It can be used to test individual packages and spec files before uploading or to check an entire distribution. By default all applicable checks are processed but specific checks can be performed by using command line parameters.

dralley commented 3 months ago

What defines "valid" in this instance? If it's not a valid RPM whatsoever then it won't be able to be parsed on upload. So presumably there's some intermediate state that is partially valid enough to be parsed correctly but wrong in some other way?

simenon commented 3 months ago

How does pulp validate that it is a valid RPM. If it is guaranteed that only valid RPM's can be uploaded then the use-case for an rpm-linter seems obsolete.

dralley commented 3 months ago

In order to upload an RPM into Pulp, Pulp has to extract metadata from the RPM header. If the RPM header doesn't exist or is not valid in a way that would prevent that metadata from being extracted, this will fail. So you can't just upload an ISO file or something as if it were an RPM, that would fail.

But if you have some other criteria for what might be wrong with an RPM (which is correctly parsed as an RPM) which is something that we could check on upload, then that can be considered.

simenon commented 3 months ago

A structural correct RPM which you can extract metadata from does not define a valid RPM. There is a reason that a tool as rpmlint exists. It provides additional checking on RPM's that goes beyond the checks that pulp does (see the config https://github.com/rpm-software-management/rpmlint/blob/main/rpmlint/configdefaults.toml).

I think there is a benefit to provide some kind of pre-upload and/or post-upload check like rpmlint.

The use-case would be to decline a RPM if it does not pass the rpmlint check, or generate a message to the uploader

ggainey commented 3 months ago

First, rpmlint isn't presecriptive like, say, black is - different users will use different configs. This would imply that we'd need a config per-repo (for example) that the content was intended for.

Second, content is deduplicated in Pulp. If you and I both upload the "same" RPM (ie has the same checksum), the first one "wins" and the second upload will cheerfully say "Already have that one Boss!" How does that interact with rpmlint if we have a different idea of correctness? What if I want to add the same content into three different repositories, each of which has its own lint-config?

rpmlint is a fine tool, but I'm not convinced that the added benefit of putting it "inside" Pulp (as opposed to be being part of a user's external "I am building RPMs" workflow) is worth the complexity of the potential edge-cases.

dralley commented 3 months ago

As discussed in the RPM meeting, if we were to do anything like this, instead of integrating with individual tools we might want to do something similar to git's "pre-commit hooks".

That would lend more flexibility for the user to define a workflow they need. Another example might be "require that all packages uploaded into X repo use a particular signing key"

simenon commented 3 months ago

linting is just one of those additional checks you might want to execute on an artifact. Hence the idea of a pre- or post-commit construct, where one is able to do their own custom checks, would be a potential valid addition to uploads.

I guess this is more related to artifacts in general, then just rpm artifacts.