rpm-software-management / rpm

The RPM package manager
http://rpm.org

Phasing out obsolete crypto in rpm #1292

Open pmatilai opened 4 years ago

pmatilai commented 4 years ago

We need to come up with a plan for how to deal with obsolete crypto in rpm.

MD5 has been practically gone for a long time and SHA1 is on its way out too, to the point that it's not necessarily even possible to compute these digests anymore (e.g. MD5 in FIPS mode). Yet we still carry them in various more-or-less prominent and permanent places such as the MD5 header+payload digest, database indexes (RPMDBI_SIGMD5 and RPMDBI_SHA1HEADER), MD5 aliasing for pkgid, SHA1 aliasing for hdrid, and so on.
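To illustrate the "not even possible to compute" point, here is a minimal sketch using Python's standard `hashlib` rather than rpm itself. The helper name `usable_digests` is made up for this example. Whether MD5 is actually rejected under FIPS depends on how Python and its crypto library were built, so treat the result as a property of the local system.

```python
import hashlib

def usable_digests(candidates=("md5", "sha1", "sha256", "sha512")):
    """Return the candidate digest names that this runtime can compute."""
    usable = []
    for name in candidates:
        try:
            hashlib.new(name, b"probe")
        except ValueError:
            # Raised for unknown names, and for algorithms that the
            # system crypto policy refuses to provide.
            continue
        usable.append(name)
    return usable
```

Calling `usable_digests()` on a restricted system may return a list without `md5`, which is exactly the situation where an MD5-only digest can no longer be checked.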

Besides the currently obsolete things, new things need to be built with the mindset that all crypto will become obsolete over time, and avoid putting it into new places where it only gets in our way eventually.

DemiMarie commented 3 years ago

Besides the currently obsolete things, new things need to be built with the mindset that all crypto will become obsolete over time, and avoid putting it into new places where it only gets in our way eventually.

I suggest avoiding algorithm agility as much as possible. It is great in theory, but in practice, it leads to a bunch of extra complexity, which in turn causes exploitable vulnerabilities. The current header parsing code is already far too complex.

Instead, choose one ― and only one ― set of algorithms. Drop support for all the others. And change the file version number when an algorithm change is needed. That’s what signify, age, WireGuard, and most other new cryptographic protocols do.

Conan-Kudo commented 3 years ago

I suggest avoiding algorithm agility as much as possible. It is great in theory, but in practice, it leads to a bunch of extra complexity, which in turn causes exploitable vulnerabilities. The current header parsing code is already far too complex.

Instead, choose one ― and only one ― set of algorithms. Drop support for all the others. And change the file version number when an algorithm change is needed. That’s what signify, age, WireGuard, and most other new cryptographic protocols do.

This would be a pretty bad idea for any archive format. That would mean we'd be making incompatible revisions of RPM very frequently, which would cause a whole host of problems as things move forward. Algorithm agility is the only reason that new algorithms are easily adopted at all. If you decide to make algorithm selection part of the protocol/file format itself, you wind up with a different trap: the inability to upgrade. I think we can all agree that is a much worse outcome.

Your examples are also sufficiently immature that there is no way that any of them have seen the consequences of those choices. WireGuard itself was only mainlined fairly recently, and has no current plan for dealing with the inevitable issue of cross-version (thus incompatible) client and server mixes existing in production. As for signify and age, any project from OpenBSD winds up being problematic because longevity, compatibility, and accessibility are not virtues held by the OpenBSD project.

It is important to recognize that security enhancements need to be balanced with usability and accessibility, otherwise nobody will use either for long. RPM has also been around for 25 years, and until very recently, all RPMs produced in that timeframe were still accessible by the latest version of RPM.

DemiMarie commented 3 years ago

This would be a pretty bad idea for any archive format. That would mean we'd be making incompatible revisions of RPM very frequently, which would cause a whole host of problems as things move forward. Algorithm agility is the only reason that new algorithms are easily adopted at all. If you decide to make algorithm selection part of the protocol/file format itself, you wind up with a different trap: the inability to upgrade. I think we can all agree that is a much worse outcome.

That’s understandable, and something I had not considered. Much of the complexity of the current format is not actually due to algorithm agility.

That said, versioned protocols do not prevent backwards compatibility. For example, v1 of a file format might use RSA signatures, v2 Ed25519, and v3 a hybrid of Ed25519 and CRYSTALS-KYBER (a post-quantum scheme). An implementation can support more than one version.
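A rough sketch of what "an implementation can support more than one version" could look like: each format version is tied to exactly one algorithm, and the reader dispatches on the version number. Everything here is hypothetical. To stay within the standard library, plain digests stand in for the signature schemes, and the version numbers are invented.

```python
import hashlib
import hmac

# Hypothetical: one fixed algorithm per format version, no negotiation.
# Digests stand in for signature schemes to keep the sketch stdlib-only.
VERIFIERS = {
    1: lambda payload: hashlib.sha256(payload).hexdigest(),
    2: lambda payload: hashlib.sha512(payload).hexdigest(),
}

def verify(version, payload, expected):
    """Check payload using the single algorithm tied to `version`."""
    try:
        compute = VERIFIERS[version]
    except KeyError:
        raise ValueError(f"unsupported format version: {version}") from None
    return hmac.compare_digest(compute(payload), expected)
```

Retiring an old version then amounts to deleting its entry from the table, at which point files of that version are rejected outright.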

Your examples are also sufficiently immature that there is no way that any of them have seen the consequences of those choices. WireGuard itself was only mainlined fairly recently, and has no current plan for dealing with the inevitable issue of cross-version (thus incompatible) client and server mixes existing in production. As for signify and age, any project from OpenBSD winds up being problematic because longevity, compatibility, and accessibility are not virtues held by the OpenBSD project.

I suspect that the upgrade plan is to run both v1 and v2 services for a while, and eventually stop the v1 service.

It is important to recognize that security enhancements need to be balanced with usability and accessibility, otherwise nobody will use either for long. RPM has also been around for 25 years, and until very recently, all RPMs produced in that timeframe were still accessible by the latest version of RPM.

That’s understandable.

Conan-Kudo commented 3 years ago

That’s understandable, and something I had not considered. Much of the complexity of the current format is not actually due to algorithm agility.

That said, versioned protocols do not prevent backwards compatibility. For example, v1 of a file format might use RSA signatures, v2 Ed25519, and v3 a hybrid of Ed25519 and CRYSTALS-KYBER (a post-quantum scheme). An implementation can support more than one version.

Right, but the issue is that there is no possibility of forward compatibility. Right now the RPM format is fairly good at forward and backward compatibility. Barring specific usage of features RPM can't handle (which is gated with specific details encoded in the RPM header), RPMs produced by newer versions can be installed by older versions. And some "new" features can be ignored by older versions of RPM without penalty (e.g. weak dependencies on RHEL 7 are ignored without issue).

At least in this specific case, we already encode what crypto is being used in the RPM archive, and if RPM doesn't know what it is, it won't do anything. That's effectively the same as a versioned protocol; it's just a versioned attribute in the format without changing the whole format. And that gives the opportunity to backport the support to older systems, which has been done before in other circumstances.
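The "versioned attribute" idea can be sketched as follows. The numeric IDs are the OpenPGP hash algorithm IDs from RFC 4880 (8 for SHA-256, 10 for SHA-512). The function and its three-way return value are assumptions made for illustration, not rpm's actual API.

```python
import hashlib
import hmac

# OpenPGP hash algorithm IDs (RFC 4880) mapped to hashlib names.
KNOWN_ALGOS = {8: "sha256", 10: "sha512"}

def check_digest(algo_id, expected_hex, data):
    """Return True or False for a known algorithm, None for an unknown ID."""
    name = KNOWN_ALGOS.get(algo_id)
    if name is None:
        # Unknown algorithm: nothing can be verified, so report that
        # instead of guessing.
        return None
    actual = hashlib.new(name, data).hexdigest()
    return hmac.compare_digest(actual, expected_hex)
```

Backporting support for a new algorithm to an older system then means adding one table entry, with no change to the surrounding format.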

Changing the whole format for this would make it pretty difficult to do that, since that would increment the RPM major version.

I suspect that the upgrade plan is to run both v1 and v2 services for a while, and eventually stop the v1 service.

If WireGuard worked purely in user-space, that would be possible. But it doesn't. It's a kernel-level security service, and that means that in order to do what you say, you have to run different versions of the Linux kernel, meaning older Linux distributions. This is a serious downside of the WireGuard system as it currently stands.

DemiMarie commented 3 years ago

Right, but the issue is that there is no possibility of forward compatibility. Right now the RPM format is fairly good at forward and backward compatibility. Barring specific usage of features RPM can't handle (which is gated with specific details encoded in the RPM header), RPMs produced by newer versions can be installed by older versions. And some "new" features can be ignored by older versions of RPM without penalty (e.g. weak dependencies on RHEL 7 are ignored without issue).

At least in this specific case, we already encode what crypto is being used in the RPM archive, and if RPM doesn't know what it is, it won't do anything. That's effectively the same as a versioned protocol; it's just a versioned attribute in the format without changing the whole format. And that gives the opportunity to backport the support to older systems, which has been done before in other circumstances.

Changing the whole format for this would make it pretty difficult to do that, since that would increment the RPM major version.

That’s a good point. My main worry is that there could be a parsing bug in RPM that either:

If WireGuard worked purely in user-space, that would be possible. But it doesn't. It's a kernel-level security service, and that means that in order to do what you say, you have to run different versions of the Linux kernel, meaning older Linux distributions. This is a serious downside of the WireGuard system as it currently stands.

What I meant is that the WireGuard kernel module could support both versions of the protocol for a while, and eventually drop support for the obsolete version. I do not believe that WireGuard intends to support products that are no longer receiving updates.

pmatilai commented 3 years ago

It is important to recognize that security enhancements need to be balanced with usability and accessibility, otherwise nobody will use either for long. RPM has also been around for 25 years, and until very recently, all RPMs produced in that timeframe were still accessible by the latest version of RPM.

I don't remember anything in this regard in recent times. @Conan-Kudo , what are you referring to here?

@DemiMarie , nobody is going to disagree on header parsing code being ridiculously complicated. I streamlined it a lot in the 4.14.x cycle so that a) there's one code path (instead of three), b) install and signature checking agree on whether something passes or not, and c) we can now check the signature before loading the header to be checked.

But that's getting off track. The thing is, there can never be "only one" set of algorithms in rpm. The initial design did just that, and that's why we're still forced to deal with MD5 as a required field in packages produced a decade after MD5 was declared obsolete. The rpm lifespan and the consequences it has is something very few people realize.

For example, with the simple header-only digests and signatures, it's not that big a deal if there are two or three generations of them with different algorithms. But per-file hashes are so expensive there can only be one, and when people need to build across different versions, sometimes targeting a version released 15 years ago, it has to be configurable. It's a complex tradeoff with a tonne of historical baggage to be lugged, and what we need instead of "one true set" is a mechanism which allows us to deal with the inevitable churn over time - algorithms come and go, rpm the dinosaur stays :sweat_smile:
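One possible shape for such a mechanism, sketched under loud assumptions: the policy table, the status names and both helpers are invented for illustration and do not reflect anything rpm implements. The idea is that an algorithm moves from "current" to "legacy" to "disabled" over time, while a build still picks exactly one per-file digest that its target can understand.

```python
# Hypothetical policy table, strongest first; all names are illustrative.
POLICY = {
    "sha512": "current",
    "sha256": "current",
    "sha1": "legacy",    # still verified; built only when nothing newer fits
    "md5": "disabled",   # neither built nor trusted
}

def can_verify(algo, policy=POLICY):
    """Unknown algorithms are treated the same as disabled ones."""
    return policy.get(algo, "disabled") != "disabled"

def pick_file_digest(target_supports, policy=POLICY):
    """Pick the single per-file digest for a build aimed at one target."""
    for algo, status in policy.items():  # dicts keep insertion order
        if status != "disabled" and algo in target_supports:
            return algo
    raise LookupError("target supports no acceptable digest")
```

A build for an old target that only understands MD5 and SHA1 would get `sha1` here, while a target that understands only MD5 would fail loudly instead of silently producing an untrusted package.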

Conan-Kudo commented 3 years ago

I don't remember anything in this regard in recent times. @Conan-Kudo , what are you referring to here?

Ah, I was mistaken, we haven't ripped out RPM v3 format support just yet, we only deprecated it in ba385ec5b7f4340a4f9b6815efd0f1a9521a0b15. But removal of LSB/v3 support is coming...

pmatilai commented 3 years ago

Okay, in that case we agree :smile:

I think the "nice" way of killing v3 support is letting the obsolete crypto those packages use make them effectively uninstallable due to being unverifiable. That would actually already be the case, if it wasn't for the MD5 header+payload digest being the only available non-signature means of verifying the payload in much of rpm 4.x too, in every version before 4.14. It's configurable already though.

DemiMarie commented 3 years ago

@pmatilai we can also drop support for parsing v3 packages, which will help reduce our attack surface.

DemiMarie commented 3 years ago

But that's getting off track. The thing is, there can never be "only one" set of algorithms in rpm. The initial design did just that, and that's why we're still forced to deal with MD5 as a required field in packages produced a decade after MD5 was declared obsolete. The rpm lifespan and the consequences it has is something very few people realize.

That is actually a very good point. The versioned protocol approach would be to drop support for old versions of RPM soon after new versions were released. That is incompatible with long-term archival storage, for example.

For example, with the simple header-only digests and signatures, it's not that big a deal if there are two or three generations of them with different algorithms. But per-file hashes are so expensive there can only be one, and when people need to build across different versions, sometimes targeting a version released 15 years ago, it has to be configurable. It's a complex tradeoff with a tonne of historical baggage to be lugged, and what we need instead of "one true set" is a mechanism which allows us to deal with the inevitable churn over time - algorithms come and go, rpm the dinosaur stays

Indeed. The requirements of the enterprise world are extremely different from those of fast-moving open-source projects, technology companies, and websites. RPM is not in a position to say, “Here is a new (incompatible) major version; the old version will be EOL one year later.” That also explains why X.509 and PKCS#7 have been brought up: they support features such as revocation and countersignatures, which are extremely useful for long-term archiving.

My suggestions would be:

rpm-maint commented 3 years ago

On 2021/01/05 03:17, Neal Gompa (ニール・ゴンパ) wrote:

But removal of LSB/v3 support is coming...


I was wondering what the cost would be for NOT deleting the old format, but leaving it in. The reason I ask is that it seems the rpm format is fracturing.

This makes it harder for one distro to read or use rpms from another vendor, and harder for one version of rpm to read or install older or newer rpms.

If newer versions of rpm could read both older and newer rpms, it would help that compatibility.

In the above case, does that mean opensuse will no longer support the Linux Standard Base, or that rpm will no longer support it?

It's not really clear to me what removing LSB/v3 support will entail.

pmatilai commented 3 years ago

Rpm v3 is 20 years obsolete, except nominally for LSB. For cross-distro compatibility, none of that matters one iota. The big joke in all this is that no rpm version from the last 20 years produces output that is actually compatible with LSB.

DemiMarie commented 3 years ago

@pmatilai Would a good first step be to make this subject to system security policy?

pmatilai commented 3 years ago

I'm not sure what you mean by that. At least with the openssl backend, whatever system policy is set is already honored - including FIPS, which in fact does cause v3 packages (and packages built before 4.14 too) to fail to install at the default "digest" verifylevel, as there are no usable digests to verify. People are running into this quite a bit in RHEL 8.
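A simplified model of why such packages fail, assuming (as described above) that the "digest" level needs at least one digest that can actually be checked. The function name, its parameters and the exact pass/fail rule are assumptions for illustration and are not taken from rpm's code.

```python
import hashlib
import hmac

def verify_digest_level(digests, data, usable):
    """Pass only if at least one digest is usable and all usable ones match.

    digests: mapping of algorithm name to expected hex digest
    usable:  set of algorithm names that system policy allows
    """
    checked = 0
    for algo, expected in digests.items():
        if algo not in usable:
            continue  # e.g. MD5 when the system policy forbids it
        actual = hashlib.new(algo, data).hexdigest()
        if not hmac.compare_digest(actual, expected):
            return False
        checked += 1
    # A package carrying only forbidden digests is unverifiable.
    return checked > 0
```

With `usable={"sha256"}`, a package that carries only an MD5 digest returns `False` without any hash being computed, which mirrors the install failure described above.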

pmatilai commented 5 months ago

Obsolete crypto tags are gone from v6 packages in #3017; what remains to be done is disabling validation on those by default.