rpm-software-management / librepo

A library providing C and Python (libcURL like) API for downloading packages and linux repository metadata in rpm-md format
http://rpm-software-management.github.io/librepo/
GNU Lesser General Public License v2.1
76 stars 92 forks source link

Instability when updating repo and using repo_gpgcheck #311

Open sdherr opened 6 months ago

sdherr commented 6 months ago

Description

There currently exists a timing window that can lead to client errors if you are using signed repo metadata with repo_gpgcheck=1. When you are using that option of course the server provides both the normal repomd.xml file and also a detached signature of it in repomd.xml.asc. Even if the server swaps out both files atomically on a repo update, an unlucky client can have received the old version of repomd.xml and the new version of repomd.xml.asc, if its requests happened to come on either side the update instant. The signature will then fail to verify and the client will return an error. Furthermore if the server is using any type of caching system or CDN, this mis-match can get persisted in the cache up to the TTL of the files if you are unlucky, causing many client errors and making the repo essentially unavailable for a period of time.

Problem

Unlike the other metadata files, there is no hash information in the name of repomd.xml.asc, or any other indication of what version of the file you want. This is likely because there is a chicken-and-egg problem of writing this info into repomd.xml: the file must exist first before you can sign it or calculate its hash, and the act of adding that information back in to repomd.xml invalidates the signature or changes the hash.

I furthermore assume that this is a detached signature in the first place instead of simply clearsigning repomd.xml to keep it one atomic file because there were worries about backwards-incompatibility with old clients.

Solution

Any change which would indicate which version of repomd.xml.asc that you need to the server is a potential fix, but I think there are three main possibilities:

  1. Changing the name of the file like you do with other metadata files, with the name calculated client-side somehow.
  2. Adding a header to the request.
  3. Adding a query parameter to the request.

In particular for backwards compatibility and ease-of-update, I suggest we do number 3, simply take the Last-Modified header value on the repomd.xml response, and add it as a query parameter on the repomd.xml.asc request: .../repomd.xml.asc?repomd_last_modified=<header value>.

  1. Almost every webserver simply ignores extra unexpected query params, so this should not break anything. On the off chance that it does and a server responds with a 4xx HTTP code, DNF could simply revert back to today's behavior and request the file without the query param.
  2. By contrast, caches / CDNs typically do include the query params as part of the cache key by default.

By including the information on which repomd.xml file you have in the cache key for repomd.xml.asc we are creating different cached versions of the file, and the client will automatically receive the correct one and proceed successfully if these cache keys are already populated. If there is no cache involved and a server is responding to every request directly, at least they are in no worse position than they were before, and they can teach their server to respond appropriately to the query param if they wish.

There is still a possible timing window here where bad pairs can get cached if both cache TTLs time out and a client re-fills the cache with an old version of repomd.xml and a new version of repomd.xml.asc?repomd_last_modified=123 from a server that is ignoring the query string. However this gives servers the ability to close this gap permanently and respond correctly to every request, by doing either:

  1. Teach the webserver to respond correctly to the query param, or
  2. Increase the cache TTL on */repomd.xml.asc?repomd_last_modified=* to more than double the TTL of repomd.xml, eliminating the overlapping timing window where it's possible to cache incompatible versions of the files.

Related

I have filed similar bugs against tdnf and zypper: https://github.com/vmware/tdnf/issues/475 https://github.com/openSUSE/zypper/issues/549

There also appears to be discussion about just in-lining the signature in repomd.xml and dropping repomd.xml.asc, which would be an alternate way to solve this problem: https://github.com/rpm-software-management/rpm-metadata/issues/1

jrohel commented 4 months ago

I have looked at the problem and propose 2 solutions. They both try for forward and backward compatibility.

1. Detached signature solution:

The repomd.xml file contains a revision element. By default, this element contains UNIX time. Example: <revision>1721803795</revision> We name the signature file <revision>-repomd.xml.asc. So for the above example, 1721803795-repomd.xml.asc.

Compatibility:

If the new file does not exist (old repository), the client will request the currently used repomd.xml.asc file as a fallback -> The new client will work with the old repository. The new repositories will continue to distribute the repomd.xml.asc file. So the old client will work with the new repository.

Note:

The user can define a custom value for the <revision> element (createrepo_c argument --revision=<something>), but as long as the revision is unique to a particular version/revision of the repomd.xml file it will work. In the worst case, we get the current situation.

2. Solution with inline signature

The signature will be added to the end of the repomd.xml file as a comment containing the signature element.

Example:

<?xml version="1.0" encoding="UTF-8"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo" xmlns:rpm="http://linux.duke.edu/metadata/rpm">
  <revision>1721764685</revision>
...
</repomd>
<!--
<pgp_signature>
iQIzBAABCAAdFiEEpfSDzXM6TrrqN4sq6Il5+5swrPIFAmagC04ACgkQ6Il5+5sw
rPJe0Q//dm23fki6M6GZkV+oeVf+u9ZrCrLNfVXn1qN28bJUnV31azui0DBMoYYs
3RYGzUjBU6dgruWI5RjCxV1Icfs/7DWRvFSLhj9BO490X3nEXB9OaMjvaEL/NPnf
yQzQZ8PITYyxWSdimY/Kmdj+0hLk6NYqPwNZVbfa3+0KystWmVNcKkpdTMuBr4Gh
xXUDMasE9gTQQbiz3XFoG8MLrgAjyTL6FdGuuioN/Saq0c8zQ9G2do9jQNBhHy3B
7D2bK5s+JWIaKW95w8ASC11mx+1KHqA523+7ElI4ihY4mENCaVlYbpBlEPWP+4eq
gK/5Njnf1IVix0DtX69C9JFzC4bD/kEHqVNoUJuAtt66YuHd8lzjEHNRr95iqvwx
317+UJxqT0tFDbIaMrlWozufl8stY1+aNsPY2196k13vI7CqAhF3G1X5qH5rnFqA
mTI7YnxbOnCpVLNc0Y/TdSRuXX9r9H1Rj4edMG6Mcc9fnShGFwAcvKI5hma1+aj6
cTRHfq8PSHcX93+jVk3fpnJIJJ7WFkzcLEDmI5+Pin2V5lcqdQa6wrirvcnND75+
LPqKyJc398KrC8iGBW1AzRMbdj+nm4krPr/9ymZeayeWL4OMmEbCzXo/lBQfPpC2
aqldYWQIxpDyfgnXSK5IKg+51p1KA+dMUiI3DqSiQ8o8in/wPXY=
=KUBi
</pgp_signature>
-->

Compatibility:

If the comment with the signature does not exist (old repository), the client will request the currently used repomd.xml.asc file as a fallback -> The new client will work with the old repository. The new repositories will continue to distribute the repomd.xml.asc file. And the actual signature element is in the comment. So the old client finds the original signature file and ignores the comments. This ensures that the old client is functional with the new repository.

Note:

The signature in the repomd.xml.asc file will be different from the inline signature. Because it will sign the whole repomd.xml file including the inline signature.

sdherr commented 4 months ago

I have no problem with either of those solutions. I have a slight preference for 1, renaming the repomd.xml.asc file to include the revision, as that has a nice similarity to the existing metadata file names and seems simpler.

ppisar commented 4 months ago

I think in the 2nd solution with in-line signatures you are reinventing XML-Sig https://en.wikipedia.org/wiki/XML_Signature and believe me it's not a simple task (basically because of how physical XML model handles whitespaces, quotes etc.). I recommend detached signatures. If you felt compelled with in-line signatures, please use an existing library for XML-Sig, e.g. xmlsec.

jrohel commented 4 months ago

To be honest. I also prefer the detached signature solution.

But there is an old issue https://github.com/rpm-software-management/rpm-metadata/issues/1 suggesting inline signature support. So I wrote a proposal to achieve this relatively easily and compatible with older clients. Maybe I shouldn't have even mentioned this proposal.

I also expect that my proposal with detached signature may provoke criticism from some. The point is that first we need to parse repomd.xml to get the value of the revision element. This is needed to get the name of the signature file. So the repomd.xml file will be parsed before verifying its signature. Which is not a problem, if we then find that the signature does not match, we will not use the file further. However, someone may say. What if the xml parsing library contains a bug and someone replaces repomd.xml with another one exploiting this bug and we find out too late - we parse before verifying the signature.

But that's about understanding what the signature is supposed to do. Whether just to verify the content of the repodata or also to protect against exploiting bugs in the xml parsing library.

Maybe one day we will reach a point where the package manager (e.g. dnf) will be split into multiple processes and most of the work, including downloading data and processing/parsing it, will be done in a restricted unprivileged process.

ppisar commented 4 months ago

If you fear parsing untrusted XML, you can save the revision value into a new, plaintext file, let's called it "latest" and download that "latest" file first. Then repomd.xml and then $REVISION-repomd.xml.asc.

However, that won't safe you from the problem described in the original report: The client could download old "latest" file, then all files would get swapped, and then the client would download new, unrelated repomd.xml and get an error when retrieving nonexistent $REVISION-repomd.xml.asc.

At any rate, the client will result into an error and will need to decide whether to try a new mirror, to retry the same mirror, or raise an error to the user.

In my opinion, the problem does not have solution because there is no reliable way of downloading all files as existed at a time. The root cause is that a repository mirroring protocol (HTTP, file) we use is not a transactional database system.

In practical way, we should rather aim at better resilience of the client. I.e. retry the downloads if the signature does not match and raise an error, or try a different mirror, after a second unsuccessful verification.

The originally proposed solution with HTTP headers and URL query parameters is too specific to HTTP transport. We need to support other protocols.