dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogeneous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

(secure) APT repo for dCache #2356

Open calestyo opened 8 years ago

calestyo commented 8 years ago

Hey.

It came up during the workshop that dcache.org might provide an APT repo with the dCache packages... Let's use this ticket to collect ideas for this.

1) Software: There are several software systems for running a Debian repo. I have one at LMU and use debarchiver (https://packages.debian.org/sid/debarchiver), which offers secure APT (i.e. a signed repo) and maintains the repo as a normal tree in the filesystem, which then needs to be exported, e.g. via HTTP or FTP. The intended way of bringing packages into debarchiver is via Debian's normal tools for uploading source packages (e.g. dput), but currently dCache doesn't have source packages. Luckily, one can simply put the binary packages into the file tree managed by debarchiver and run an update; it will pick up everything.

So the basic way of setting this up would be:

3) configuring debarchiver

A corresponding ~user/.dput.cf of the user who would upload such packages would look like this:

[lmu-lcg-tier-2-local]
fqdn = localhost
method = local
incoming = /srv/lmu-lcg-tier-2-package-archive/incoming
allow_unsigned_uploads = 0

if uploading from the same host on which debarchiver runs.

#!/bin/sh

# initialise and secure the shell execution environment
unset -v IFS
PATH='/usr/sbin:/sbin:/usr/bin:/bin'

su --login --shell /bin/sh --command "debarchiver --autoscanall -x" debarchiver
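The wrapper above would typically be run from cron; a hypothetical crontab entry (the script path and schedule are assumptions, not anything agreed in this thread) could look like:

```shell
# /etc/cron.d/debarchiver -- hypothetical path and schedule:
# run the debarchiver scan wrapper every 10 minutes as root
*/10 * * * *  root  /usr/local/sbin/run-debarchiver
```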

3) SECURELY distribute your repo key to people (e.g. at a dCache workshop). Ideally use a 4096-bit RSA key. It doesn't need to expire.

People would add this key to their trusted APT keys with apt-key add.
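Clients would then reference the repo in their APT configuration; a hypothetical sources.list entry (the URL, suite name and component are placeholders, since no actual repo layout has been decided) might be:

```shell
# /etc/apt/sources.list.d/dcache.list -- URL, suite and component are hypothetical
deb https://dcache.org/apt stable main
```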

2) Repo suites, version numbers: Debian has the concept of repo suites, e.g. the well-known oldstable, stable, testing, unstable and experimental (or their respective codenames). Any other names can be freely configured in debarchiver. For us the Debian scheme doesn't make sense, as a) we have multiple "stable" versions (i.e. the most recent minors of all currently supported major versions) and b) we have no idea which major version a site wants to track. I would suggest using something like the following suites:

One could think about additionally:

Or one could even split "main"/"stable" into:

Having a suite per major would be possible, e.g. "v213, v214", etc., but that's ugly and has the problem that there is no notification if you simply stop updating a release... so: bad idea.

In any case, I strongly advise against making the version number part of the package name. It's unclean, especially as in the dCache case it still wouldn't allow one to run multiple different versions on one node, as e.g. the python2.7 and python3.5 packages (which contain the version as part of the name as well) do. So the package name should stay "dcache", as it is.

3) apt_preferences(5): I mentioned before that we cannot know which major (and minor) version a specific site wants to follow. Just because we upload a new major (or even minor) version, these sites likely don't want an automatic update; however, they probably still want the package management to notice that a new version is there, so that e.g. Icinga's check_apt can tell them about security updates. A way to handle that is the apt_preferences mechanism, extensively documented in its manpage. APT is generally much more powerful than RPM here: it basically allows one to control how APT selects the candidate version, ranging from forbidding specific packages/versions, over having them updated only up to a certain version, to having them automatically installed without any user interaction at all. I think we can use that perfectly well to allow sysadmins to track a specific version. All we'd need to do is tell sites how to do so (with examples), and that they likely MUST use it, because otherwise they get updated to the most recent 2.16.x version, or whatever we add to the repo with the highest version number.

It would look like this example file, /etc/apt/preferences.d/dcache.pref (note that APT only reads files in preferences.d with no extension or the .pref extension):

Explanation: Select the site's desired version of dCache.
Package: dcache
Pin: version 2.15.*
Pin-Priority: 600

to assign the priority 600 to all packages named dcache (from ANY repo in the system) and any version matching "2.15.*". That basically means: once we add a newer minor version of 2.15, it gets the same priority as the other 2.15.* versions (one of which is already installed), and thus the candidate version would be the most recent one of these (i.e. the one just added). This doesn't mean that APT will actually start to upgrade automatically, but it means that if the user says "apt-get upgrade", it will be upgraded.

If a user doesn't want that, they simply select the exact version, e.g.:

Explanation: Select the site's desired version of dCache.
Package: dcache
Pin: version 2.15.3
Pin-Priority: 600

Now only 2.15.3 would get the higher priority, so even when there's a 2.16 or a 2.15.9, they would have lower priority and not become the candidate, even though their version is higher.
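Whether a pin actually took effect can be checked with apt-cache policy, which prints the installed version, the candidate version and the pin priorities per source (output shape depends on the locally configured repos, so none is shown here):

```shell
# inspect APT's version selection for the pinned package
apt-cache policy dcache
```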

As for the priorities: the priority for the versions from our repo is in our case 500, at least it should be when configured as explained above; how the repo-default priority is selected is actually a bit complex, read the manpage. It may be a good idea to manually set a default priority for all packages/versions from our dCache repo, which would look like:

Explanation: Select default priority for all packages/versions from the dCache repository.
Package: *
Pin: release o=Ludwig-Maximilians-Universität München
Pin-Priority: 600

o being the Origin field as configured in debarchiver.conf

The prios have these meanings:

   P >= 1000
       causes a version to be installed even if this constitutes a downgrade of the package

   990 <= P < 1000
       causes a version to be installed even if it does not come from the target release, unless the installed version is more recent

   500 <= P < 990
       causes a version to be installed unless there is a version available belonging to the target release or the installed version is more recent

   100 <= P < 500
       causes a version to be installed unless there is a version available belonging to some other distribution or the installed version is more recent

   0 < P < 100
       causes a version to be installed only if there is no installed version of the package

   P < 0
       prevents the version from being installed

   P = 0
       has undefined behaviour, do not use it.

So normally for our users, a priority in [500; 990) above the default priority is the way to go (which is why I chose 600), but some may even want to use a higher one, to enforce a downgrade of a locally installed newer version.

Obviously the whole priorities stuff would need to be tried out... I'd volunteer for that :)

Cheers, Chris.

calestyo commented 8 years ago

ping @jstarek

jstarek commented 8 years ago

Just wanted to let you know this is not forgotten; I was just busy with other stuff and will be on vacation until June. I'll comment here when I get around to setting it up.

calestyo commented 8 years ago

sure, no worries ;) have a nice holiday

calestyo commented 7 years ago

anything new on that? or is that dead and can be closed...

calestyo commented 6 years ago

anything new on that? or is that dead and can be closed...

paulmillar commented 6 years ago

Yes, certainly having yum and apt repos is something we want to support "at some point". I'm actually working on something right now that would facilitate this, in a somewhat orthogonal way.

For the crypto-signed packages part, it might be worth implementing, but I'm not sure where the benefit lies (where's the attack surface?). What risk do we mitigate by having signed packages that is not already mitigated by downloading from a website with TLS and the corresponding PKI? Especially given the high level of automation that would be needed for this signing to be workable.

calestyo commented 6 years ago

If you provide me a VM or so with Debian, I can set up a repo server for you. All you'd need to do then is put the binary packages (yes, this works even without proper source packages ;) ) in some directory, and voilà. It needs Apache (or some other webserver, but I can only configure Apache).

Well... never trust X.509 PKI ;-) Seriously (still: never trust X.509 PKI)... having signed packages (actually it's the Release files which are signed) is simply the native way for Debian to handle packages. By default, apt/aptitude/etc. won't even accept unsigned repos.

And while it can be done (having unsigned repos in Debian), and while one can have an HTTPS transport for APT, this would be pretty unsafe: clients typically trust all CAs of some bundle (e.g. the Mozilla bundle), and not just Deutsche Telekom, which is somehow the ancestor of DESY's TLS certs. So the average user would also trust any possible packages from such "trustworthy" CAs as WoSign, CNNIC, TurkTrust and more, which are all known to have previously issued forged certs (most likely for the governments of the repressive countries they come from). ;-)

The best thing would be to have a proper OpenPGP key for dCache, which could e.g. also be used to sign the git repos and give people a way to securely gather the sources.

paulmillar commented 6 years ago

So, basically it comes down to how to solve the 'public key distribution' problem. Note that the distribution of PGP/GPG public keys is also problematic, but with different characteristics and failure modes. So, overall, I'm not greatly convinced.

However, as GPG signing the packages is the expected behaviour (by apt et al), we'll make sure to add a package signing step by the time we implement our apt repo.

calestyo commented 6 years ago

First, I think in our "special community" it would be rather easy to get keys distributed properly (i.e. in a trustworthy way): each year we have at least the dCache workshop, and probably plenty more occasions where we meet each other (and quite a few sites and major users of dCache join in).

And even if a user doesn't meet you personally for a proper key exchange, it's no more problematic than the distribution of the X.509 CA certs, which all users get with their browsers, which they download from "somewhere", and which is also completely outside of any trust path.

But even then, OpenPGP is much safer, as I've outlined before: without a proper key exchange, OpenPGP would still be at least TOFU (trust on first use), while X.509 means trusting anew on each and every access... and users will typically trust ~100 root CAs in the bundle, plus several thousand not less questionable intermediate CAs.

Putting aside the theoretic questions of key distribution (in which OpenPGP clearly wins, because you don't have countless intermediate trust points in between, each of which is trusted again on each and every access), it simply comes down to this: OpenPGP package signing is the native way for both APT and YUM. This already makes it the winner. ;-)

My offer still stands: give me a Debian stretch machine with root rights, accessible for me via SSH and to the public via 80/443, and I'll set up an APT repo for you. All you guys need to do is drop the packages there and advertise it. I'm sure everyone will use it, and one can just phase out the downloads website (at least for deb/rpm).

btw: such a machine should have some disk space (since I'd suggest that we put simply all packages into the archive, so people can also downgrade if they want; my own APT repo has packages going back to 1.8, or whenever you started to make .debs)...

If it gets things going, I'd even try (no promises of success) to set up a YUM repo (the necessary tool, createrepo, is packaged in Debian)... because I guess I'm not the only dCache admin who hates downloading and distributing packages to n nodes.
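For the YUM side, a minimal sketch of what that would involve (paths, URLs and key location are hypothetical; createrepo must be installed, and the commands need root):

```shell
# build YUM metadata (repodata/) for a directory of RPMs -- hypothetical path
createrepo /srv/dcache-yum

# a matching client-side repo definition -- URL and key location are made up
cat > /etc/yum.repos.d/dcache.repo <<'EOF'
[dcache]
name=dCache
baseurl=https://dcache.org/yum/
enabled=1
gpgcheck=1
gpgkey=https://dcache.org/dcache-archive-key.asc
EOF
```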

calestyo commented 6 years ago

Oh, and btw: if you trust X.509 now, you could simply put an OpenPGP key on the website for download. Anyone who trusts X.509 and its thousands of CAs will have no reason to trust the OpenPGP key downloaded from there any less. :-)

calestyo commented 6 years ago

ping @jstarek

rptaylor commented 5 years ago

I agree that the RPMs should be signed (this is generally universal practice for software these days).

The initial distribution of the public GPG key should not be too hard. Just make it available for download on the website and attest its authenticity by disseminating the fingerprint of the key (preferably the full-length fingerprint) through multiple trusted, independent channels: via HTTPS on the dcache.org website; verbally, in person and on slides at dCache workshops; by email on the dCache mailing list; via a broadcast from the EGI Operations portal (which I believe requires authn based on the DN of your browser's cert and authz based on your role in GOCDB; note that although the broadcast is delivered by email, it can also be viewed over HTTPS on the portal); etc.
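Checking such a fingerprint on the client side could be as simple as the following (the key filename is a hypothetical placeholder; --show-keys requires a reasonably recent GnuPG 2.x):

```shell
# print the full fingerprint of a downloaded key WITHOUT importing it,
# then compare it by eye against the fingerprint obtained out of band
gpg --show-keys --with-fingerprint dcache-archive-key.asc
```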

As @calestyo commented, one important benefit is that the key is only imported once, and any subsequent change or verification failure would be immediately noticed - and, rightly, with alarm. Whereas each and every time you access content via HTTPS is another opportunity for an attacker to attempt a MITM.

In either case, it is game over if an attacker gets your private GPG key or private TLS cert. However, I believe one of the premises that typically motivates this approach is that attacking the TLS/PKI model that secures the content delivery channel (HTTPS) is easier than attacking the GPG signing key model that verifies the integrity of the content itself (the RPM). Since the web servers are by definition exposed to the web there is a larger attack surface, as well as the issue of disreputable CAs. On the other hand, one would imagine that a build system could (or should) be more internal and protected, and the compiled and signed software would be automatically transported from this protected internal system to the web server for public distribution. One could further imagine that the private GPG key(s) could be stored in an even more fortified location and/or protected with physical MFA device(s) (someone would have to push a button once a day to complete the automated build process, each team member could have a key and MFA device for redundancy, etc.).

As for the status quo (an MD5 hash served via HTTPS): it is better than nothing, but MD5 has been known to have security weaknesses for more than 20 years and has been fully broken for any security-related purpose for 10 years. https://en.wikipedia.org/wiki/MD5#Overview_of_security_issues

rptaylor commented 5 years ago

As for a yum/apt repo, I agree this would be extremely useful as well, especially with regard to configuration management. Getting Ansible or Puppet to install an RPM from a local file is a bit of a hassle on its own (which I believe is intentional; there are no good methods to do it that way because you're not supposed to do it that way). Automating the process of downloading and verifying the RPM, or otherwise transporting it from a version control system, is even more of a hassle. At that point it starts to become practical to set up your own yum repository so that you can automate your installations, but since many people use dCache it would be convenient if the RPMs were provided via a yum repo in the first place.

I know there are some dCache RPMs in the UMD repo, but I'm not sure how frequently they are updated.

calestyo commented 5 years ago

> As for the status quo (md5 hash served via HTTPS), it is better than nothing but md5 has been known to have security weaknesses for > 20 years and has been fully broken for any security-related purpose for 10 years.

Well, the downloads themselves are now TLS-"protected", and the hash sums on the website are therefore mostly useless anyway... OTOH, the trust model of X.509 is inherently broken... so no real security.

As for the repo... I wouldn't hold my breath to see this implemented ;-)

In the meantime, I for example run one locally, which basically contains all debs ever released in its stable branch. The selection of the desired version then goes via a file like this:

# cat /etc/apt/preferences.d/dcache.pref
Explanation: “Select” the current version of dCache at the LMU LCG Tier-2.
Explanation: A Pin-Priority of “1000” shall be used when a downgrade is desired.
Package: dcache
Pin: version 4.1.16-1
Pin-Priority: 500

Explanation: “Disable” all versions of dCache.
Package: dcache
Pin: version *
Pin-Priority: -1

It's not perfect, but that's mainly because the dCache deb archive isn't really suited for Debian in several respects. It works, however, and it should be trivial to get the same with one of the packaged archive managers (e.g. debarchiver) and some HTTP server in front of it.
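As a rough illustration of how little a flat (trivial) repo needs, a sketch using dpkg-scanpackages from the dpkg-dev package (all paths and the URL are hypothetical):

```shell
# generate the Packages index for a flat repo
cd /srv/dcache-apt                      # directory holding the .deb files
dpkg-scanpackages --multiversion . > Packages
gzip -kf Packages                       # clients fetch Packages.gz

# clients would then use a flat-repo sources.list line such as:
#   deb https://example.org/dcache-apt ./
```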

Cheers, Chris.

paulmillar commented 5 years ago

I believe the MD5 weakness applies when the attacker can choose some portion of the content, which isn't the case here. That said, yes, we should move to one of the SHA family of checksums.
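Publishing and checking a SHA-256 sum is straightforward with coreutils; a self-contained sketch (the file name is a made-up stand-in for a real package):

```shell
# create a dummy stand-in for a package file (hypothetical name)
printf 'dummy package payload' > dcache_dummy.deb

# publisher side: record the SHA-256 sum
sha256sum dcache_dummy.deb > dcache_dummy.deb.sha256

# client side: verify the download against the published sum
sha256sum -c dcache_dummy.deb.sha256   # prints "dcache_dummy.deb: OK"
```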

calestyo commented 5 years ago

As said above, from a security point of view, any hash sums presented on the website are completely useless, regardless of the algorithm.

Their trust is solely based upon trust in the X.509 cert hierarchy, which means for most users (who simply trust any cert shipped with their browser) that ultimate trust is put in ~150 different root CAs (including many from totalitarian countries, and CAs who have been caught several times forging certificates for their governments)... and several thousand intermediate CAs, all of which could issue a certificate for dcache.org. Most of these CAs base their "security" on some completely unauthenticated challenge/response checks (e.g. Let's Encrypt), so anyone who can spoof DNS, whois, or IP routing can get a valid certificate for nearly whatever they desire.

If security is desired, one should simply properly sign the packages with OpenPGP. Oh, and developers should also use OpenPGP with git, ideally for all commits, but at the very least for release tags.

Security-wise, replacing MD5 with SHA* is just a waste of time in this case.

Cheers, Chris.

PS: I think for quite a while now (since around 2010, or maybe even earlier?) there has been a theoretical preimage attack on MD5...

rptaylor commented 5 years ago

My reasoning was: given that an attacker has successfully intervened as MITM by exploiting the X.509 trust model (otherwise, why would we be talking about using the hash as an extra layer of assurance?), the attacker could indeed modify the content of the RPM, not only to inject malicious content but also to conduct a preimage attack to conceal the modification. (Or the RPM could be modified by other means, e.g. breaking into the web server.)

That being said, it appears that although the MD5 hash has cryptographic weaknesses, from what I could find it is still computationally impractical at this time to exploit the currently known preimage vulnerability.

Moreover, an attacker in MITM position with a forged certificate could of course just as easily present a different hash to the viewer.

All told my comment on the md5 hash was a bit of a digression, I hope it does not detract from the points regarding GPG signing and yum/apt repository.

calestyo commented 5 years ago

Sums alone (even if signed) are generally a bad way to distribute something securely: they allow e.g. for replay or downgrade attacks and other such things.

That's why one should generally use something like secure APT, which prevents this.
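For reference, what secure APT actually signs is the repo's Release file, which carries checksums of the package indices (which in turn carry checksums of the individual .debs), so stale or mixed-and-matched indices fail verification. A hedged sketch of the signing step, where "ABCD1234" stands in for the real signing key's ID:

```shell
# sign the Release file both inline (InRelease) and detached (Release.gpg);
# the key ID is a placeholder, and this assumes the key is in the local keyring
gpg --default-key ABCD1234 --clearsign -o InRelease Release
gpg --default-key ABCD1234 -abs -o Release.gpg Release
```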