Open ypid opened 8 years ago
@marmarek, what do you think?
The risk with SHA-1 is that its collision resistance is dangerously weak. Its second pre-image resistance is still very strong.
Indeed git-evtag looks interesting. Will look at it more closely later.
We use tags for all the source verification anyway (not only for release
tags, but also intermediate pushes), so it should be very easy to
integrate it into our workflow.
Best Regards, Marek Marczykowski-Górecki, Invisible Things Lab
@ypid Note that OpenPGP signatures are also vulnerable to SHA1 second-preimage attacks (the adversary can forge keys with specific keyids at that point). As @CodesInChaos pointed out, second preimages on SHA1 are not an immediate concern.
https://shattered.it/ !! We should really revisit this sooner rather than later.
Is git-evtag the best recommendation at this point?
If so, thoughts on adding docs about it to: https://www.qubes-os.org/doc/code-signing/ And enforcing it in: https://github.com/marmarek/signature-checker ?
@jpouellet Agreed. But note that this is not a preimage attack. Still, I think another attack is now practically possible. Consider an adversary who submits a PR including a file generated with a cryptanalytic SHA-1 collision attack. The commit, together with the authentic file, gets reviewed and enters the repository. Then, when building Qubes OS, GitHub returns the forged version of the file. The OpenPGP signature and git tree will still be valid and the file enters a Qubes OS build. Ref Security challenges for the Qubes build process which at least makes a targeted attack on this pretty difficult. Note that finding such a collision is one thing; making it actually plausible and compromising Qubes OS is a different story, I would say.
At least for the presented attack, it is possible to check whether a file was generated with a cryptanalytic SHA-1 collision attack using the released sha1dcsum tool. I just checked the Qubes OS sources against this and did not find a single such file (checkout and parts of git history; I would need to unpack the pack files first, which I did not do, see comment below):
$ find . -type f -print0 | xargs --null sha1dcsum | fgrep '*coll*'
(run in a Disposable VM of course :wink: ).
Edit: Improved command used for checking.
After looking into it all day, it does not appear to impact git's security immediately, except for targeted attacks against specific projects by very wealthy attackers. But we're well past the time when it seemed ok that git uses SHA1. If this gets improved into a chosen-prefix collision attack, git will start to be rather insecure.
Ref: https://git-annex.branchable.com/devblog/day_449__SHA1_break_day/
@ypid in case of git, you should not look at files but git objects (some are available directly in .git, but most of them are packed).
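To illustrate why the check has to look at git objects rather than working-tree files: git does not hash a file's bytes directly, it prepends a small header first. A minimal Python sketch for blobs (the function name git_blob_id is made up for illustration):

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with this content."""
    # git hashes "blob <size>\0" followed by the raw content, not the file alone.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    # Should match `git hash-object` on a file containing "hello world\n".
    print(git_blob_id(b"hello world\n"))
```

Because of that header, two files with the same plain SHA-1 do not automatically become two blobs with the same object ID; a collision aimed at git would have to be computed with the header in place.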
As for git-evtag, I've just given it a quick look and, crypto itself aside, the git integration seems to be very incomplete. For example, it supports only creating/verifying tags from the currently checked-out sources, without local modifications. So there is no way to verify an evtag before checkout (just after git fetch). Also, many arguments of git tag are missing in git evtag - for example, it's impossible to create a tag non-interactively (-m for message).
$ git evtag verify evtag1
error: Target d2ef17888a18316baebf2c2b68f516c689bdadfb is not HEAD (267c20cf8fb60490dcd32a682c92ff7b514e4ca1); currently git-evtag can only tag or verify HEAD in a pristine checkout
As a temporary solution, perhaps we might consider integrating the "collision detector" [1], presented by the authors of the attack, into qubes-builder? It's just not clear to me whether this tool can easily be used on actual hashes, rather than on the files which are to be hashed?
[1] https://github.com/cr-marcstevens/sha1collisiondetection
@ypid Note that OpenPGP signatures are also vulnerable to SHA1 second-preimage attacks (the adversary can forge keys with specific keyids at that point).
I think it isn't a problem if you use actual key file (qubes-builder/qubes-developers-key.asc for example), not only key IDs. Also hash used in gpg signature is configurable - we use SHA2 already.
As a temporary solution, perhaps we might consider integrating the "collision detector" [1], presented by the authors of the attack, into qubes-builder? It's just not clear to me whether this tool can easily be used on actual hashes, rather than on the files which are to be hashed?
Even if not, it shouldn't be hard to iterate over all git objects with git cat-file.
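A rough Python sketch of that iteration, assuming a git recent enough to support `git cat-file --batch-all-objects --batch-check`. The function names are made up for illustration, and feeding each object's content to a collision detector is deliberately left out:

```python
import subprocess


def parse_batch_check(output: str) -> list[tuple[str, str, int]]:
    """Parse `git cat-file --batch-check` lines of the form '<oid> <type> <size>'."""
    objects = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines (e.g. '<oid> missing')
        oid, obj_type, size = parts
        objects.append((oid, obj_type, int(size)))
    return objects


def list_all_objects(repo: str) -> list[tuple[str, str, int]]:
    """List every object in the repository at `repo`, loose or packed."""
    result = subprocess.run(
        ["git", "-C", repo, "cat-file", "--batch-all-objects", "--batch-check"],
        check=True, capture_output=True, text=True,
    )
    return parse_batch_check(result.stdout)
```

This covers packed objects as well, so no manual unpacking of pack files would be needed for the listing step.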
I think it isn't a problem if you use actual key file (qubes-builder/qubes-developers-key.asc for example)
That should mitigate this.
Also hash used in gpg signature is configurable - we use SHA2 already.
Yes, for signing files, the hash is configurable. Not however for the OpenPGP key fingerprints as they use SHA1 as specified: RFC4880, 12.2. Key IDs and Fingerprints
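For reference, the V4 fingerprint construction from that RFC section can be sketched in a few lines of Python. The function names are made up, and the input is assumed to be the body of a version 4 public-key packet (starting with the version octet, without the packet header):

```python
import hashlib


def v4_fingerprint(public_key_body: bytes) -> bytes:
    """SHA-1 fingerprint of a version 4 public-key packet body (RFC 4880, 12.2)."""
    if len(public_key_body) > 0xFFFF:
        raise ValueError("packet body too long for a two-octet length")
    # Octet 0x99, then the two-octet big-endian body length, then the body itself.
    prefix = b"\x99" + len(public_key_body).to_bytes(2, "big")
    return hashlib.sha1(prefix + public_key_body).digest()


def v4_key_id(public_key_body: bytes) -> bytes:
    """The key ID is the low-order 64 bits of the fingerprint."""
    return v4_fingerprint(public_key_body)[-8:]
```

The hash function is fixed by the packet version, which is why it cannot be changed through the digest preferences used for signatures.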
@ypid Note that OpenPGP signatures are also vulnerable to SHA1 second-preimage attacks (the adversary can forge keys with specific keyids at that point).
I think it isn't a problem if you use actual key file (qubes-builder/qubes-developers-key.asc for example)
That should mitigate this.
Also hash used in gpg signature is configurable - we use SHA2 already.
Yes, for signing files, the hash is configurable. Not however for the OpenPGP key fingerprints as they use SHA1 as specified: RFC4880, 12.2. Key IDs and Fingerprints
I don't see the mitigation in that scenario. As you point out, the real danger of a SHA-1 second-preimage attack is the ability to create a key with the same fingerprint (not just key ID) as an existing key. @marmarek's suggestion is to use the actual key file, e.g., qubes-developers-key.asc, but I don't see how that helps. Suppose I download that file. I now need to verify that the keys it contains are genuine. How can I do that? Presumably, by verifying that their fingerprints match the trusted fingerprints I've already obtained out-of-band. But this just reintroduces the original problem.
I integrated the sha1 collision check into git (experimental and hacky, but should work): https://github.com/HW42/git
It's just not clear to me whether this tool can easily be used on actual hashes, rather than on the files which are to be hashed?
IIUC this is not possible since you need the intermediate hashing state which cannot be reconstructed from only the hash.
The above code simply replaces the git SHA1 function with the collision checking function and calls exit if suspicious data is hashed by git.
As a temporary solution, perhaps we might consider integrating the "collision detector" [1], presented by the authors of the attack, into qubes-builder?
Just using the "patched" git should do this. How to bootstrap is another question.
Also hash used in gpg signature is configurable - we use SHA2 already.
Yes, for signing files, the hash is configurable. Not however for the OpenPGP key fingerprints as they use SHA1 as specified: RFC4880, 12.2. Key IDs and Fingerprints
I think for our use case of GPG, collision attacks on the keys are mostly useless. The only scenario of which I currently can think is that some new core developer who gets their UID signed by the master key could have a colliding key with another UID. So they could forge the origin of commits a little bit. Therefore I think currently we can wait here until some "standardized" solution exists.
I think for our use case of GPG, collision attacks on the keys are mostly useless.
Exactly.
I integrated the sha1 collision check into git (experimental and hacky, but should work):
Great! Did you check it with all our repos? (in the fsck mode)? How long it takes?
@HW42 Nice work! In https://github.com/cr-marcstevens/sha1collisiondetection/issues/3 I just saw that Jeff King also works on this: https://github.com/peff/git/tree/jk/sha1dc
It seems Jeff King is making an attempt at using sha1collisiondetection in git, which would also resolve this issue: http://marc.info/?l=git&m=148789111730142&w=2
The git-evtag project also contains a Python script to calculate the evtag (you need to manually add/compare it to the tag message). But this script does not seem to have the limitation of handling only HEAD. It only needs the submodules to be fetched beforehand. Maybe this is the way to integrate it into qubes-builder?
@marmarek's suggestion is to use the actual key file, e.g., qubes-developers-key.asc, but I don't see how that helps. Suppose I download that file. I now need to verify that the keys it contains are genuine. How can I do that? Presumably, by verifying that their fingerprints match the trusted fingerprints I've already obtained out-of-band. But this just reintroduces the original problem.
It helps in that one does not need to rely on the SHA1 fingerprint alone in case additional strong checksums are provided.
Also note that when an OpenPGP key gets signed by another OpenPGP key, the signature is made over the primary public key packet and not its SHA1 fingerprint. Ref Anatomy of a GPG Key which I found to be an excellent read on how GnuPG and OpenPGP work in detail.
Ah, it does need to check out submodules first - or at least initialize the appropriate directories with a .git directory. Generally git submodule works on the currently checked-out version, not an arbitrary revision. Also, git clone --recursive is ignored when git clone -n is used. This is mostly a problem when cloning a new repository or introducing a new submodule. Otherwise it should be enough to run git submodule foreach git fetch before git-evtag.
I think the best we can achieve with git-evtag (*) (if going this way in the first place) is:
First case, initial clone:
- git clone -n
- git-evtag - if it fails, roll back the operation by simply removing the directory (this is already how qubes-builder behaves)
Second case, updating an existing repository (make prepare-merge, make get-sources etc):
- git fetch
- git submodule foreach git fetch to fetch new commits in submodules
- git-evtag - if it fails, nothing needs to be rolled back, because we haven't checked out the new commits
One problem with this approach is that, in the second case, if a new submodule got introduced, the git-evtag calculation would fail (because those commits weren't included in git submodule foreach git fetch). This is a rather rare situation and I think we can handle it manually (doing that checkout and, if git-evtag verification still fails, rolling back the changes manually).
The other problem is that, in the first case, we have checked out sources before git-evtag verification. If verification fails, that repository is removed from disk, but there is short time span when it is there (**). And also some more git parts are exposed before verification of git objects referenced by fetched tag. In short: 1) fetch objects 2) verify tag, 3) checkout objects 4) verify checked out objects.
(*) When I write git-evtag, I mean that Python implementation, which can operate on an arbitrary revision, not just HEAD.
(**) This potential race condition isn't a problem in upcoming github integration (https://github.com/QubesOS/qubes-builder-github/pull/7) because we use builder-wide locks for all operations.
Great! Did you check it with all our repos?
I had already checked those I had checked out. I have now run it over all the 69 repos GitHub listed. Nothing found (also no other 'fsck' warnings :]).
(In theory GitHub could have hidden bad objects from me.)
How long it takes?
This is normally not noticeably slower than git. Somebody made some measurements against OpenSSL for upstream (https://public-inbox.org/git/20170223230621.43anex65ndoqbgnf@sigill.intra.peff.net/; that's not my patch, but it does the same thing and uses the same lib).
It looks like upstream will incorporate a very similar patch. So probably we can just wait until it's packaged for the common distros.
But this script does not seem to have the limitation of handling only HEAD. It only needs the submodules to be fetched beforehand.
IIUC the C/rust version can be easily changed to do the same. The real problem is the handling of submodules as you already noticed.
The other problem is that, in the first case, we have checked out sources before git-evtag verification. If verification fails, that repository is removed from disk, but there is short time span when it is there (**).
We could use a temporary directory such that at least nothing accesses the files accidentally (I'm not sure if it's easy to convince git to do that in this case).
And also some more git parts are exposed before verification of git objects referenced by fetched tag. In short: 1) fetch objects 2) verify tag, 3) checkout objects 4) verify checked out objects.
I think the fact that you need to process/parse a lot of data before verifying it is, in general, a big limitation of git. The only thing you can skip (without submodules) is the checkout part AFAIK.
One option would be to mirror the repository locally and then temporarily clone+checkout them for verification. But I don't see how this could be done without a lot of "hackery" which we a) don't want to maintain and b) don't want to have in verification code.
[Admittedly getting off-topic]
It helps in that one does not need to rely on the SHA1 fingerprint alone in case additional strong checksums are provided.
Ok, but in what case would a strong checksum be provided for a key file, and how would that checksum be made trustworthy? Would it be... PGP-signed? If so, we have a regress problem.
Also note that when an OpenPGP key gets signed by another OpenPGP key, the signature is made over the primary public key packet and not its SHA1 fingerprint. Ref Anatomy of a GPG Key which I found to be an excellent read on how GnuPG and OpenPGP work in detail.
Ok, but that works better in situations where key trust is conveyed via a Web of Trust, which tends not to be the case in the Qubes community. For a variety of reasons, we rely far more on fingerprints. (One reason, I suspect, is that there doesn't yet exist a fully secure way to sign other people's keys without exposing the private master key doing the signing, even with Split GPG.)
Yes, for signing files, the hash is configurable.
Not however for the OpenPGP key fingerprints as they use SHA1 as specified: RFC4880, 12.2. Key IDs and Fingerprints
I think for our use case of GPG, collision attacks on the keys are mostly useless. [...]
You're right about collision attacks being mostly useless for our use case, but they were talking about second preimage attacks (which, we agree, are not an immediate concern right now).
Ok, but in what case would a strong checksum be provided for a key file, and how would that checksum be made trustworthy? Would it be... PGP-signed? If so, we have a regress problem.
The strong checksum would be made trustworthy the same way any other OpenPGP fingerprint would be made trustworthy (print out, website, phone, t-shirt worn by core dev ;-) ). Just the instructions on how to check authenticity of the key would need to be extended.
But just hashing gpg -a --export is not very robust. One would need to hash only the primary public key packet. The following handy oneliner should do just that:
$ tmp_dir="$(mktemp -d)"; pushd "$tmp_dir" >/dev/null && gpg --export "427F 11FD 0FAA 4B08 0123 F01C DDFA 1A3E 3687 9494" > pub.key && gpgsplit pub.key && sha256sum ./*.public_key; popd >/dev/null && rm -r "$tmp_dir"
125d9b76ae515b819e882822fb97f531aa3a4d64791e1f244c89b2df23867c35 ./000001-006.public_key
I guess I will include such a hash when I next make a print out of my public OpenPGP key :wink:
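The same extraction can be sketched in Python without gpgsplit, by reading the first packet header of a binary (not armored) export by hand. This is only a sketch: the function names are made up, partial and indeterminate lengths are not handled, and whether the digest matches the gpgsplit one above depends on gpgsplit writing the packet header unchanged, which is an assumption here.

```python
import hashlib


def first_packet(data: bytes) -> tuple[int, bytes]:
    """Return (tag, raw packet bytes including header) of the first OpenPGP packet."""
    if not data or not data[0] & 0x80:
        raise ValueError("not an OpenPGP packet")
    first = data[0]
    if first & 0x40:  # new-format header
        tag = first & 0x3F
        o1 = data[1]
        if o1 < 192:
            header_len, body_len = 2, o1
        elif o1 < 224:
            header_len, body_len = 3, ((o1 - 192) << 8) + data[2] + 192
        elif o1 == 255:
            header_len, body_len = 6, int.from_bytes(data[2:6], "big")
        else:
            raise ValueError("partial body lengths are not handled in this sketch")
    else:  # old-format header
        tag = (first >> 2) & 0x0F
        length_type = first & 0x03
        if length_type == 3:
            raise ValueError("indeterminate length is not handled in this sketch")
        n = 1 << length_type  # 1, 2 or 4 length octets
        header_len, body_len = 1 + n, int.from_bytes(data[1:1 + n], "big")
    packet = data[:header_len + body_len]
    if len(packet) != header_len + body_len:
        raise ValueError("truncated packet")
    return tag, packet


def primary_key_sha256(exported_key: bytes) -> str:
    """SHA-256 over the first packet of a binary key export, which must be tag 6."""
    tag, packet = first_packet(exported_key)
    if tag != 6:
        raise ValueError(f"expected a public-key packet (tag 6), got tag {tag}")
    return hashlib.sha256(packet).hexdigest()
```

User IDs, subkeys and signatures that follow the primary key packet are ignored, so adding a UID or a new certification would not change the result.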
Ok, but that works better in situations where key trust is conveyed via a Web of Trust, which tends not to be the case in the Qubes community.
True. But at least all important Qubes OS keys are signed by the Qubes Master Signing Key.
Ah, I understand what you have in mind now. So, basically distributing a strong hash of (the appropriate content of) the Qubes Master Signing Key instead of directly distributing the latter's fingerprint? Yeah, I could see that working. Anyway, let's hope this part remains theoretical.
I guess if one decides to do this, the strong hash should be provided in addition, because SHA1 fingerprints are the default and the way I showed is not that common (and currently not needed). </offtopic> :)
Looks like git implemented sha256.
But can you find instructions on how to use git with sha256 or how to migrate existing git repositories to sha256?
Looks like git implemented sha256.
[citation needed]
rugk:
Looks like git implemented sha256.
[citation needed]
https://github.com/git/git/blob/master/Documentation/technical/hash-function-transition.txt
https://github.com/git/git/blob/master/Documentation/RelNotes/2.21.0.txt#L114
- sha-256 hash has been added and plumbed through the code to allow building Git with the "NewHash".
Lots of mention of sha256 or sha-256 in git source code.
https://github.com/git/git/blob/master/Documentation/RelNotes/2.21.0.txt#L114
- sha-256 hash has been added and plumbed through the code to allow building Git with the "NewHash".
After looking at the commits, I think this message is a bit unclear. They did some "plumbing" to support SHA-256 but there's still a lot to do until you can actually use it.
Lots of mention of sha256 or sha-256 in git source code.
From what I can see there's only code to add the hash function itself (via OpenSSL or libgcrypt) and some work to fix dependencies on having 20 byte hashes.
For example, in 2.22 there are only very few tests that mention SHA-256:
$ git describe
v2.22.0
$ grep -ril 'sha-\?256' t
t/t0000-basic.sh
t/t0015-hash.sh
t/test-lib-functions.sh
t/helper/test-tool.c
t/helper/test-sha256.c
t/helper/test-tool.h
t/t9824-git-p4-git-lfs.sh
t/oid-info/hash-info
t/oid-info/oid
t/oid-info/README
$
And the new objectFormat extension is so far only mentioned in the hash transition doc.
So unless I'm missing something important we aren't there yet.
Also note that for a while now git has used a SHA-1 implementation which is hardened against the "SHAttered" attack.
After looking into it all day, it does not appear to impact git's security immediately, except for targeted attacks against specific projects by very wealthy attackers. But we're well past the time when it seemed ok that git uses SHA1. If this gets improved into a chosen-prefix collision attack, git will start to be rather insecure.
Ref: https://git-annex.branchable.com/devblog/day_449__SHA1_break_day/
->
We have computed the very first chosen-prefix collision for SHA-1.
Note that recent git versions use a collision detection algorithm and are therefore not affected by this attack. From the paper:
As a stopgap measure, the collision-detection library of Stevens and Shumow [SS17] can be used to detect attack attempts (it successfully detects our attack). [...] The GIT developers have been working on replacing SHA-1 for a while, and they use a collision detection library [SS17] to mitigate the risks of collision attacks.
But we're well past the time when it seemed ok that git uses SHA1.
note there never was such a time. it got pointed out that building a new anything based on sha1 was a "bad idea"(tm) during the git design/earlyprototype phase, and the cost to use sha512 instead would have been somewhere between "none" and "zilch" at that point, but a certain @thorvalds was busy ridiculing people for being worried about it instead of just changing it. (ohai @zooko )
The GIT developers [...] use a collision detection library [SS17] to mitigate the risks of collision attacks.
that is extra amusing and a reason for pinging @bramcohen too, just to get the whole old band back together.
We have computed the very first chosen-prefix collision for SHA-1. Ref: https://sha-mbles.github.io/
also scary there: the gpg part.
they [git] use a collision detection library [SS17] to mitigate the risks of collision attacks.
And:
Besides, computation prices went further down since then, so we estimate that our attack costs today about 45k USD.
Who is our adversary again? My fear is that this detection might not work against attacks that are not known to the public yet (done by researchers not as nice or not allowed to release their work).
Besides, computation prices went further down since then, so we estimate that our attack costs today about 45k USD. Who is our adversary again? My fear is that this detection might not work against attacks that are not known to the public yet (done by researchers not as nice or not allowed to release their work).
bonus: don't forget that anything taking "10 days with 45k investment" to break will take a skid with carded/cracked AWS account half a day. at scale.
unless someone who actually understands this digs up an URL for SS17 and draws me a convincing explanatory picture i trust "signed git commits" about as much as "someone on 4chan said..."
unless someone [...] digs up an URL for SS17 [...]
Not sure if this part of the sentence was intended sarcasm but the "SS17" paper about the collision detection technique is freely available (including a video of the conf talk): https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/stevens
It is worth noting that for a malicious commit to be part of an attack, it would first need to be merged into master, meaning that it would need to pass code review and Git’s collision detector. Source code generally has rather low entropy, so finding colliding code that will pass review is likely to be very difficult.
In the case of PGP key fingerprints, there is no problem at all. If the Qubes keys were maliciously generated, it is Game Over anyway.
"merged into master" – I guess the threat model here assumes an attacker that is GitHub or so and can circumvent merge checks or so.
@rugk GitHub being compromised is indeed in scope, but it isn’t adequate in and of itself. One would need to convince someone trusted by the Qubes Builder to sign a tag that points to a commit containing the malicious object. A second-preimage attack on SHA-1 would indeed be catastrophic, but no such attacks (other than brute-force) are known or believed to exist.
Furthermore, Git doesn’t use raw SHA-1. It uses a variant, known as Hardened SHA-1, that is resistant to all known or likely collision attacks. Since SHA-1 only produces a 160 bit hash, there is still a birthday attack with complexity 2^80. However, this attack is not considered practical in the short term. In the long term, Git is already transitioning to SHA-256, and will have finished doing so far before a brute-force collision attack on SHA-1 is practical.
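The 2^80 figure is just the generic birthday bound: for an n-bit hash, a collision is expected after roughly 2^(n/2) evaluations. A small Python sketch of that arithmetic (the function name is made up for illustration):

```python
import hashlib


def birthday_bound_log2(hash_name: str) -> int:
    """log2 of the approximate number of hashes for a generic (birthday) collision."""
    output_bits = hashlib.new(hash_name).digest_size * 8
    return output_bits // 2


if __name__ == "__main__":
    for name in ("sha1", "sha256"):
        print(f"{name}: about 2^{birthday_bound_log2(name)} hash evaluations")
```

This only covers brute force; cryptanalytic attacks such as the ones discussed in this thread are cheaper than the generic bound for plain SHA-1.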
In short, a SHA-1 collision attack on Git is not something I worry about. I am far more worried that someone might find a memory corruption vulnerability in Git, a VM escape in Xen, or a GPG signature check bypass, all of which have happened in the past.
Fortunately, git also switches to SHA-256…
Whatever happened to SHA-256 support in Git?
nobody has said that it is coming anytime soon.
git-evtag is in Debian bookworm: https://packages.debian.org/bookworm/git-evtag
~~git-signify: https://git.nicholasjohnson.ch/git-signify/~~
@adrelanos @HW42 Is this something that the Qubes team should start working on itself at some point?
Thank you for your question! There are various ways to approach it...
The creator of this ticket made a feature request, and @marmarek expressed interest in exploring git-evtag for its implementation.
As my contribution to this ticket, I have included some additional information:
Ideally, sha256 support would be addressed upstream in git. However, there are limitations imposed by time and funding in the real world.
I do not have the authority to dictate what the Qubes team should do. However, as customary for Open Source projects, I appreciate the feature request and would have made it myself if it didn't already exist.
I cannot presume to know the availability of time, skills, and funding within the Qubes project. Contributing to git upstream to complete sha256 support appears to be extremely challenging and costly. If the Qubes project decides to pursue any of the potential solutions mentioned, that would be excellent. Nevertheless, if conflicting priorities and difficulties prevent their implementation, that would be understandable as well.
Perhaps I am reading too much into the word "should"?
Maybe you wanted to know my opinion regarding the priority of this ticket?
To consider the opposing viewpoint...
@DemiMarie
In short, a SHA-1 collision attack on Git is not something I worry about.
Makes sense.
I am far more worried that someone might find a memory corruption vulnerability in Git, a VM escape in Xen, or a GPG signature check bypass, all of which have happened in the past.
This is an excellent point and likely true. There is much work to be done to enhance security. Regrettably, I estimate that awareness and traction for addressing these concerns are insufficient, making it unlikely for substantial progress to occur.
Additionally, I wonder how much it truly helps if the Qubes source code is "super secured" (using git-evtags or git-signify) while other projects (such as those sourced from Debian, Fedora, etc.) still possess only "normal security" (using git sha1). There may exist other unknown issues as well, given that many upstreams do not sign their code, and distribution maintainers might not verify signatures.
Ultimately, what Qubes can realistically provide at this stage is limited by its upstream projects. It is unrealistic to expect Qubes to single-handedly resolve all security issues on the internet.
Considering some more practical aspects... Once Whonix has been ported to Debian bookworm, I could potentially sign and verify the source code using git-evtag. Since git-evtag is available in Debian bookworm, automating the signing and verification process should be relatively straightforward. Of course, I will verify this assumption and contribute any useful findings. Furthermore, I will also explore git-signify in the future.
Contributing to git upstream to complete sha256 support appears to be extremely challenging and costly.
The hardest part is hybrid support, which is what is needed for hosting providers to be able to support both sha1 and sha256 at the same time.
@adrelanos wrote:
git-signify: https://git.nicholasjohnson.ch/git-signify/ [...] Another potential option to consider is git-signify, which could serve as an alternative to git-evtag. It might be worth exploring and providing feedback to upstream.
From what I can tell, the real upstream is https://leahneukirchen.org/dotfiles/bin/git-signify, not the linked repo. Anyway, I don't see how this is relevant to this ticket. The linked git-signify uses signify instead of PGP to sign git tags/commits, but doesn't do anything to address relying on hardened SHA1 in git.
Moving away from PGP is a valid consideration, but since this ticket has been only about SHA1 so far, that discussion belongs somewhere else.
You are going to great lengths to ensure source code authenticity already. :+1:
However, it seems to me that the weakest link here is SHA1 used by git. Ref entry point: sign all git commits
As I am not sure when this problem will be fixed at its core (git), I would propose to include a cryptographically strong hash sum over the whole commit (commit, tree, and blobs it references and recursively over submodules) in git tags which are directly signed with GnuPG, also using a cryptographically strong hash sum.
I was quite happy to find git-evtag today which implements this. I have read the Python implementation and it looks good to me. But I am sure you guys can do a more careful review of whatever implementation you end up using :wink:
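The idea can be sketched in Python on an in-memory tree. To be clear, this is not git-evtag's actual algorithm or output format; the data model (nested dicts standing in for trees) and all function names are made up for illustration, and submodules are not modelled.

```python
import hashlib

Tree = dict  # maps entry name -> bytes (file content) or another Tree


def _feed(h, kind: bytes, payload: bytes) -> None:
    # Length-prefix every object so that concatenations cannot be ambiguous.
    h.update(kind + b" " + str(len(payload)).encode("ascii") + b"\0" + payload)


def _walk(h, tree: Tree) -> None:
    names = sorted(tree)
    _feed(h, b"tree", "\0".join(names).encode("utf-8"))
    for name in names:
        entry = tree[name]
        if isinstance(entry, dict):
            _walk(h, entry)
        else:
            _feed(h, b"blob", entry)


def strong_digest(commit_message: bytes, root: Tree) -> str:
    """SHA-512 over the commit text and, recursively, every tree and blob."""
    h = hashlib.sha512()
    _feed(h, b"commit", commit_message)
    _walk(h, root)
    return h.hexdigest()
```

The resulting digest would go into the message of the GnuPG-signed tag, so that verification no longer rests on SHA-1 object IDs alone.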
The main advantage this will give us is that targeted attacks on someone doing a git clone, with the adversary being able to perform a preimage attack on a SHA1 hashed file, will be more difficult. But even with just SHA1, according to Mike Gerwitz, such an attack would be even harder when the target already has an authentic copy of a repo and the attack is performed on git pull.