dhuseby / did-git-spec

Proposed specification for the did:git: method
Apache License 2.0

Security Considerations should mention git forking and history rewrite attacks. #7

Open msporny opened 5 years ago

msporny commented 5 years ago

I may be wrong here, but I think the did:git method is going to be susceptible to rewrite attacks. That is, an attacker gets a git repository and either forks, rewrites, or appends history. Now you have two .git repositories w/ the same initial commit hash that have subtly different histories.

I don't think you can mitigate this attack (unless you couple this DID Method w/ some sort of global append-only audit log); the best you can do is detect the attack and do something about it (like checking that you're using the "official repository" for the project). Even that is susceptible to attack: https://about.gitlab.com/2019/05/03/suspicious-git-activity-security-update/
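To make the scenario concrete, here is a minimal sketch of two histories that share an initial commit hash but diverge afterwards. It uses a toy `commit_id` function (a hypothetical stand-in for git's real object hashing, which covers more fields) and made-up tree and author names.

```python
import hashlib

def commit_id(parent, tree, author, message):
    """Toy stand-in for a git commit hash: a digest over the parent id and content."""
    payload = "\n".join([parent or "", tree, author, message]).encode()
    return hashlib.sha1(payload).hexdigest()

def build_history(commits):
    """Chain commits together and return their ids, root first."""
    ids, parent = [], None
    for tree, author, message in commits:
        parent = commit_id(parent, tree, author, message)
        ids.append(parent)
    return ids

# Both histories start from the same initial commit (illustrative values).
root = ("tree-0", "alice", "initial commit")
honest = build_history([root, ("tree-1", "alice", "add feature")])
forked = build_history([root, ("tree-1x", "mallory", "add feature")])

same_root = honest[0] == forked[0]
same_tip = honest[-1] == forked[-1]
print(same_root, same_tip)  # True False
```

The initial commit hash alone cannot tell the two repositories apart; only a comparison of later commits (or of something anchored outside the repo) reveals the divergence.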

In any case, you should document this in the Security Considerations section of the DID Document.

dhuseby commented 5 years ago

@msporny you're making a good point. I was operating under the assumption that with did:git, repos would require that every commit be signed by one or more keys stored in the repo so that the provenance on the DID docs anchors the trust in the commits. If the governance of the repo contains the rules that govern which keys can sign updates to the repo DID document and which keys can sign updates to the committer DID documents, this kind of attack can be detected. Detection combined with anchoring externally should be all we need to prevent these kinds of attacks. But I could be wrong. Here's an example of how I envisioned the Linux Foundation using and defending a repo:

  1. Run git init to create the repo.
  2. Copy controller DID docs (i.e. DID docs for the maintainers), governance.md doc, DCO.md, and LICENSE file into the repo.
  3. Then run git did init. This generates the repo.did document and populates its fields, including the list of key IDs of the controller DID docs, which file holds the governance rules, and links to the DCO and LICENSE. It then stages all of the files and does a signed commit in the repo. NOTE: This has to be done by one of the maintainers because the commit has to be signed by one of the controller DID keys.
  4. New committers send self-signed commits that add their committer DID doc to the repo before they can contribute. Someone permitted to merge per the governance rules (e.g. a maintainer) merges their commit and adds their signature as well. NOTE: we plan to support multiple DID signatures on commits.
  5. Only commits that are signed by committer keys already in the repo will be accepted.
  6. Key rotations are done using a git mv <old key ID>.did <new key ID>.did and the commit is signed with both the old and the new key. Governance rules can be used to defend key rotations by committers with merge capabilities by requiring some M of N merge capabilities holders to also sign the key rotation; those holders will hopefully confirm the rotation out of band before signing.
  7. The repo-wide documents such as the repo.did, governance.md, DCO.md, and LICENSE files will be defended using governance rules requiring M of N merge capabilities holders to sign the commit. Hopefully there will be some out-of-band consensus mechanism (e.g. a meeting and vote by the maintainers and/or the broader community).
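The acceptance rules in steps 3, 4, 5, and 7 can be sketched as a walk over the history. This is only an illustration under stated assumptions: the `Commit` model, the `validate_history` function, and the key IDs are hypothetical, and signature verification itself is assumed to have happened already (the `signers` set holds key IDs whose signatures checked out).

```python
from dataclasses import dataclass, field

# Repo-wide documents named in step 7.
PROTECTED = {"repo.did", "governance.md", "DCO.md", "LICENSE"}

@dataclass
class Commit:
    signers: set                                 # key IDs with verified signatures
    paths: set                                   # files touched by the commit
    adds_keys: set = field(default_factory=set)  # committer key IDs introduced

def validate_history(commits, maintainers, threshold):
    """Return the index of the first commit that breaks a rule, or None."""
    maintainers = set(maintainers)
    known = set(maintainers)                     # keys already in the repo
    for i, c in enumerate(commits):
        merged_by = c.signers & maintainers
        if i == 0:
            ok = bool(merged_by)                 # step 3: controller signs genesis
        elif c.paths & PROTECTED:
            ok = len(merged_by) >= threshold     # step 7: M of N maintainers
        elif c.adds_keys:
            # step 4: self-signed by the new key, co-signed by a maintainer
            ok = c.adds_keys <= c.signers and bool(merged_by)
        else:
            ok = bool(c.signers) and c.signers <= known  # step 5
        if not ok:
            return i
        known |= c.adds_keys
    return None

history = [
    Commit({"m1"}, {"repo.did", "governance.md"}),
    Commit({"c1", "m1"}, {"c1.did"}, {"c1"}),
    Commit({"c1"}, {"src/main.c"}),
    Commit({"mallory"}, {"src/main.c"}),
]
print(validate_history(history, {"m1", "m2", "m3"}, 2))  # 3
```

Here the fourth commit is rejected because its signer was never added to the repo. Key rotation (step 6) is left out to keep the sketch short.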

Can you think of any attacks against a regime like this? The did:git spec isn't meant to be an air-tight defense against forking and history rewrites, but by combining meat-space protocols and explicit governance rules with the git history, and requiring all commits to be signed by keys already in the repo, I think detection of the attacks and authentication of forks is possible.

I'd love to game this out more if you've got more ideas on attacks.

msporny commented 5 years ago

I'd love to game this out more if you've got more ideas on attacks.

I do, but very little time, unfortunately... so my feedback is going to be sporadic, where I disappear for large periods of time... so as long as you're cool with that, some thoughts below. :)

Run git init to create the repo. Copy controller DID docs (i.e. DID docs for the maintainers), governance.md doc, DCO.md, and LICENSE file into the repo.

Not an attack (but it will become one, potentially). If you choose to do it this way, which we had considered before, then you can't bring legacy repos into a "signed by DIDs" ecosystem w/o doing a full history rewrite. That poses an onboarding problem that is going to harm adoption.

Instead, you should make it so that any git repo can be converted to being protected by DID-based signatures. One way you could do that is to enable your step #2 to happen at any point in history in the git repo (where all previous hashes are signed as a part of git did init).

... aaaaand, once you do that, you're susceptible to the fork/rewrite attack, no?

NOTE: we plan to support multiple DID signatures on commits.

How? Suggest that LD Proofs enable multiple signatures (since you already seem to be planning to use JSON-LD)... :)

Key rotations are done using a git mv <old key ID>.did <new key ID>.did

Hmm, so this is a bit of a strange DID method (I make no value judgement, yet... just making an observation). It feels like two things shoved together, namely 1) DID-based digital signatures on commits, and 2) a DID and key rotation mechanism. It feels like the two things should be split apart to create the right architectural layering.

It feels like what you want is a set of files that express governance rules for a .git repo -- who can commit, who is allowed to add committers, how many signatures are required for a merge to master, etc. and an extension to git that handles this. This layer should be DID Method agnostic.
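One way to picture such a method-agnostic layer is a small rules object that treats DIDs as opaque strings and never inspects the method name. The sketch below is hypothetical throughout: `GovernanceRules`, its field names, and the example DIDs are made up for illustration and are not taken from any spec.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GovernanceRules:
    committers: frozenset      # DIDs allowed to commit
    admins: frozenset          # DIDs allowed to add committers
    merge_signatures: int      # signatures required for a merge to master

def may_commit(rules, signer):
    return signer in rules.committers

def may_add_committer(rules, signer):
    return signer in rules.admins

def may_merge(rules, signers):
    """Count distinct authorized signers; the DID method is never examined."""
    return len(set(signers) & rules.committers) >= rules.merge_signatures

# Illustrative DIDs from two different methods in the same rule set.
rules = GovernanceRules(
    committers=frozenset({"did:example:alice", "did:web:example.com:bob"}),
    admins=frozenset({"did:example:alice"}),
    merge_signatures=2,
)
print(may_merge(rules, ["did:example:alice", "did:web:example.com:bob"]))  # True
print(may_merge(rules, ["did:example:alice"]))                             # False
```

Because the checks only compare strings and count signatures, resolving a DID to its keys and verifying the signatures can be delegated to whichever DID method a project uses.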

Then create a separate layer that is a DID Method supporting all the CRUD operations for those that don't want to use a DLT-based DID method.

I will note that the forking/rewrite attacks are applicable to both... but it feels like the layering is off, and how we address forking/rewrites is going to be predicated on the layering discussion.

Can you think of any attacks against a regime like this?

If the goal is to trick people into thinking a repo is valid, then I'd git init w/ my own keys and create a tool that recreates the "good repo"s history while injecting my attacker keys as the maintainer of the repo. Then I'd just try to trick people into copy-pasting the wrong did:git value into their git clone command / did git validate. In other words, feels like the easiest attack would be a social engineering attack, not necessarily a crypto or protocol attack.

dhuseby commented 5 years ago

I'd love to game this out more if you've got more ideas on attacks.

I do, but very little time, unfortunately... so my feedback is going to be sporadic, where I disappear for large periods of time... so as long as you're cool with that, some thoughts below. :)

I'm OK with that. There are so many subtle details and it's going to take all of us to make this happen.

Run git init to create the repo. Copy controller DID docs (i.e. DID docs for the maintainers), governance.md doc, DCO.md, and LICENSE file into the repo.

Instead, you should make it so that any git repo can be converted to being protected by DID-based signatures. One way you could do that is to enable your step #2 to happen at any point in history in the git repo (where all previous hashes are signed as a part of git did init). ... aaaaand, once you do that, you're susceptible to the fork/rewrite attack, no?

This was the plan all along, sorry for not being clear about that. At some point in the history the repo will be imbued with did:git governance. That point is also when any attestations about the existing history must be made. There are several strategies but I best like the one that includes a Merkle tree of past commit hashes as a file in the initial "genesis" commit. In theory, if the commit hashes were stronger, we wouldn't need to do anything because the signature over the "genesis" commit would include the hash of the parent commit. Any change in history would ultimately change the hash of the parent commit. If a rewrite attack is in the threat model of the project then anchoring the DID string for the repo--which is the hash of the genesis commit--is necessary for authentication by others.
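As a rough illustration of the Merkle tree idea, the sketch below computes a root over a list of past commit hashes. The details are assumptions made for the example, not anything the spec defines: SHA-256 as the tree hash, one-byte prefixes to separate leaves from inner nodes, carrying an unpaired node up unchanged, and the placeholder commit hashes.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(commit_hashes):
    """Merkle root (hex) over hex-encoded commit hashes, in history order."""
    if not commit_hashes:
        raise ValueError("need at least one commit hash")
    # Prefix 0x00 marks leaves, 0x01 marks inner nodes.
    level = [_h(b"\x00" + bytes.fromhex(c)) for c in commit_hashes]
    while len(level) > 1:
        nxt = [_h(b"\x01" + level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # carry the unpaired node up unchanged
        level = nxt
    return level[0].hex()

# Placeholder 40-hex-digit commit hashes standing in for the legacy history.
legacy = ["aa" * 20, "bb" * 20, "cc" * 20]
rewritten = ["aa" * 20, "bd" * 20, "cc" * 20]

attested = merkle_root(legacy)
print(attested == merkle_root(legacy))     # True
print(attested == merkle_root(rewritten))  # False
```

A file holding this root (or the whole tree) in the genesis commit would let a verifier recompute it from the pre-genesis history and notice any rewritten commit, independently of the strength of the commit hash itself.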

NOTE: we plan to support multiple DID signatures on commits.

How? Suggest that LD Proofs enable multiple signatures (since you already seem to be planning to use JSON-LD)... :)

Precisely what I was thinking. :)

Key rotations are done using a git mv <old key ID>.did <new key ID>.did

Hmm, so this is a bit of a strange DID method (I make no value judgement, yet... just making an observation). It feels like two things shoved together, namely 1) DID-based digital signatures on commits, and 2) a DID and key rotation mechanism. It feels like the two things should be split apart to create the right architectural layering.

It feels like what you want is a set of files that express governance rules for a .git repo -- who can commit, who is allowed to add committers, how many signatures are required for a merge to master, etc. and an extension to git that handles this. This layer should be DID Method agnostic.

Then create a separate layer that is a DID Method supporting all the CRUD operations for those that don't want to use a DLT-based DID method.

I will note that the forking/rewrite attacks are applicable to both... but it feels like the layering is off, and how we address forking/rewrites is going to be predicated on the layering discussion.

We are defining CRUD DID methods for both the repo DID (e.g. did:git:<repo ID>) and committer DID (e.g. did:git:<repo ID>:<committer ID>). The <repo ID> is the commit hash of the "genesis commit". The <committer ID> is likely to be the hash of the public key in the committer DID file.
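A small parser shows how the two DID shapes relate. The encoding is an assumption made for this sketch: the repo ID is taken to be a 40-character lowercase hex SHA-1 commit hash and the committer ID a lowercase hex string; `GitDid` and `parse_git_did` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

_HEX40 = r"[0-9a-f]{40}"   # assumption: SHA-1 commit hash in lowercase hex
_DID_RE = re.compile(
    rf"^did:git:(?P<repo>{_HEX40})(?::(?P<committer>[0-9a-f]+))?$"
)

class GitDid(NamedTuple):
    repo_id: str                  # hash of the genesis commit
    committer_id: Optional[str]   # None for a repo DID

def parse_git_did(did: str) -> GitDid:
    """Split a did:git string into its repo ID and optional committer ID."""
    m = _DID_RE.match(did)
    if m is None:
        raise ValueError(f"not a did:git identifier: {did!r}")
    return GitDid(m.group("repo"), m.group("committer"))

genesis = "0123456789abcdef0123456789abcdef01234567"  # placeholder hash
print(parse_git_did(f"did:git:{genesis}").committer_id)         # None
print(parse_git_did(f"did:git:{genesis}:c0ffee").committer_id)  # c0ffee
```

A committer DID always embeds the repo ID, which matches the point that committer DIDs only mean something in the context of one repo.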

All CRUD operations go through the planned git-did porcelain that we hope to land in the git.git repo as a porcelain that ships with git by default. The git-did porcelain will take DID strings and options as command line parameters and translate that into git operations.

The CRUD operations on the repo DID string operate on the repo DID file. So I made a mistake earlier when I said git did init. It will likely be git did create, to be the C in CRUD. The CRUD operations on the committer DID string operate on the committer's DID file. All operations translate into underlying git operations manipulating the DID files.
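As a sketch of what "translating into git operations" could look like, the function below maps an operation name to a list of ordinary git commands without running them. Everything here is a guess for illustration: the `plan` and `did_file` helpers, the operation names, and the commit messages are hypothetical, and `git commit -S` (git's existing signed-commit flag) merely stands in for the planned DID-based signing.

```python
def did_file(key_id=None):
    """repo.did for the repo DID, <key ID>.did for a committer DID."""
    return "repo.did" if key_id is None else f"{key_id}.did"

def plan(operation, key_id=None, new_key_id=None):
    """Return the git commands (as argv lists) for one DID operation."""
    path = did_file(key_id)
    # Stand-in for a DID-signed commit.
    sign = ["git", "commit", "-S", "-m", f"did {operation}: {path}"]
    if operation in ("create", "update"):
        return [["git", "add", path], sign]
    if operation == "read":
        return [["git", "show", f"HEAD:{path}"]]
    if operation == "rotate":
        if key_id is None or new_key_id is None:
            raise ValueError("rotate needs both the old and the new key ID")
        return [["git", "mv", path, did_file(new_key_id)], sign]
    if operation == "delete":
        return [["git", "rm", path], sign]
    raise ValueError(f"unknown operation: {operation}")

for argv in plan("rotate", key_id="abc123", new_key_id="def456"):
    print(" ".join(argv))
```

Returning the commands instead of executing them keeps the sketch side-effect free; a real porcelain would also have to enforce the governance rules before committing.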

The other governance stuff is how we take meat-space protocols and encode them so that they are enforced when git commits are made. Each project has different rules for how commits are landed and we want to be flexible to support as many options as possible. I plan on making a good default that meets the Linux Foundation's needs but there is no reason why we couldn't have other examples/templates that users can choose from. In the end, whatever form it takes, it is the functional equivalent of "smart contracts"/"transaction validation" in a DLT. I started the discussion about that over here

Can you think of any attacks against a regime like this?

If the goal is to trick people into thinking a repo is valid, then I'd git init w/ my own keys and create a tool that recreates the "good repo"s history while injecting my attacker keys as the maintainer of the repo. Then I'd just try to trick people into copy-pasting the wrong did:git value into their git clone command / did git validate. In other words, feels like the easiest attack would be a social engineering attack, not necessarily a crypto or protocol attack.

External anchoring is the primary defense against this. Also, project owners (i.e. DID controllers) can add a "canonical location" to the repo's DID document that is anchored via the DID string. It is important to remember that the domain of trust for did:git is the repo itself. Only the repo DID string needs to be globally unique and the DIDs only mean something in the context of the repo. This is in line with the decentralized philosophy of git's design where there is no such thing as the repo; there's just lots of clones. The meat-space protocols are designed to solve that problem and I think as long as we anchor the meat-space protocols into the repo (e.g. governance.json, canonical URL, etc) and then anchor the repo somewhere public, that should be good enough.
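The anchoring check can be sketched as two comparisons: the clone's genesis commit must match the repo ID in the DID string, and that DID must appear in whatever external anchor the verifier trusts. The `authenticate_clone` function, the in-memory `anchored` set, and the hashes are hypothetical placeholders; how the anchor is published is out of scope here.

```python
PREFIX = "did:git:"

def authenticate_clone(claimed_did, genesis_commit, anchored_dids):
    """Accept a clone only if its genesis commit matches the claimed DID
    and the repo DID is present in the externally anchored set."""
    if not claimed_did.startswith(PREFIX):
        return False
    repo_id = claimed_did[len(PREFIX):].split(":", 1)[0]
    return repo_id == genesis_commit and PREFIX + repo_id in anchored_dids

good_genesis = "0123456789abcdef0123456789abcdef01234567"   # placeholder
evil_genesis = "fedcba9876543210fedcba9876543210fedcba98"   # placeholder
anchored = {PREFIX + good_genesis}

# Honest clone: the DID matches its genesis commit and is anchored.
print(authenticate_clone(PREFIX + good_genesis, good_genesis, anchored))  # True
# Recreated history with attacker keys: new genesis hash, so the DID is unanchored.
print(authenticate_clone(PREFIX + evil_genesis, evil_genesis, anchored))  # False
# Attacker reuses the real DID on the forged repo: genesis does not match.
print(authenticate_clone(PREFIX + good_genesis, evil_genesis, anchored))  # False
```

This does nothing against a user who pastes the attacker's DID and also trusts the attacker's anchor, which is why the social-engineering case still depends on the out-of-band protocols.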

I'm always open to discussing where I/we are wrong on the current design. Right now I feel like we're just level setting with communicating all of what we discussed at IIW.

I'm personally optimizing for solving the LF DCO problem, which is as much a legal and human protocol as it is a technical one. I am hoping to leverage SSI and DIDs and DID proofs to enhance Git in a way that doesn't change its design philosophy but allows the LF to strictly enforce their legal and human protocols. I think if I can solve that, it will also be useful for others.