bitcoinjs / bitcoinjs-lib

A javascript Bitcoin library for node.js and browsers.
MIT License
5.71k stars 2.11k forks source link

[CONTRIBUTING] Release and auditing tooling/policy #886

Closed junderw closed 6 years ago

junderw commented 7 years ago

One reason I sign my commits as much as possible is because I make it a habit so that if someone sees a commit with me as the author without my signature, they can assume I didn't verify the commit.

If possible, merge my commits without re-committing them (thus removing the signatures) please.

EDIT The below is my summary of the issue based on the rename of the issue to "Set Logic for Merging Pull Requests"


Reasoning

  1. The reason why pull requests exist is to allow for maintainers to review the branch, and merge it.
  2. We can be sure you have merged it only if we A. trust Github or B. you GPG sign a merge (or squash) commit and we can verify your GPG key.
  3. Currently this repo is based on the "trust Github" model, and should be changed.

Proposal

GPG

  1. All maintainers of the repo should create GPG keys and make them public. Upload them to the repo under a folder called "pubkeys" or something just for redundancy, as well as making sure to update them to the keyservers like mit.
  2. All maintainers should verify each other's Fingerprints in a way where a man in the middle attack is not possible (or highly unlikely) and sign each others keys, uploading the signatures to the keyservers.
  3. If any one maintainer has the chance to get their pubkey signed by a respected member of the community, do so... it will extend their web of trust into the bitcoinjs-lib maintainers.
  4. When possible, all commits and tags should be GPG signed (this requires performing the actions locally, as Github's web UI does not and should never support GPG signing in the browser.)
  5. When merging a pull request, ensure they follow pull request guidelines, and merge locally using git merge --no-ff -S theiruser/theirbranch and push to the branch they were requesting to. (-S is not needed if you set git config --global commit.gpgsign 1) This is the "merge commit" method... which I think helps keep the chain of responsibility in the git tree itself. Rebasing does not leave a record of who rebased which commits into the tree, and when they did it.

Pull Requests

  1. All pull requests should be cleaned before merging. (ie. five "fix" commits in a row modifying the same value over and over should be squashed into one commit, etc.)
  2. Large commits are fine, but only if they all serve one purpose. If not, they should be split into multiple commits.
  3. Renaming a file and changing that file in the same commit is hard to review, and should be avoided. Rename in one commit, change in another commit.
  4. All merge conflicts should be resolved before merging, and if two pull requests are conflicting, merge one, then ask the author of the other to rebase on master and resolve any merge conflicts.
dcousens commented 7 years ago

@junderw I use "Rebase merging" instead of merge commits, however I agree the result isn't great if it re-writes the author. Your commits should be able to rebased without requiring "re-commit"ing them.

I'll look into a solution.

dabura667 commented 7 years ago

The Github "Rebase merging" is basically rebasing my branch on master rather than merging it with a merge commit.

If you rebase onto master, you will be required to recommit (as git, similar to blockchains, commits depend on the parent hash, so my signature will definitely become invalid)

In the Github UI, it shows my picture large and your picture small in the lower right corner.

This means that the commit's "Author" is me, and the "Committer" is you. (You'll also notice some of your own commits have your own picture twice... that's because the user.name of the Author and Committer are both dcousens, but the user.email of the two differ. If you make sure the user.email attribute when coding and the primary email of your github account are the same it should fix your commits from being two icons)

Unfortunately, the Github web interface doesn't support GPG signing... only showing the verified status of GPG signed commits pushed to the site.

I trust Github... and I trust you... but I trust GPG as software more than I trust Github as software, and GPG is one of the only ways I can verify you committed something and vice versa.

(remember, Github is not exactly immune to vulnerabilities: https://arstechnica.com/information-technology/2012/03/hacker-commandeers-github-to-prove-vuln-in-ruby/ While something like a hacker gaining access to the bitcoinjs-lib repo and rewriting history to include a backdoor committed under your name, could easily be caught if someone tried to pull master and got the error that the history has been rewritten... but most people would just assume you had to fix something, and just force pull anyways...)

dabura667 commented 7 years ago

https://stackoverflow.com/questions/5520425/how-can-i-rebase-a-commit-made-by-another-author-without-adding-myself-as-the-co

Looks like if you are fast forwarding (ie, my branch was based on the current master HEAD) there are options in the git command line to maintain the authorname authoremail and authordate as the commit data, which will keep the SHA1 hashes the same...

However, if my branch forked any time earlier than current master HEAD, the SHA1 hashes will change no matter what.

Not sure if the above options will leave GPG signatures intact... I'll have to test out...

dabura667 commented 7 years ago

requiring a rebase to master HEAD for every PR after each PR is merged is kind of tedious though...

junderw commented 7 years ago

Actually... it looks like Github's "rewrite the committer" method is unique to Github, no default setting of merge or rebase will produce the same results as rebase and merge in Github UI.

https://help.github.com/articles/about-pull-request-merges/#rebase-and-merge-your-pull-request-commits

dcousens commented 7 years ago

Lame.

I guess I could rebase locally? It marks the PR as merged when pushed anyway IIRC.

junderw commented 7 years ago

I guess I could rebase locally?

You could try it on the maximumfee PR...

so you would want to be on master, make sure it was up to date with origin, then rebase my branch..

git checkout master
git pull origin master
git add remote junderw git@github.com:junderw/bitcoinjs-lib.git
git fetch junderw
git rebase junderw/fixmaxfees

Then check git log via git log --show-signature to check if my commit signatures are intact...

dcousens commented 7 years ago

@junderw ooi, in future, could you put your forks in local branches to avoid adding the remotes? :) (on average)

junderw commented 7 years ago

umm... I have write permissions to this repo? TIL...

junderw commented 7 years ago

I tested, and it seems to work...

So I'm guessing there will need to be new rules for Pull requests:

  1. Must rebase to master HEAD before merging.
  2. Must merge via git rebase locally.

btw, you can automate the whole process of adding and removing remotes if you want using a bash function. (see my post in slack for an example I gave that uses forced merge commits that are GPG signed with -S )

you could make

gitrebase() { git remote add $2 git@github.com:$2/$1.git; git fetch $2; git rebase $2/$3; git remote remove $2; }

Usage: gitrebase bitcoinjs-lib junderw fixmaxfees

junderw commented 7 years ago

My proposal (I editted the main OP message):

Reasoning

  1. The reason why pull requests exist is to allow for maintainers to review the branch, and merge it.
  2. We can be sure you have merged it only if we A. trust Github or B. you GPG sign a merge (or squash) commit and we can verify your GPG key.
  3. Currently this repo is based on the "trust Github" model, and should be changed.

Proposal

GPG

  1. All maintainers of the repo should create GPG keys and make them public. Upload them to the repo under a folder called "pubkeys" or something just for redundancy, as well as making sure to update them to the keyservers like mit.
  2. All maintainers should verify each other's Fingerprints in a way where a man in the middle attack is not possible (or highly unlikely) and sign each others keys, uploading the signatures to the keyservers.
  3. If any one maintainer has the chance to get their pubkey signed by a respected member of the community, do so... it will extend their web of trust into the bitcoinjs-lib maintainers.
  4. When possible, all commits and tags should be GPG signed (this requires performing the actions locally, as Github's web UI does not and should never support GPG signing in the browser.)
  5. When merging a pull request, ensure they follow pull request guidelines, and merge locally using git merge --no-ff -S theiruser/theirbranch and push to the branch they were requesting to. (-S is not needed if you set git config --global commit.gpgsign 1) This is the "merge commit" method... which I think helps keep the chain of responsibility in the git tree itself. Rebasing does not leave a record of who rebased which commits into the tree, and when they did it.

Pull Requests

  1. All pull requests should be cleaned before merging. (ie. five "fix" commits in a row modifying the same value over and over should be squashed into one commit, etc.)
  2. Large commits are fine, but only if they all serve one purpose. If not, they should be split into multiple commits.
  3. Renaming a file and changing that file in the same commit is hard to review, and should be avoided. Rename in one commit, change in another commit.
  4. All merge conflicts should be resolved before merging, and if two pull requests are conflicting, merge one, then ask the author of the other to rebase on master and resolve any merge conflicts.
dcousens commented 7 years ago

Currently this repo is based on the "trust Github" model, and should be changed.

We have cross-dependencies that make this model near impossible to alleviate entirely. We trust npm, and GitHub, and Node js, and browserify (if you use it).

Isolating our merge/code security from GitHub only helps for the commits in this repository, not in our dependency tree. Our commits are arguably the easiest to re-verify too!

Without a monolothic code base (e.g, dependencies committed), this is not an easy problem to tackle.

In fact, our best course of action, as of now, is simply verifying the commit hashes and their contents locally at each release/merge and locking your dependencies down.

gpg helps only if retrieving fresh from GitHub, and you worry about MITM. I could easily sign a commit hash at each release and that would be equivalent.

junderw commented 7 years ago

gpg helps only if retrieving fresh from GitHub, and you worry about MITM.

I could easily sign a commit hash at each release and that would be equivalent.

https://mikegerwitz.com/papers/git-horror-story

I think the "since our dependencies don't do it, we can't be 100% safe, so adding something that would take us from being 0% safe to a non-zero X% safe is pointless." mindset is not the right mind to have. Especially when the only overhead is a one-time setup of GPG.

If that were so, Bitcoin Core shouldn't GPG sign anything because they use standard libraries from C++ and someone can't verify that there's no backdoor in std libraries.

If you are only signing tags (aka signing releases) you should be verifying every commit between the last signed release and your current HEAD. Which is a lot more overhead than just setting up GPG to auto-sign each commit (now you can know cryptographically that no one has modified a previous commit. You could even automate the GPG password entry using gpg-agent and pinentry-mac. (or the gnu keychain for Linux) (though I would recommend against automating it. Just let gpg-agent cache the password for 90 seconds or something just in case you need to rebase and process multiple commits in one command)

dcousens commented 7 years ago

If that were so, Bitcoin Core shouldn't GPG sign anything because they use standard libraries from C++ and someone can't verify that there's no backdoor in std libraries.

To be fair, Bitcoin Core commits its dependencies as sub-trees. Maybe my example of us depending on Node.js could be considered over the top, but, that ecosystem changes considerably faster than any gcc/g++ and stdlib version that Bitcoin Core would use.

is pointless

Maybe I gave the wrong impression, I didn't mean to say it is pointless. I simply meant, we have other attack vectors too and this is the least likely of them IMHO.

Potentially providing an in-repo script to at least verify the dependencies against their GitHub counterparts would help... then GPG could be marginally useful for verifying the dependency hashes AND the code here.

Everything we do to improve the safety of our users and ourselves, is worthwhile.

dcousens commented 6 years ago

Could we have TRAVIS verify the npm publication against a committed checksum? Ideally we'd then want to TRAVIS/whatever to re-trigger if a new published version is found.

Additionally, we have few enough dependencies for this library that we could maintain a set of safe hashes for them... ?

dcousens commented 6 years ago

See https://github.com/bitcoinjs/bitcoinjs-lib/pull/1042 for further discussion

dcousens commented 6 years ago

I'm re-opening this as it wasn't completely satisfied by https://github.com/bitcoinjs/bitcoinjs-lib/pull/1042, and we do want to maybe introduce some tooling or documentation about how to audit/verify the repository against npm or a package management provided bundle.

GPG, SHA256 sums, or otherwise.

dcousens commented 6 years ago

Continued the discussion around this. What a difficult problem.

Without a release binary or self-contained package that we actually release... this is damn difficult. External dependencies are the problem, despite their benefits. The best we can do is continue to advise users to trust nothing, review everything.