etotheipi / BitcoinArmory

Python-Based Bitcoin Software
Other
826 stars 619 forks source link

Gitian: Add signassert.py for signing Gitian assert files #306

Closed josephbisch closed 9 years ago

josephbisch commented 9 years ago

@droark

For use in conjunction with gsign from gitian-builder. The associated gitian-builder PR is devrandom/gitian-builder#93.

This is heavily based on signannounce.py.

While doing this, I realized that there needs to be some way to find all the previously built versions of Armory and include their hashes in the file to be signed. That is the way the announce file currently works. This script currently just signs a single assert file for a single version of Armory for a single OS. I'm not sure the best way to handle making a file that is more like the current announce file. I'm not sure that adding functionality to gitian-builder to combine assert files into some announce file format makes sense, since it seems like the exact format would be Armory-specific. I think it might be best to have a script in release_scripts that Armory uses to take a bunch of assert files, verify the signatures on all of them, and then extract the relevant data and put it into the format that the announce file currently uses. Then that final announce file would be signed.

You would then make all the signed announce files from some group of employees available somewhere for the Secure Downloader to download and verify.

The alternative is to make a signed master list of assert files. That way there is no need to combine assert files into an announce file format. Either way would involve downloading multiple files, but this way probably involves more change to the Secure Downloader than the other way.

droark commented 9 years ago

Thanks for the PR. I'm not totally sure if the wallet usage is the best idea, unfortunately. There's something weird about using a wallet to sign everything. I get the underlying idea, which is fine. Perhaps there isn't a better way to do things?

I'm not totally sure of the best way to proceed. Perhaps @etotheipi will have some ideas regarding whether or not wallets are the best solution.

etotheipi commented 9 years ago

Using wallets for signing is how it's done now. Rather, a specific private key within an offline wallet is how we sign announcements now. This is because I obviously want that key protected, and we've implemented private key security about as thoroughly as you can in software through Armory's wallet management.

Jasvet may be going away, but it's what we use right now. When it does go away (when wallet2.0 is done) it won't be difficult to transition. It'll probably just be changing what code we use to sign or verify blocks of text.

Re: "I realized that there needs to be some way to find all the previously built versions of Armory and include their hashes in the file to be signed." I don't think that's necessary (though I don't have full context of why you mentioned it). We want to sign the current version of Armory, and if the git branch hash is included anywhere (directly or indirectly) in what is signed, it covers everything before it.

droark commented 9 years ago

Ahhh, okay. Thanks for clarifying everything. I also like the Git branch hash being included.

josephbisch commented 9 years ago

Regarding the quotation from me about signing a file with the hashes of previously built versions of Armory:

I was looking at dllinks.txt, which I thought was the file Armory looks at to find updates. I see how that file has some older versions of Armory (I guess for people who don't update frequently). So I don't see the connection between this new way of signing assert files and this dllinks.txt file. The assert file makes the connection between the source code at a particular commit and the hash of the resulting binary, but it won't do anything to verify the hash of any older binaries. So it would seem like we still need to make the dllinks.txt file from the assert files, or Armory needs to be able to handle multiple sets of assert files (one set of assert files per OS-version combo we want users to be able to upgrade to). I say sets of assert files, because I see there being multiple signers that are being verified by the Secure Downloader, as I have mentioned before. Maybe there is some simple way that this can all work out that I am not seeing?

And btw, the assert file from Gitian includes the git hash, so that covers signing the git hash by us signing the assert file.

etotheipi commented 9 years ago

dllinks.txt contains some previous versions of Armory, primarily because it seems each new release breaks compatibility with some system that used to work properly. So we wanted to keep those available to people who lose compatibility with the new versions. But it's not really intended to be an archive of previous versions.

Unfortunately, switching to multi-signing everything I think would be too drastic. The current release scripts are complex enough, without having to include multi-signature bundles for everything.

What I'd like to do is start small: use the gitian reproducible builds with multi-signing to verify on the offline computer that the offline key is signing the right thing. In other words, the offline computer will verify the batch of signatures on a given package before allowing it to be included in the standard release process. Then, add the signatures to the release somehow, so that ambitious users can manually verify them if they want.

Someone had mentioned the idea a separate repo that was just for collecting signatures that all signers would have commit access to. I like the idea that we aggregate signatures in that manner:

Repo: etotheipi/armoryreproduciblesigs

This makes all signatures easily accessible to the public, and easily found and queried by software. If you need to transfer to an offline computer, you can just clone the repo and take it with you. Then, at a later point in time, we switch the secure downloader to actually using that. But I don't want to start there, because there's some complexity involved. At least until we update the cleartext message signing format to support multiple sigs in a single file.

droark commented 9 years ago

@etotheipi - The separate repo is what Bitcoin uses for Gitian sigs. I also think it's a good idea. I'm fine with it being replicated for Armory.

etotheipi commented 9 years ago

About arg-parsing: so the whole thing with ArmoryUtils doing irreversible operations upon import was a legacy thing that was never resolved (technical debt in the project). It was done like this forever ago when Armory was still just a personal project, and then it was never fixed to allow ArmoryUtils to be importable without doing all these other things that impact what you want to do from the method that imports (for instance, we can't flip the USE_TESTNET flag for some unittests but not others).

I know that jsong resolved some of this in the wallet2.0/0.93-hdwallet branch. My understanding is that he wrapped up all that stuff into a function, and made that part of an initialization call instead of it happening automatically on import.

However, I believe that even on that branch, you'll still have issues with arguments. A couple times I have come back to this, trying to figure out how to tell ArmoryUtils to ignore arguments that are not explicitly defined there, so that they can be parsed by the module that imported it. Unfortunately, I never found a solution that works -- argparse will fail if the CLI options aren't defined in that list in ArmoryUtils. My solution was the inelegant one: just add the arguments to ArmoryUtils.py permanently, even if they are just used for some isolated use cases. I never quite figured out a clean way to do this. If you find a clean way to do it, let's do it, but I'd say we should just do the inelegant thing and move on for the moment.

josephbisch commented 9 years ago

I decided to follow your advice and just put all the argument parsing stuff in ArmoryUtils.py and move on.

This commit is probably ready to be reviewed and then merged. I think the integration into the existing release process to prevent the release from happening unless there is a certain number of valid signatures that @etotheipi talks about should be a separate PR. I was able to use Armory to verify the signature that the script in this PR made by using the message signing/verification window. However, I think that the verification should be automated and part of the release script or imported from another script, so that the full Armory GUI doesn't have to be launched and so that there is no need to manually copy and paste the signed message into Armory.

I guess I would add the verification before this line in release_scripts/Step2_Offline_PackageSigning.py. And exit out of the script if the verification fails to meet a certain threshold for number of valid signatures.

I also agree with the separate repo for signatures idea. I had actually started putting together a repo a while ago under my account to be forked by the @etotheipi account, but the structure probably needs updating and it also has rather old signatures just based on the current git commit hash at the time rather than on a particular version or pre-release version of Armory. I guess the time to revisit the repo is when the code going into the first 0.94 testing release is set and we have some signatures of that.

droark commented 9 years ago

Hello. Thanks for the update. I'll try to review everything tomorrow but it might be Tuesday, depending on how much work I can get done on the road. Everything sounds good. I just need to check out the details, obviously.

josephbisch commented 9 years ago

I made almost all the changes, except I couldn't figure out how to handle checking to see if the wallet is encrypted and only asking for the passphrase in that case. I looked through PyBtcWallet.py and nothing stuck out to me. Maybe if I look through again, I will see something.

josephbisch commented 9 years ago

I was about to modify the step 2 offline release script to verify the signatures by downloading them from the GitHub repo, when I realized that can't be done on the offline computer. I guess we can transfer the cloned GitHub repo from the online computer to the offline computer (I think Alan mentioned that before)?

etotheipi commented 9 years ago

Sorry I haven't chimed in here yet.

(1) Wallet locking is checked using the .isLocked member. It was never wrapped up in an accessor method, though it probably should've been. Throughout the code it uses "if wlt.useEncryption and wlt.isLocked: ..."

(2) Modify the Step1 script which already clones the main BitcoinArmory project and puts it into the target export directory. Just have it additionally clone the sigs/reproducible repo into the same place (next to where it clones BitcoinArmory)

josephbisch commented 9 years ago

I almost have the initial changes to the script done (I haven't tested it because the repository and signatures don't actually exist yet, so there may need to be some tweaking once those actually exist to test with). In addition to verifying the signatures I need to parse the YAML (assert file) to verify that the hashes match. So I used PyYAML to try to parse the YAML, because it seemed like the best choice. But it had issues with parsing. So I tried pasting the assert file into an online validator (http://yaml-online-parser.appspot.com/) and it gave me the following message:

ERROR:

could not determine a constructor for the tag '!omap'
  in "<unicode string>", line 1, column 5:
    --- !omap
        ^

So it appears that the YAML that the assert file uses is invalid. I will try to find the appropriate person in the Bitcoin community to find out why the YAML appears to be invalid. The Gitian verifier tool is written in Ruby and seems to be able to parse the assert files find based on its source code, but maybe it just is able to parse certain non-valid YAML?

Hopefully @devrandom can shed some light on this (since he wrote the Gitian tools, I assume he at least had some say in the assert file format and would know about this).

josephbisch commented 9 years ago

It seems like adding a second ! to make it !!omap makes the validator happy. Not sure why the assert file uses a single !. The offical YAML website agrees with the use of two ! characters.

devrandom commented 9 years ago

Hm... maybe there's a standardization issue with the ruby YAML generator.

josephbisch commented 9 years ago

I can't figure out why I am having this issue. There seems to be no relevant Ruby bugs. This SO question has an answer that explains the significance of certain characters in YAML, including one vs two exclamation marks. So it sounds like the two marks is the correct usage. I tested it, and Ruby does seem to successfully parse the !!omap as well as !omap. So really my issue comes down to how do I get Ruby to output !!omap instead of !omap.

I guess it is time to create a SO account and ask about this over there.

josephbisch commented 9 years ago

I posted a bug with Psych, because I posted a question on SO and I got a response that thought it looked like a bug with Psych.

josephbisch commented 9 years ago

According to the Psych maintainer, the !omap usage is a legacy thing from the old YAML library (Syck). He said I should be able to work around it by telling PyYAML that !omap is the same as !!omap. I'll still have to work around it, because there is no telling how long it will take people to upgrade, but I did send a PR for Psych (it is very simple, just one line).

So now I will read the PyYAML documentation to find out how to tell it that !omap is !!omap.

josephbisch commented 9 years ago

For future reference when I go to finish up the signing stuff:

I seem to have figured out the correct way to get PyYAML to recognize !omap as being the same as !!omap. I used the following line:

yaml.add_constructor(u'!omap', yaml.constructor.SafeConstructor.construct_yaml_omap)

It outputs !!python/tuple instead of !!omap when dumping the YAML, but we really only need to parse the YAML (just to get the hash of Armory build result), so that isn't a problem.

An example of how to navigate the resulting Python structure:

>>> yaml.load(open('bitcoin-armory-linux-res.yml'))[0][1]
'80eb9f6c7529b74dfe0e36e34fbd44b2d98ca0bb928cdf4bf8dfe4b86001259e  bitcoin-armory-gitian-linux64.zip\n'

If there are multiple build outputs, they will be in the form hash, two spaces, filename. And that will all be newline separated (you can see the \n at the end of the filename in the example.

The specific line that should give us a list of hash-filename pair strings to feed into sha256sum to verify is:

>>> yaml.load(open('bitcoin-armory-linux-res.yml'))[0][1].split('\n')[:-1]
['80eb9f6c7529b74dfe0e36e34fbd44b2d98ca0bb928cdf4bf8dfe4b86001259e  bitcoin-armory-gitian-linux64.zip']
devrandom commented 9 years ago

Great. If you find out a way to get Psych to emit the standard incantation, please let me know, but otherwise sounds like this is a solution.

josephbisch commented 9 years ago

@devrandom - I filed a PR with Psych to fix the issue. It is only that one line if you want to test it locally. I tested the PR. It seems like loading a simple example with a version of Psych based on that PR works as expected for both !!omap and !omap (i.e. the order is retained when either of those are used). So there shouldn't be any issue with loading existing YAML that was generated by Psych.

But I am still going ahead with this workaround, because I don't know how long it will be until a version of Psych with my PR is released and how long until people upgrade.

droark commented 9 years ago

Everything looks okay to me now. Will commit.