AgileVentures / projectscope

MVP dashboard for ProjectScope, using new gems architecture developed by AV folks

Chore/encrypted attributes #10

Closed armandofox closed 8 years ago

armandofox commented 8 years ago

Added use of attr_encrypted gem to encrypt API keys (metric config) and raw data sample values while at rest. The use of the gem is transparent - you can pretend they are regular ActiveRecord attribute columns. BUT it does require a migration to rename/add database columns, AND (important) a change to the Travis config to support decrypting the file in CI. Once this PR is merged, the Travis change (described in README.md) will have to be performed before it passes CI, and the Heroku change (described in README.md) will have to be performed when the app is first deployed to Heroku.

tansaku commented 8 years ago

For reference the CI is currently failing on:

0.08s$ gpg --passphrase "$APP_SECRET" --output config/application.yml --decrypt config/application.yml.asc
gpg: AES encrypted data
gpg: gpg-agent is not available in this session
gpg: encrypted with 1 passphrase
gpg: decryption failed: bad key
The command "gpg --passphrase "$APP_SECRET" --output config/application.yml --decrypt config/application.yml.asc" failed and exited with 2 during .

tansaku commented 8 years ago

the README currently specifies

Managing the app secret

The file config/application.yml.asc is a symmetric-key-encrypted YAML file that itself contains the encryption keys for encrypting sensitive database attributes at rest. It is safe to version this file.

Let us call the key used to encrypt this file the "main secret".

This means the file config/application.yml must be created on the fly (and should never be versioned), by decrypting application.yml.asc with the main secret, to run the app or its tests. Here's how to do it:

export APP_SECRET=<the application secret goes here>
gpg --passphrase "$APP_SECRET" --output config/application.yml --decrypt config/application.yml.asc

If the value of APP_SECRET or any of the values in config/application.yml are changed, you must:

  1. regenerate config/application.yml.asc with `gpg --passphrase "$APP_SECRET" --output config/application.yml.asc --armor --encrypt config/application.yml`
  2. commit the new version of config/application.yml.asc
  3. if $APP_SECRET changed, change the value of the corresponding environment variable in CI, as above

tansaku commented 8 years ago

@armandofox does this mean we have to receive a copy of APP_SECRET from you in order to work with this? If so this appears to create a similar situation to the one we have on autograders where only "blessed" developers can run the test suites green locally ...

I just made all the tests pass locally by adding the following:

ENV['attr_encrypted_key'] = '123456789012345678901234567890123456789012345678901234567890'

to config/environments/test.rb (that's a kind of hack - there's a rails config gem to do things like that more cleanly if you think it makes sense to go this way: https://github.com/railsconfig/config )

Would this (or something similar) be sufficient for the testing (and dev?) environments in order to make the CI simpler and avoid a barrier to entry for other OS developers?

I'll push my alternative to another branch to see if it works on CI ... see #12 - it's green

tansaku commented 8 years ago

sorry the below is long - I'm partly just trying to document my entire process for back reference - if I can work out a really succinct summary I'll try and share that in my best guess as to the best medium for comms.

I'm not necessarily expecting you to read all this - my style for solving problems is to talk or write through them as I go - apologies if it's a frustrating wall of text.

So now that I have the secret key, I've decrypted the application.yml.asc as per the instructions and the tests failed with the following (locally):

key must be 32 bytes or longer (ActionView::Template::Error)
./app/models/config.rb:7:in `block in <class:Config>'
./app/helpers/projects_helper.rb:5:in `block in setup_metric_configs'
./app/helpers/projects_helper.rb:3:in `each'
./app/helpers/projects_helper.rb:3:in `setup_metric_configs'
./app/views/projects/_form.html.haml:2:in `_app_views_projects__form_html_haml__2013861216600769285_70108392959980'
./app/views/projects/new.html.erb:3:in `_app_views_projects_new_html_erb___1136245759546428509_70108411221700'
features/add_project_with_config.feature:12:in `And I fill in "Name" with "Test Project"'

I've fixed that by extending the length of the test key, and then I re-encrypted the application.yml with a longer test key, as per the README instructions. I note I had to go through the following process to set up a local key, after encountering this error:

→ gpg --passphrase "$APP_SECRET" --output config/application.yml.asc --armor --encrypt config/application.yml
You did not specify a user ID. (you may use "-r")

Current recipients:

Enter the user ID.  End with an empty line: 
gpg: no valid addressees                     
gpg: config/application.yml: encryption failed: no such user id

fixed with the following

→ gpg --gen-key
gpg (GnuPG) 1.4.20; Copyright (C) 2015 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection? 1
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 
Requested keysize is 2048 bits   
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 
Key does not expire at all
Is this correct? (y/N) y

You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) <heinrichh@duesseldorf.de>"

Real name: Sam Joseph
Email address: tansaku@gmail.com
Comment: tansaku                
You selected this USER-ID:
    "Sam Joseph (tansaku) <tansaku@gmail.com>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
You need a Passphrase to protect your secret key.    

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
.+++++
+++++
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
+++++
.....+++++
gpg: key 341352D4 marked as ultimately trusted
public and secret key created and signed.

gpg: ... <snip>

[tansaku@Samuels-MacBook-Pro:~/Documents/GitHub/AgileVentures/projectscope_mvp (11_add_encrypted_attributes)]$ 
→ gpg -r tansaku@gmail.com --passphrase "$APP_SECRET" --output config/application.yml.asc --armor --encrypt config/application.yml
File `config/application.yml.asc' exists. Overwrite? (y/N) y

I wonder if things aren't a little more complicated than they need to be. Or maybe we're just exposing my personal flaws in terms of gpg familiarity :-) I guess with the right test key encrypted, other developers' PRs will pass, but they won't be able to get green builds locally without the APP_SECRET - we found a similar setup was a barrier to onboarding OS developers on autograders and it's one of the reasons autograders is difficult to support as a project. (sorry if I'm failing to draw together all these notes in one place - I'm just thinking out loud as I work through the setup). I've spent an hour or two on this now (and I've set the APP_SECRET on travis), but the build is still failing with:

Setting environment variables from repository settings
$ export APP_SECRET=[secure]
...
0.02s$ gpg --passphrase "$APP_SECRET" --output config/application.yml --decrypt config/application.yml.asc
gpg: encrypted with RSA key, ID 4B0B14CD
gpg: decryption failed: secret key not available
The command "gpg --passphrase "$APP_SECRET" --output config/application.yml --decrypt config/application.yml.asc" failed and exited with 2 during .

Now maybe I did the wrong thing locally when I created a key associated with myself, or something about the way I set the travis APP_SECRET is wrong, but I'm currently blocked by this in terms of getting CI to pass.
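
One possible reading of the two different failures, offered as an interpretation of the logs above rather than a confirmed diagnosis: the original file was passphrase-encrypted ("AES encrypted data ... encrypted with 1 passphrase"), whereas `--encrypt` selects public-key mode, which would explain why gpg asked for a user ID and why Travis then reports "encrypted with RSA key ... secret key not available". GnuPG's passphrase-only mode is `--symmetric`. The sketch below only builds the two command lines as argument arrays. The helper names are hypothetical, the passphrase is a placeholder, and `--batch` is added on the assumption that the commands run non-interactively (GnuPG 2.1 and later typically also need `--pinentry-mode loopback` for `--passphrase` to be honored).

```ruby
# Hypothetical helpers that only build argument lists; nothing is executed here.

# Passphrase-based (symmetric) encryption: no recipient key pair is involved.
def gpg_symmetric_encrypt_args(passphrase, plain_path, encrypted_path)
  ['gpg', '--batch', '--passphrase', passphrase,
   '--output', encrypted_path, '--armor', '--symmetric', plain_path]
end

# Decryption mirrors the command the README already uses, plus --batch.
def gpg_decrypt_args(passphrase, encrypted_path, plain_path)
  ['gpg', '--batch', '--passphrase', passphrase,
   '--output', plain_path, '--decrypt', encrypted_path]
end

# Placeholder passphrase for illustration; in practice this would be $APP_SECRET.
ENCRYPT_ARGS = gpg_symmetric_encrypt_args(
  'example-passphrase', 'config/application.yml', 'config/application.yml.asc'
)
DECRYPT_ARGS = gpg_decrypt_args(
  'example-passphrase', 'config/application.yml.asc', 'config/application.yml'
)

puts ENCRYPT_ARGS.join(' ')
puts DECRYPT_ARGS.join(' ')
# To actually run one of them: system(*ENCRYPT_ARGS)
```

Passing the arguments as an array to `system` avoids shell quoting problems if the passphrase contains spaces or special characters.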

I might be wrong, but couldn't we just specify the production attr_encrypted_key in the Heroku config vars (and then set dummy vars for test/development), rather than APP_SECRET? And then we wouldn't need an encrypted application.yml at all?

I think the feature we are working on might be described like so:

As a projectscope admin
So that sensitive data in project metric config is not exposed
I want to ensure that project metric config is encrypted when stored in the database

So the critical thing (as I think you pointed out), is that if we're supporting this with attr_encrypted, then what we must do is not share the production attr_encrypted key with anyone. However, that can be set by hand in the Heroku environment, and that's the only place it's actually needed right? The attr_encrypted key for development and test (inc. on Travis) can be anything as long as they are the right length.
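
The split described above could be sketched as follows: production refuses to start without a real key, while development and test fall back to a throwaway one. This is an illustration, not code from the PR. The method name `attr_encrypted_key_for` and the dummy value are hypothetical; only the `attr_encrypted_key` variable name comes from the thread.

```ruby
# Throwaway key for non-production use; 32 bytes to satisfy the length check.
DUMMY_KEY = ('0123456789' * 4)[0, 32]

# Hypothetical helper. `rails_env` is the environment name and `env` is a
# Hash-like object such as ENV.
def attr_encrypted_key_for(rails_env, env)
  if rails_env == 'production'
    # A KeyError here means the config var was never set on the server.
    env.fetch('attr_encrypted_key')
  else
    # Development and test use the real value if present, else the dummy.
    env.fetch('attr_encrypted_key', DUMMY_KEY)
  end
end

puts attr_encrypted_key_for('test', {}).bytesize # prints 32
```

Failing loudly in production is a deliberate choice in this sketch: silently falling back to a publicly known dummy key there would defeat the purpose of encrypting at rest.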

Maybe there are other concerns/features here that make us want to have the attr_encrypted key encoded in the application.yml and require an APP_SECRET to unlock. A clear advantage of figaro and an application.yml is that we can easily push that attr_encrypted key to a new heroku instance, but how often will we want to roll out new heroku instances, and actually wouldn't it be more secure if they each had different attr_encrypted keys? Perhaps you're really keen that the key is in GitHub version control? I feel like Heroku config vars are pretty safe things - but good to have a backup I guess?

I guess other stories might include:

As a maintainer of the OS projectscope project
So that it can be maintained in the long term
I'd like OS developers who discover the repo to be able to check out and run the tests green locally

As a projectscope admin
So that I don't lose the ability to decrypt and/or change project metric configs
I want to ensure that relevant keys are stored in multiple locations (e.g. heroku and github)

As a projectscope admin
So that I can quickly roll out new instances of projectscope with interchangeable databases
I want to ensure that the same APP_SECRET key can be simply deployed to new projectscope instances

apologies for the length of this, or for lack of SMARTness in the stories. @mtc2013 is keen for me to get integrating his new gems. I can start on that in dev/test by essentially removing the .travis dependency on gpg to get a green build, setting the attr_encrypted env var directly as I've done in the other PR ... I guess what I should probably do next is check that I can get a deploy on heroku ... (maybe make a staging server) if I can do that I'll just go on with some integration attempts ...

So should I not post this until I've checked heroku? I'm travelling so I can easily get interrupted ... I'll log this and follow up when I can ...

tansaku commented 8 years ago

Just deployed a staging server:

accessible here: https://projectscope-mvp-staging.herokuapp.com/projects

used the following to insert the attr_encrypted var:

$ figaro heroku:set -e production --remote staging

but I could just as easily have set it by hand - and I had to set the GITHUB_ACCESS_TOKEN by hand. If we do end up with a lot of sensitive tokens in application.yml then figaro provides a nice way to set them, but the approach I'm used to is just having them in .env files (which are not checked in) and then setting them by hand since it's relatively rare that we roll out complete new instances.

I used to use the heroku_secrets gem but it's not really being maintained (perhaps we should pull that into AV - since the PRs aren't being merged) ...

I guess the advantage of encrypting application.yml is that we have a copy of it in version control and thus a mechanism to share the vars (rather than just pinging sets of them over slack in .env or .yml files)

Best of both worlds is having application.yml.asc checked in if we want to share around lots of tokens for things like github, and have the test/devel keys hard coded so that the average OS developer can get a green build ... but the sticking point for CI ATM is decrypting the application.yml.asc, which I think we don't actually need ... and also we should probably be using our own independent keys - I don't think there's any reason we all need to have access to the same attr_encrypted key unless we want to be exporting the heroku db and running with it locally (but everyone who has access to the heroku app will be able to export that if they need it).

Sorry, more just thinking out loud ...

armandofox commented 8 years ago

i'll try to follow up in detail later, but i think you may be doing something more complex than needed.

i've used this scheme on several apps and it's much simpler than what i see above.

the .asc file is created once; its symmetric key is distributed to the developers and seeded into config vars on Heroku and CI; and in general that file never needs to be touched again, nor do any of the gpg steps associated with it need to be repeated, unless some of the keys in the config file itself have reason to change (or new ones are added), which should be quite rare.

so something's wrong. i recognize i may have erred in seeding keys in there that are less than 32 bytes, but other than that error, this should be a quite easy process.
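
For reference, a minimal sketch of generating and checking a throwaway key against the 32-byte minimum. It assumes the check is on byte length, as the "key must be 32 bytes or longer" message earlier in the thread suggests, and that a hex string is an acceptable key format. The helper name `long_enough?` and the constants are hypothetical.

```ruby
require 'securerandom'

MIN_KEY_BYTES = 32 # taken from the error message quoted earlier in the thread

# Hypothetical helper: true when the key is at least MIN_KEY_BYTES bytes long.
def long_enough?(key)
  key.to_s.bytesize >= MIN_KEY_BYTES
end

# SecureRandom.hex(n) returns 2 * n hex characters, so hex(32) yields 64 bytes.
GENERATED_KEY = SecureRandom.hex(32)
SHORT_KEY = 'too-short-key' # 13 bytes

puts long_enough?(GENERATED_KEY)
puts long_enough?(SHORT_KEY)
```

Running a check like this before encrypting the config file would catch an undersized key before it reaches CI or a deployed app.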

tansaku commented 8 years ago

@armandofox I can see how this is quite an easy process in principle, however I am concerned that it also introduces a potential block to OS developers that we don't "bless" with the app secret.

However, by adding the test env var for attr_encrypted as I do in the other PR, I think I get round that problem, and also bypass the problem I'm having getting this to work with Travis.

That PR is built directly off this, so I'll close this one and merge that other one. I think to all intents and purposes we should get the best of both worlds ...

armandofox commented 8 years ago

the fundamental issue is that app data such as a global github API key is, in fact, a secret. whoever has a Github API key generated by me can act as me on Github (with respect to certain actions at least). so we do need to at least marginally 'vet' whoever works on the app, even if 'vetting' consists of saying 'you seem OK, here's the app secret'.

another alternative: for developing locally and testing, we can provide a checked-in config/application.yml containing secrets DIFFERENT from those used in CI and production (for production and CI, Travis and Heroku will have a known value for the key that decrypts config/application.yml.asc, so no obstacle there). basically someone would have to be willing to version in cleartext a "disposable" set of API keys to be used by the app. i'm not sure i'm willing to do that if the key was generated with my credentials.

i don't think it's too big a deal to share a single secret with developers who are "blessed" to contribute, but if you want to remove that obstacle, feel free to provide a global Github/etc API key belonging to your account :-)