shieldproject / shield-boshrelease

BOSH Release for shield
MIT License

Links, simple manifest and redis/postgres target jobs #76

Closed drnic closed 7 years ago

drnic commented 7 years ago

This PR adds BOSH links to jobs, adds a new manifest that assumes cloud-config, and includes two new jobs for simple targeting/backing up of postgres or redis. This release now assumes bosh2. The new manifest generates passwords/certs.

To test/demo:

git clone https://github.com/starkandwayne/shield-boshrelease -b links-and-cloud-config
cd shield-boshrelease
git submodule update --init
bosh2 create-release
bosh2 upload-release --rebase
bosh2 deploy manifests/shield.yml -d shield --vars-store=tmp/shield-creds.yml

To add s3 as a store:

bosh2 deploy manifests/shield.yml -d shield --vars-store=tmp/shield-creds.yml \
  -o manifests/operators/store-amazon-s3.yml \
  -v s3-access-key=x \
  -v s3-secret-key=x \
  -v s3-bucket=x \
  -v s3-bucket-store-prefix=x

To test postgres with backups:

bosh2 deploy manifests/target-examples/postgres.yml --vars-store=tmp/postgres-creds.yml -d shield-target-postgres

To test redis with backups:

bosh2 deploy manifests/target-examples/redis.yml --vars-store=tmp/redis-creds.yml -d shield-target-redis

Targets will look like:

[screenshot, 2017-05-08 10:35:09 am]

These manifests have also been tested with credhub.

Todo

drnic commented 7 years ago

This PR has all of https://github.com/starkandwayne/shield-boshrelease/pull/75 fixes included. Thanks @frodenas for the https://github.com/starkandwayne/shield-boshrelease/pull/75/commits/5f7bab2719dd9267fc6e6d13f740495c0b0e3429 idea. I had seen this timing issue once.

frodenas commented 7 years ago

@drnic Besides the above comments, I don't understand why you want to add the shield-target-postgres and shield-target-redis jobs. You can do exactly the same by:

1) collocating the shield-agent job (and the backup/restore will take place at the same vm where postgres or redis resides)
2) converting this to a generic errand (that just registers the job)

Another problem that I see with these jobs is that they don't include the shield-agent package, and they register the job using the vm ip address. So if you don't add the shield-agent, the backup will fail because the shield-daemon will try to contact that ip address and there's no shield-agent to respond.

drnic commented 7 years ago

@frodenas (and thanks @geofffranks for chat) ok I've gotten rid of shield-target-{postgres,redis} - I believe I got lost in the assumed complexity of links; ultimately the two jobs had to connect to postgres/redis without links anyway.

See postgres.yml + redis.yml examples now https://github.com/starkandwayne/shield-boshrelease/tree/links-and-cloud-config/manifests/target-examples

They are using the shield-agent's targets and jobs properties.

drnic commented 7 years ago

I'm concerned that I've not got ssl correctly configured. Should the following curl command work if I got it right; or is my sanity test wrong?

mkdir tmp/ssh
chmod 700 tmp/ssh
./bin/shield-cacert > tmp/ssh/cacert
chmod 600 tmp/ssh/cacert
curl -u admin:$(./bin/shield-password) https://10.244.0.2 --cacert tmp/ssh/cacert -v

The output is:

* Rebuilt URL to: https://10.244.0.2/
*   Trying 10.244.0.2...
* Connected to 10.244.0.2 (10.244.0.2) port 443 (#0)
* WARNING: using IP address, SNI is being disabled by the OS.
* SSL: certificate verification failed (result: 5)
* Closing connection 0
curl: (51) SSL: certificate verification failed (result: 5)

drnic commented 7 years ago

@frodenas @geofffranks I believe the PR is updated based on feedback from both of you. Additional round of reviews/feedback is warmly welcome!

drnic commented 7 years ago

[screenshot, 2017-05-09 5:32:02 pm]

I'm now getting this in Chrome. I don't get it in Safari. Ideas?

drnic commented 7 years ago

[screenshot, 2017-05-09 5:33:17 pm]

On Safari

drnic commented 7 years ago

[screenshot, 2017-05-10 11:03:59 am]

Added testflight-links (to bosh-lite49 with cloud-config)

drnic commented 7 years ago

Rebased against v6.8.0

jhunt commented 7 years ago

With the 6.x branch of the BOSH release safely tucked away for the production users, I have no reservations about merging this, if it's ready. @drnic - anything left to do / test / fix on this?

drnic commented 7 years ago

I'd like someone to check the setup of SSL certs. On my own deployments, Chrome is unhappy.

(this is resolved - issue was the need for alternative_names: to contain the IPs)
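
As a rough sketch of how one might confirm this from the command line: the helper below reads the text form of a certificate on stdin and reports whether a given IP appears among its subject alternative names. The name san_has_ip is made up for illustration, and it assumes the usual `openssl x509 -noout -text` layout, where the SAN entries sit on the line right after the "X509v3 Subject Alternative Name" header.

```shell
# Hypothetical helper (not part of this release): succeed if the IP given as $1
# is listed as a subject alternative name in the certificate text on stdin.
# Assumes the input is the output of `openssl x509 -noout -text`.
san_has_ip() {
  grep -A1 'Subject Alternative Name' \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -Fxq "IP Address:$1"   # whole-line match, so 10.244.0 does not match 10.244.0.2
}
```

Something like `openssl s_client -connect 10.244.0.2:443 </dev/null 2>/dev/null | openssl x509 -noout -text | san_has_ip 10.244.0.2` should then succeed only if the served cert lists that IP.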

jhunt commented 7 years ago

I'm getting this when I deploy the shield.yml to my BOSH-lite:

2017-05-15 14:44:49.368763104 +0000 UTC shieldd: ERROR: worker 5 unable to read user key /var/vcap/jobs/shield-daemon/shared/id_rsa: ssh: no key found; bailing out.

drnic commented 7 years ago

@jhunt when you did your deploy, did you use --vars-store tmp/creds.yml or a bosh with credhub? At a guess, this is an error because your /var/vcap/jobs/shield-daemon/shared/id_rsa file contains nonsense: specifically I bet that file contains the string ((shield-daemon-sshkey.private_key)).

Unfortunately bosh2 deploy & director do not error if any variables are not provided before deployment is commenced.
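
Until something catches this automatically, a minimal sketch of a manual check is to look for leftover ((name)) placeholders in a rendered file. The function name has_unresolved_vars is hypothetical, and the pattern only assumes that an unresolved variable still appears literally in double parentheses, as guessed above.

```shell
# Hypothetical helper (not part of this release): succeed if the file at $1
# still contains an unresolved ((variable)) placeholder.
has_unresolved_vars() {
  grep -Eq '\(\([^()]+\)\)' "$1"
}

# Possible usage on the VM, against the path from the error above:
#   has_unresolved_vars /var/vcap/jobs/shield-daemon/shared/id_rsa \
#     && echo "id_rsa still contains a ((placeholder))"
```

The same check could be run locally against the output of bosh2 int before deploying; as far as I know, bosh2 int also accepts --var-errs to fail on missing variables.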

drnic commented 7 years ago

Ok, confirmed that SSL certs are working. Needed a way to add public IP into the cert:

bosh2 deploy manifests/shield.yml \
  --vars-store tmp/creds.yml \
  -o manifests/operators/tls-alternative-name.yml -v tls-alternative-name=10.58.111.50

And then the following works as expected:

bosh2 int tmp/creds.yml --path /shield-tls/ca > tmp/ssh/cacert-lite50
curl https://10.58.111.50:8443/ -u admin:$(bosh2 int tmp/creds.yml --path /shield-daemon-password) --cacert tmp/ssh/cacert-lite50

drnic commented 7 years ago

@jhunt I feel like this PR is good now.

I've got a draft doc/upgrade-to-cloud-config.md doc, but it focuses on the only known starting point, templates/make_manifest warden, which is not a production upgrade scenario.

So perhaps once we have a shield v7.0.0.rc-1 or similar, we can look at some production upgrades and see what notes would be helpful?

jhunt commented 7 years ago

I did not have a credentials store (file or credhub).

Sure would be nice if BOSH would catch these errors. Until then, how do you feel about a change to check for errors in templates?

18:20:57 | Error: Unable to render instance groups for deployment. Errors are:
   - Unable to render jobs for instance group 'shield'. Errors are:
     - Unable to render templates for job 'shield-daemon'. Errors are:
       - Error filling in template 'id_rsa' (line 6: ssh_private_key '((shield-daemon-sshkey.private_key))' does not look like an RSA private key)

jhunt commented 7 years ago

Is there a way to generate the credentials (without using credhub) in-repo?