Closed drnic closed 7 years ago
This PR has all of https://github.com/starkandwayne/shield-boshrelease/pull/75 fixes included. Thanks @frodenas for the https://github.com/starkandwayne/shield-boshrelease/pull/75/commits/5f7bab2719dd9267fc6e6d13f740495c0b0e3429 idea. I had seen this timing issue once.
@drnic Besides the above comments, I don't understand why you want to add the shield-target-postgres and shield-target-redis jobs. You can do exactly the same by:
1) collocating the shield-agent job (so the backup/restore takes place on the same VM where postgres or redis resides)
2) converting this to a generic errand (that just registers the job)
Another problem that I see with these jobs is that they don't include the shield-agent package, and they register the job using the VM's IP address. So if you don't add the shield-agent, the backup will fail because the shield-daemon will try to contact that IP address and there's no shield-agent to respond.
@frodenas (and thanks @geofffranks for chat) ok I've gotten rid of shield-target-{postgres,redis} - I believe I got lost in the assumed complexity of links; ultimately the two jobs had to connect to postgres/redis without links anyway.
See postgres.yml + redis.yml examples now https://github.com/starkandwayne/shield-boshrelease/tree/links-and-cloud-config/manifests/target-examples
They are using the shield-agent's targets and jobs properties.
I'm concerned that I've not got ssl correctly configured. Should the following curl command work if I got it right; or is my sanity test wrong?
mkdir tmp/ssh
chmod 700 tmp/ssh
./bin/shield-cacert > tmp/ssh/cacert
chmod 600 tmp/ssh/cacert
curl -u admin:$(./bin/shield-password) https://10.244.0.2 --cacert tmp/ssh/cacert -v
The output is:
* Rebuilt URL to: https://10.244.0.2/
* Trying 10.244.0.2...
* Connected to 10.244.0.2 (10.244.0.2) port 443 (#0)
* WARNING: using IP address, SNI is being disabled by the OS.
* SSL: certificate verification failed (result: 5)
* Closing connection 0
curl: (51) SSL: certificate verification failed (result: 5)
@frodenas @geofffranks I believe the PR is updated based on feedback from both of you. Additional round of reviews/feedback is warmly welcome!
I'm now getting this in Chrome. I don't get it in Safari. Ideas?
On Safari
Added testflight-links (to bosh-lite49 with cloud-config)
Rebased against v6.8.0
With the 6.x branch of the BOSH release safely tucked away for the production users, I have no reservations about merging this, if it's ready. @drnic - anything left to do / test / fix on this?
I'd like someone to check the setup of SSL certs. On my own deployments, Chrome is unhappy.
(this is resolved - the issue was that alternative_names: needs to contain the IPs)
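For anyone hitting the same browser/curl complaint, one way to confirm whether an IP made it into the cert is to look at the Subject Alternative Name extension directly. This is only a minimal sketch, assuming openssl is installed and that the server certificate has been saved as a PEM file; the helper name cert_has_ip_san is hypothetical and not part of this release.

```shell
#!/bin/sh
# cert_has_ip_san CERT_FILE IP
# Succeeds if the PEM certificate in CERT_FILE lists IP as a
# Subject Alternative Name entry. (Hypothetical helper for illustration.)
cert_has_ip_san() {
  # Dump the certificate as text, put each comma-separated entry on its
  # own line, trim surrounding spaces, then require an exact line match
  # so that 10.244.0.2 does not also match 10.244.0.20.
  openssl x509 -in "$1" -noout -text \
    | tr ',' '\n' \
    | sed 's/^ *//; s/ *$//' \
    | grep -Fxq "IP Address:$2"
}
```

Usage would be along the lines of cert_has_ip_san server.pem 10.244.0.2 && echo "IP is in the cert", where server.pem is an assumed file name.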
I'm getting this when I deploy the shield.yml to my BOSH-lite:
2017-05-15 14:44:49.368763104 +0000 UTC shieldd: ERROR: worker 5 unable to read user key /var/vcap/jobs/shield-daemon/shared/id_rsa: ssh: no key found; bailing out.
@jhunt when you did your deploy, did you use --vars-store tmp/creds.yml or a bosh with credhub? At a guess, this is an error because your /var/vcap/jobs/shield-daemon/shared/id_rsa file contains nonsense: specifically I bet that file contains the string ((shield-daemon-sshkey.private_key)).
Unfortunately bosh2 deploy & the director do not error if any variables are not provided before the deployment commences.
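Until that is caught upstream, a crude guard is to scan for leftover ((name)) placeholders yourself. The following is a rough sketch rather than anything shipped in this release; the helper name find_unresolved_vars is hypothetical, and it simply greps for the double-parenthesis syntax.

```shell
#!/bin/sh
# find_unresolved_vars FILE
# Prints every line of FILE (with its line number) that still contains a
# ((placeholder)), and succeeds only if at least one was found.
# (Hypothetical helper for illustration.)
find_unresolved_vars() {
  grep -nE '\(\([^()]+\)\)' "$1"
}
```

It could be pointed at the output of bosh2 int before deploying, or at a rendered file on the VM such as /var/vcap/jobs/shield-daemon/shared/id_rsa, for example: if find_unresolved_vars rendered.yml; then echo "missing variables"; fi (rendered.yml being an assumed file name).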
Ok, confirmed that SSL certs are working. Needed a way to add public IP into the cert:
bosh2 deploy manifests/shield.yml \
--vars-store tmp/creds.yml \
-o manifests/operators/tls-alternative-name.yml -v tls-alternative-name=10.58.111.50
And then the following works as expected:
bosh2 int tmp/creds.yml --path /shield-tls/ca > tmp/ssh/cacert-lite50
curl https://10.58.111.50:8443/ -u admin:$(bosh2 int tmp/creds.yml --path /shield-daemon-password) --cacert tmp/ssh/cacert-lite50
@jhunt I feel like this PR is good now.
I've got a draft doc/upgrade-to-cloud-config.md doc; but it focuses on the only known starting point - templates/make_manifest warden - which is not a production upgrade scenario.
So perhaps once we have a shield v7.0.0.rc-1 or similar, we can look at some production upgrades and see what notes would be helpful?
I did not have a credentials store (file or credhub).
Sure would be nice if BOSH would catch these errors. Until then, how do you feel about a change to check for errors in templates?
18:20:57 | Error: Unable to render instance groups for deployment. Errors are:
- Unable to render jobs for instance group 'shield'. Errors are:
- Unable to render templates for job 'shield-daemon'. Errors are:
- Error filling in template 'id_rsa' (line 6: ssh_private_key '((shield-daemon-sshkey.private_key))' does not look like an RSA private key)
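The same idea can be applied after the fact to a file that has already been rendered. This is a shell analogue of the check, not the template change itself; it is a minimal sketch that assumes the key is PEM-encoded with the traditional RSA header, and the helper name looks_like_rsa_key is hypothetical.

```shell
#!/bin/sh
# looks_like_rsa_key FILE
# Succeeds if FILE contains the PEM header of a traditional RSA private
# key. This only checks the header; it does not validate the key.
# (Hypothetical helper for illustration.)
looks_like_rsa_key() {
  grep -Fq -- '-----BEGIN RSA PRIVATE KEY-----' "$1"
}
```

For example, looks_like_rsa_key /var/vcap/jobs/shield-daemon/shared/id_rsa || echo "id_rsa does not look like an RSA private key" would flag a file that still holds an uninterpolated ((shield-daemon-sshkey.private_key)) string.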
Is there a way to generate the credentials (without using credhub) in-repo?
This PR adds bosh links to jobs, a new manifest that assumes cloud-config, and includes two new jobs for simple targeting/backing up of a postgres or redis. This release now assumes using bosh2. The new manifest generates passwords/certs.
To test/demo:
bosh2 instances to get the IP (e.g. 10.244.0.2).
bosh2 int tmp/shield-creds.yml --path /shield-daemon-password to get basic auth password (user is admin)
To add s3 as a store:
To test postgres with backups:
To test redis with backups:
Targets will look like:
These manifests have also been tested with credhub.
Todo