So this looks kind of ham-handed, but it will get the job done for anyone who is trying to use AWS_ENDPOINT instead of AWS_REGION
I would like to see a design document to go with this change per https://docs.teamhephy.info/contributing/design-documents/
(We should nonetheless maintain a merged branch for those with S3 endpoints that are not in AWS regions, IMHO)
@kingdonb I put a design doc here: https://github.com/teamhephy/workflow/issues/72 - let me know if that makes sense, or if anything is missing!
@rwos, you will have to rebase from master, as we just merged #4, which changed the base image and some scripts when we switched to the official postgresql image.
@Cryptophobia rebased!
Thanks, I hope to get to this soon.
I will be happy to test this on DigitalOcean tonight, if I can get the cluster to deploy this time
Let us know how it goes @kingdonb :popcorn: :movie_camera:
If I don't get around to the testing this evening, then definitely this weekend. This is the only thing stopping me from declaring DigitalOcean K8s support as "basically production-ready," in an experimental sense at least.
I am excited to test this! Can anyone report that you've already shown this to be working in a cluster? I'm assuming from the discussion that it is so, but haven't seen it for myself yet.
There are a number of components that require the S3 API, and they do not all use it in the same way; I'd like to test them all together, if possible. But this is the big one.
@kingdonb FWIW, I did have all the parts running in a cluster, and I did run the end-to-end tests against them, with an off-cluster S3 (Virtuozzo Storage at first, currently Minio). Non-production setup though, entirely possible that there's still stuff I have overlooked.
I'm building it now... I just noticed DEIS_REGISTY in the Makefile.
2018-11-22 22:18:53.001 UTC [35] LOG: database system is ready to accept connections
done
server started
CREATE DATABASE
CREATE ROLE
/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/001_setup_envdir.sh
/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/002_create_bucket.sh
Traceback (most recent call last):
File "/bin/create_bucket", line 47, in <module>
if not bucket_exists(conn, bucket_name):
File "/bin/create_bucket", line 28, in bucket_exists
bucket = conn.lookup(name)
AttributeError: 'NoneType' object has no attribute 'lookup'
It is possible that I built it wrong, or made some other configuration error... I still don't have CI in place to test a change like this.
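For anyone else who hits this: I haven't traced the whole script, but in boto 2 that NoneType error usually means the connection helper gave up before lookup() was ever called. A minimal sketch of what I mean, assuming boto 2 and hypothetical handling of the environment variables (the real create_bucket script may differ):

```python
# Hedged sketch (boto 2, hypothetical env handling). boto.s3.connect_to_region()
# returns None for a region string it does not recognise, and every later call
# such as conn.lookup(name) then fails with "'NoneType' object has no attribute".
import os

import boto.s3
from boto.s3.connection import OrdinaryCallingFormat, S3Connection

region = os.environ.get("AWS_REGION", "")
endpoint = os.environ.get("S3_ENDPOINT", "")  # e.g. nyc3.digitaloceanspaces.com

if endpoint:
    # Explicit host: works for S3-compatible services outside AWS.
    conn = S3Connection(host=endpoint, calling_format=OrdinaryCallingFormat())
else:
    conn = boto.s3.connect_to_region(region)  # None if the region is unknown

if conn is None:
    raise SystemExit("no S3 connection; check AWS_REGION / S3_ENDPOINT")
```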
I will try your prebuilt images, @rwos
I tried again after reading the patch and realizing I probably was meant to include "https://" in my S3_ENDPOINT value.
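For reference, a minimal sketch (not necessarily what the patch does) of how a URL-style S3_ENDPOINT could be split into the bare host, port, and is_secure flag that boto 2's S3Connection actually wants:

```python
# Hedged sketch: splitting "https://nyc3.digitaloceanspaces.com" into the
# pieces boto 2 expects. The helper name here is hypothetical.
from urllib.parse import urlparse

def split_endpoint(endpoint):
    # Tolerate a bare hostname by assuming https when no scheme is given.
    parsed = urlparse(endpoint if "://" in endpoint else "https://" + endpoint)
    is_secure = parsed.scheme != "http"
    port = parsed.port or (443 if is_secure else 80)
    return parsed.hostname, port, is_secure

print(split_endpoint("https://nyc3.digitaloceanspaces.com"))
# ('nyc3.digitaloceanspaces.com', 443, True)
```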
This time I get (after some time elapses):
CREATE DATABASE
CREATE ROLE
/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/001_setup_envdir.sh
/docker-entrypoint.sh: running /docker-entrypoint-initdb.d/002_create_bucket.sh
Traceback (most recent call last):
File "/bin/create_bucket", line 52, in <module>
conn.create_bucket(bucket_name, location=region)
File "/usr/local/lib/python3.5/dist-packages/boto/s3/connection.py", line 619, in create_bucket
data=data)
File "/usr/local/lib/python3.5/dist-packages/boto/s3/connection.py", line 671, in make_request
retry_handler=retry_handler
File "/usr/local/lib/python3.5/dist-packages/boto/connection.py", line 1071, in make_request
retry_handler=retry_handler)
File "/usr/local/lib/python3.5/dist-packages/boto/connection.py", line 1030, in _mexe
raise ex
File "/usr/local/lib/python3.5/dist-packages/boto/connection.py", line 943, in _mexe
request.body, request.headers)
File "/usr/lib/python3.5/http/client.py", line 1107, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python3.5/http/client.py", line 1152, in _send_request
self.endheaders(body)
File "/usr/lib/python3.5/http/client.py", line 1103, in endheaders
self._send_output(message_body)
File "/usr/lib/python3.5/http/client.py", line 934, in _send_output
self.send(msg)
File "/usr/lib/python3.5/http/client.py", line 877, in send
self.connect()
File "/usr/lib/python3.5/http/client.py", line 1261, in connect
server_hostname=server_hostname)
File "/usr/lib/python3.5/ssl.py", line 385, in wrap_socket
_context=self)
File "/usr/lib/python3.5/ssl.py", line 760, in __init__
self.do_handshake()
File "/usr/lib/python3.5/ssl.py", line 996, in do_handshake
self._sslobj.do_handshake()
File "/usr/lib/python3.5/ssl.py", line 641, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:720)
At this point I believe there is some fine-tuning needed, at least, or documentation about the feature... if I remove the "https" and use an unencrypted s3 endpoint, it does appear to reach the bucket.
I exec'd into the container to see if I could diagnose it, and since we're using the upstream image, wget and curl are missing. Fortunately the failure takes some time to time out, so I had enough time to install them... and to find that the ca-certificates package is missing as well.
Without the encryption, I get: ssl.SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:720)
I'd say the client is trying to enforce SSL, which is probably good. So chances are, simply adding the ca-certificates package to this image is going to resolve the situation.
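A quick way to confirm this from inside the container, since curl and wget aren't available; this sketch assumes a DigitalOcean Spaces hostname and only exercises the TLS handshake:

```python
# With no ca-certificates package installed, the default SSL context has no
# trust roots, so the handshake fails exactly like the traceback above.
import socket
import ssl

host = "nyc3.digitaloceanspaces.com"   # assumption: your Spaces endpoint
ctx = ssl.create_default_context()     # loads the system CA bundle, if present

try:
    with socket.create_connection((host, 443), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            print("handshake ok:", tls.version())
except ssl.SSLError as err:
    print("handshake failed:", err)    # CERTIFICATE_VERIFY_FAILED without CAs
```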
That was the issue, now I have:
<?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidArgument</Code><Message>Server Side Encryption with KMS managed key requires HTTP header x-amz-server-side-encryption : aws:kms</Message><BucketName>hephy-rocks-deis-database</BucketName><RequestId>tx00000000000000616c94e-005bf87191-79b1a-sfo2a</RequestId><HostId>79b1a-sfo2a-sfo</HostId></Error>
It's going to be hard to ensure the database backup is KMS-encrypted when I'm using DigitalOcean Spaces... there is a line in rootfs/patcher-script.d/patch_wal_e_s3.py which requests this encryption if you asked for "s3" in DATABASE_STORAGE, which I assume is set from global.storage in values.yaml.
I've just converted it to a blanket "False" for my own testing, and we'll see if that makes it usable. I wonder if this has changed at all in newer versions of boto?
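To be clear about what I changed: I haven't dug into wal-e itself, so this is only a rough sketch, with hypothetical names, of the kind of conditional that decides whether server-side encryption gets requested — the behaviour I flipped to a blanket False:

```python
# Hedged sketch; the real line is in rootfs/patcher-script.d/patch_wal_e_s3.py
# and may read differently. The idea: server-side encryption is requested
# whenever DATABASE_STORAGE is "s3", and for testing I effectively made this
# return False unconditionally.
import os

def want_server_side_encryption():
    return os.environ.get("DATABASE_STORAGE", "") == "s3"

# boto 2 uploads then carry the encryption request roughly like this:
# key.set_contents_from_filename(path, encrypt_key=want_server_side_encryption())
```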
2018-11-23 21:40:05.977 UTC [1] LOG: database system is ready to accept connections
It's unencrypted, but it works! The code is not mergeable as-is, though; we need a way to detect that we're using something other than AWS, so that KMS is not requested.
I'm afraid of the idea of making a setting like S3_REGION that is effectively saying "I am not using AWS, don't encrypt my database backups", though, because no matter how many times we write "ONLY SET S3_REGION IF YOU ARE NOT USING AWS" in values.yaml, someone is going to set it anyway, and they'd probably prefer to have their backups encrypted, since the backups may contain some sensitive bits.
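One alternative I'd prefer: infer it from the endpoint rather than trusting a flag. A rough sketch, with a hypothetical helper name, that only requests server-side encryption when the endpoint actually looks like AWS:

```python
# Hedged sketch: decide from the endpoint itself whether we are talking to
# real AWS S3, and only then ask for server-side encryption.
import os
from urllib.parse import urlparse

def endpoint_is_aws(endpoint):
    if not endpoint:
        return True   # no custom endpoint configured -> plain AWS S3
    url = endpoint if "://" in endpoint else "https://" + endpoint
    host = urlparse(url).hostname or ""
    return host.endswith(".amazonaws.com")

# Request SSE only on AWS; DigitalOcean Spaces, Minio, etc. get plain uploads.
request_sse = endpoint_is_aws(os.environ.get("S3_ENDPOINT", ""))
```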
@yebyen @kingdonb #5
The ca-certificates package has been added to postgres:11-alpine.
part of https://github.com/teamhephy/workflow/issues/52