autopilotpattern / mysql

Implementation of the autopilot pattern for MySQL
Mozilla Public License 2.0

Manta client doesn't respect MANTA_TLS_INSECURE #57

Closed neuroserve closed 8 years ago

neuroserve commented 8 years ago

It seems that there's a problem uploading xtrabackup files to a local Manta in my lab. As I don't yet have an official SSL cert for Manta, I added MANTA_TLS_INSECURE=1 to my Manta environment, and the variable is passed into the container as well. But "mantash" does not work inside the container until I create ~/.ssh/key and ~/.ssh/key.pub.

This is the error from "docker logs":

2016/09/21 18:24:19 160921 18:24:19 completed OK!
2016/09/21 18:24:19 INFO manage snapshot completed, uploading to object store
[...]
2016/09/21 18:24:20 httplib2.SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
2016/09/21 18:24:20     2016/09/21 18:24:20 [ERR] http: Request PUT /v1/agent/check/pass/mysql-1452d7b4bb3a?note=ok, error: CheckID does not have associated TTL from=127.0.0.1:56359
2016/09/21 18:24:20 Unexpected response code: 500 (CheckID does not have associated TTL)
2016/09/21 18:24:20     2016/09/21 18:24:20 [INFO] agent: Synced service 'mysql-1452d7b4bb3a'
2016/09/21 18:24:20     2016/09/21 18:24:20 [INFO] agent: Synced check 'mysql-1452d7b4bb3a'
2016/09/21 18:24:20 ERROR manage Replica is not replicating.
2016/09/21 18:24:20     2016/09/21 18:24:20 [INFO] agent: Synced check 'mysql-1452d7b4bb3a'
2016/09/21 18:24:24 ERROR manage Replica is not replicating.
2016/09/21 18:24:29 ERROR manage Replica is not replicating.
2016/09/21 18:24:34 ERROR manage Replica is not replicating.

The backup.tar is created but it cannot be uploaded to Manta.

Is there a chance to get it to work with a self-signed cert (if that's the reason for the error)? Or do I simply need an official one?

Thx.

tgross commented 8 years ago

When you say mantash isn't working inside the container are you getting the error described here for python-manta? https://github.com/joyent/python-manta#x509-certificate-routinesx509_load_cert_crl_file-error.

I'll be honest and say I've never tested this with a self-signed cert so this might be an intentional limitation of the python-manta library. But I don't think so -- I dug into the source a bit and it looks like we're passing the disable_ssl_certificate_validation parameter to httplib2 from the MANTA_TLS_INSECURE variable correctly so I'm not sure where it's getting dropped. Can you provide the exact command line, environment vars, and messages you're getting back from mantash? That way we can see if it's something we need to fix here or whether it's something we can help @trentm (the main python-manta maintainer) with.


Somewhat unrelated:

2016/09/21 18:24:20 httplib2.SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)

Is this the only error? I feel like we should be louder about this.

neuroserve commented 8 years ago

Nope. It's not the "x509 certificate routines:X509_load_cert_crl_file error" error. It is:

root@86c2ccad4c59:/# env |grep -i manta
MANTA_USER=myuser
MANTA_KEY_ID=c7:6d:b4:e9:f9:33:44:7d:cb:6e:58:41:b6:b6:7f:c7
MANTA_ROLE=
MANTA_BUCKET=/myuser/stor/triton-mysql
MANTA_URL=https://10.64.243.97
MANTA_SUBUSER=
MANTA_PRIVATE_KEY=-----BEGIN RSA PRIVATE KEY-----#<keydata>#-----END RSA PRIVATE KEY-----
MANTA_TLS_INSECURE=1
root@86c2ccad4c59:/# mantash ls
mantash: ERROR: could not find key info for signing: no ssh-agent key with fingerprint "c7:6d:b4:e9:f9:33:44:7d:cb:6e:58:41:b6:b6:7f:c7"; no '~/.ssh/*.pub' key found with fingerprint 'c7:6d:b4:e9:f9:33:44:7d:cb:6e:58:41:b6:b6:7f:c7'

As soon as I paste $MANTA_PRIVATE_KEY into .ssh/key and paste my ssh-pub key into .ssh/key.pub "mantash ls" works without a problem (MANTA_TLS_INSECURE=1 is set). As soon as I unset MANTA_TLS_INSECURE, mantash stops working.

Obviously mantash is reading my MANTA_KEY_ID from the environment and it's looking for key and key.pub in the .ssh-directory (which are obviously not there).
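For context, the value in MANTA_KEY_ID has the form of an MD5 fingerprint of the decoded public-key blob, which is presumably how a '~/.ssh/*.pub' file gets matched against it. A minimal sketch, assuming an OpenSSH-format public key line; `md5_fingerprint` is a hypothetical helper for illustration and not part of python-manta:

```python
import base64
import hashlib


def md5_fingerprint(pubkey_line):
    """Return the colon-separated MD5 fingerprint of an OpenSSH public key line.

    Hypothetical helper; expects "<type> <base64-blob> [comment]".
    """
    blob = base64.b64decode(pubkey_line.split()[1])
    digest = hashlib.md5(blob).hexdigest()
    # Group the 32 hex characters into 16 colon-separated pairs.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

Running this over the contents of a .pub file and comparing the result with MANTA_KEY_ID is one way to check that the right key was copied into the container.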

Here's the complete error message from "docker logs 86c2ccad4c59":

2016/09/21 18:34:30 INFO manage snapshot completed, uploading to object store
2016/09/21 18:34:31 Traceback (most recent call last):
2016/09/21 18:34:31   File "/usr/local/bin/manage.py", line 486, in <module>
2016/09/21 18:34:31     main()
2016/09/21 18:34:31   File "/usr/local/bin/manage.py", line 483, in main
2016/09/21 18:34:31     cmd(node)
2016/09/21 18:34:31   File "/usr/local/bin/manager/utils.py", line 64, in wrapper
2016/09/21 18:34:31     out = apply(fn, args, kwargs)
2016/09/21 18:34:31   File "/usr/local/bin/manage.py", line 131, in health
2016/09/21 18:34:31     assert_initialized_for_state(node)
2016/09/21 18:34:31   File "/usr/local/bin/manager/utils.py", line 64, in wrapper
2016/09/21 18:34:31     out = apply(fn, args, kwargs)
2016/09/21 18:34:31   File "/usr/local/bin/manage.py", line 381, in assert_initialized_for_state
2016/09/21 18:34:31     if not run_as_primary(node):
2016/09/21 18:34:31   File "/usr/local/bin/manager/utils.py", line 64, in wrapper
2016/09/21 18:34:31     out = apply(fn, args, kwargs)
2016/09/21 18:34:31   File "/usr/local/bin/manage.py", line 442, in run_as_primary
2016/09/21 18:34:31     write_snapshot(node)
2016/09/21 18:34:31   File "/usr/local/bin/manager/utils.py", line 64, in wrapper
2016/09/21 18:34:31     out = apply(fn, args, kwargs)
2016/09/21 18:34:31   File "/usr/local/bin/manage.py", line 289, in write_snapshot
2016/09/21 18:34:31     create_snapshot(node)
2016/09/21 18:34:31   File "/usr/local/bin/manager/utils.py", line 64, in wrapper
2016/09/21 18:34:31     out = apply(fn, args, kwargs)
2016/09/21 18:34:31   File "/usr/local/bin/manage.py", line 320, in create_snapshot
2016/09/21 18:34:31     node.manta.put_backup(backup_id, '/tmp/backup.tar')
2016/09/21 18:34:31   File "/usr/local/bin/manager/libmanta.py", line 54, in put_backup
2016/09/21 18:34:31     self.client.put_object(mpath, file=f)
2016/09/21 18:34:31   File "/usr/local/lib/python2.7/dist-packages/manta/client.py", line 351, in put_object
2016/09/21 18:34:31     headers=headers)
2016/09/21 18:34:31   File "/usr/local/lib/python2.7/dist-packages/manta/client.py", line 210, in _request
2016/09/21 18:34:31     return http.request(url, method, ubody, headers)
2016/09/21 18:34:31   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1570, in request
2016/09/21 18:34:31     (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
2016/09/21 18:34:31   File "/usr/local/lib/python2.7/dist-packages/manta/client.py", line 100, in _request
2016/09/21 18:34:31     res, content = httplib2.Http._request(self, conn, host, absolute_uri, request_uri, method, body, headers, redirections, cachekey)
2016/09/21 18:34:31   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1317, in _request
2016/09/21 18:34:31     (response, content) = self._conn_request(conn, request_uri, method, body, headers)
2016/09/21 18:34:31   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1252, in _conn_request
2016/09/21 18:34:31     conn.connect()
2016/09/21 18:34:31   File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1044, in connect
2016/09/21 18:34:31     raise SSLHandshakeError(e)
2016/09/21 18:34:31 httplib2.SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)
2016/09/21 18:34:31     2016/09/21 18:34:31 [ERR] http: Request PUT /v1/agent/check/pass/mysql-86c2ccad4c59?note=ok, error: CheckID does not have associated TTL from=127.0.0.1:35245
2016/09/21 18:34:31 Unexpected response code: 500 (CheckID does not have associated TTL)
Service not registered, registering...

tgross commented 8 years ago

Ah, I think I misunderstood the original problem. The Manta API has mutual authentication. There are two components here:

So you're receiving the SSL handshake error when MANTA_TLS_INSECURE is unset because you don't have a CA-signed certificate, as you'd expect. But when you try to write to Manta without the SSH key you can't, because Manta doesn't know who you are.
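The server-certificate half of this can be illustrated with Python's standard `ssl` module. This is only a sketch of what an "insecure" flag typically toggles, not how python-manta or httplib2 implement it; `make_tls_context` is a hypothetical name:

```python
import ssl


def make_tls_context(insecure=False):
    """Build a client TLS context; skip server verification only if insecure."""
    ctx = ssl.create_default_context()  # verifies the server cert by default
    if insecure:
        # Order matters: hostname checking must be off before CERT_NONE is set.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

Neither setting affects the request signing done with the SSH key; that is a separate check made on the server side.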

neuroserve commented 8 years ago

But what do we learn from that? When manage.py tries to upload the MySQL backup to Manta, it throws "SSL: CERTIFICATE_VERIFY_FAILED" (although MANTA_TLS_INSECURE=1 is set). In the environment we have MANTA_KEY_ID, MANTA_PRIVATE_KEY and MANTA_USER. When using Manta with the commands from the node.js world (like "mls", etc.) I don't even have to provide MANTA_PRIVATE_KEY. What am I missing? How am I supposed to transfer the required SSH keys into the container (or am I)?

tgross commented 8 years ago

The keys are transferred from the environment directly into the python-manta client library by the manage.py application (when we instantiate the client here). Mantash doesn't come into this at all unless you're using it for debugging, in which case yes you need to provide it with the key directly.
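The MANTA_PRIVATE_KEY value shown earlier in this thread is a PEM key flattened onto one line with '#' standing in for newlines. A minimal sketch of that round trip, assuming the key body never contains '#' (base64 does not use it); `munge_key` and `demunge_key` are hypothetical names for illustration:

```python
def munge_key(pem_text):
    """Flatten a PEM key to one line by replacing newlines with '#'."""
    return pem_text.strip().replace("\n", "#")


def demunge_key(env_value):
    """Restore the multi-line PEM text from the single-line env value."""
    return env_value.replace("#", "\n")
```

The flattened form is convenient for env files that cannot hold multi-line values.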

But now that I've pointed that little tidbit out, I realize we're not passing through the command line interface, which is probably what parses MANTA_TLS_INSECURE, and sure enough that's exactly the problem. The constructor for our Manta client wrapper should look like:

class Manta(object):
    """
    The Manta class wraps access to the Manta object store, where we'll put
    our MySQL backups.
    """
    def __init__(self, envs=os.environ):
        self.account = env('MANTA_USER', None, envs)
        self.user = env('MANTA_SUBUSER', None, envs)
        self.role = env('MANTA_ROLE', None, envs)
        self.key_id = env('MANTA_KEY_ID', None, envs)
        self.url = env('MANTA_URL', 'https://us-east.manta.joyent.com', envs)
        self.bucket = env('MANTA_BUCKET', '/{}/stor'.format(self.account), envs)
        # pass `envs` like the other lookups so an injected environment is honored
        is_tls = env('MANTA_TLS_INSECURE', False, envs)

        # we don't want to use `env` here because we have a different
        # de-munging to do
        self.private_key = envs.get('MANTA_PRIVATE_KEY', '').replace('#', '\n')
        self.signer = pymanta.PrivateKeySigner(self.key_id, self.private_key)
        self.client = pymanta.MantaClient(self.url,
                                          self.account,
                                          subuser=self.user,
                                          role=self.role,
                                          disable_ssl_certificate_validation=is_tls,
                                          signer=self.signer)

This is a pretty quick change. If you want to put up a PR for it I'd be happy to merge it, or I can try and hit it in the next day or so.
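One thing to watch in the constructor above: environment values arrive as strings, so a value like MANTA_TLS_INSECURE=0 would still be truthy if it were passed along unconverted. A minimal sketch of a converter, assuming the project's `env` helper does no conversion of its own (not verified here); `to_flag` is a hypothetical name:

```python
def to_flag(value):
    """Interpret an environment value as a boolean flag."""
    if isinstance(value, bool):
        return value
    if value is None:
        return False
    # Only a small set of explicit "on" spellings count as true.
    return str(value).strip().lower() in ("1", "true", "yes", "on")
```

With something like this, the result of the MANTA_TLS_INSECURE lookup could be wrapped in `to_flag(...)` before being handed to `disable_ssl_certificate_validation`.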

neuroserve commented 8 years ago

I'm not sure whether my GitHub and/or Python skills are ready for a PR. I'll have a look over the weekend.

tgross commented 8 years ago

I took a crack at that in https://github.com/autopilotpattern/mysql/issues/57

neuroserve commented 8 years ago

I've checked out the patch and was able to build the container, but I was not able to start it (and I have no logs to see what happened). Now I see you merged the patch already (libmanta.py has the changes), but the container still has the old version, right?

tgross commented 8 years ago

Right, that hasn't been released yet. I'm trying to figure out why our 3rd-party test runner isn't working quite right (#64) and then I'll cut a new release.

tgross commented 8 years ago

Released in https://github.com/autopilotpattern/mysql/releases/tag/5.6r3.1.0

tgross commented 8 years ago

@neuroserve autopilotpattern/mysql:latest or autopilotpattern/mysql:5.6r3.1.0 images will have this fix.

neuroserve commented 8 years ago

And it works. Backups are now uploaded to Manta.