ClusterHQ / flocker

Container data volume manager for your Dockerized application
https://clusterhq.com
Apache License 2.0

Unit flocker-dataset-agent.service entered failed state. #2885

Closed akamalov closed 8 years ago

akamalov commented 8 years ago

Greetings,

I am having a problem starting the flocker-dataset-agent service. Brief background: the environment consists of RHEL 7.2 hosts attempting to access a PureStorage backend via iSCSI. The PureStorage drivers have been installed and tested (details further below). First, the environment details:

Operating System:

NAME="Red Hat Enterprise Linux Server"
VERSION="7.2 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.2"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.2 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.2:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.2
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.2"

Flocker environment:

clusterhq-python-flocker-1.13.0-1.x86_64
clusterhq-flocker-docker-plugin-1.13.0-1.noarch
clusterhq-flocker-node-1.13.0-1.noarch

/etc/flocker/agent.yml

"version": 1
"control-service": 
  "hostname": 192.168.120.156 
  "port": 4523
"dataset": 
    "backend": "purestorage_flasharray_flocker_driver"
    "pure_ip": 192.168.128.157 
    "pure_api_token": "50cac451-fefe-b635-9692-5870aada9c49"
    "pure_chap_host_user": "server15"
    "pure_chap_host_password": "XXXXXX"

PureStorage Driver:

PureStorage Driver GitHub link - https://github.com/PureStorage-OpenConnect/purestorage-flocker-driver

Restart Flocker services, display current status:

##########################
● flocker-container-agent.service - Flocker Container Agent
   Loaded: loaded (/usr/lib/systemd/system/flocker-container-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2016-08-15 08:58:04 EDT; 5min ago
 Main PID: 29760 (flocker-contain)
   Memory: 63.4M
   CGroup: /system.slice/flocker-container-agent.service
           └─29760 /opt/flocker/bin/python /usr/sbin/flocker-container-agent --journald

Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"task_uuid": "f1cb436f-5845-4152-b500-3bc59c636f4b", "error": false, "timestamp": 1471266234.010053, "message": "AgentAMP connection established (HOST:IPv4Address(TCP, '192.168.120.165', 52992) PEER:IPv4Address(TCP, '192.168.120.156', 4523))", "message_type": "twisted:log", "task_level": [1]}
Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"fsm_identifier": "<flocker.node._loop.ClusterStatus object at 0x45f4110>", "fsm_input": "<ClusterStatusInputs=CONNECTED_TO_CONTROL_SERVICE>", "timestamp": 1471266234.01245, "fsm_rich_input": "<_ConnectedToControlService>", "action_status": "started", "task_uuid": "789f2609-d355-49d7-9e76-4cfb1db67acc", "action_type": "fsm:transition", "fsm_state": "<ClusterStatusStates=DISCONNECTED>", "task_level": [1]}
Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"fsm_next_state": "<ClusterStatusStates=IGNORANT>", "task_level": [2], "action_type": "fsm:transition", "timestamp": 1471266234.013306, "fsm_output": ["<ClusterStatusOutputs=STORE_CLIENT>"], "task_uuid": "789f2609-d355-49d7-9e76-4cfb1db67acc", "action_status": "succeeded"}
Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"task_uuid": "d11cc085-6f78-4523-9246-2149a23ae968", "error": false, "timestamp": 1471266234.116931, "message": "AgentAMP connection lost (HOST:IPv4Address(TCP, '192.168.120.165', 52992) PEER:IPv4Address(TCP, '192.168.120.156', 4523))", "message_type": "twisted:log", "task_level": [1]}
Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"exception": "OpenSSL.SSL.Error", "reason": "[('SSL routines', 'SSL3_READ_BYTES', 'sslv3 alert certificate unknown'), ('SSL routines', 'SSL3_WRITE_BYTES', 'ssl handshake failure')]", "timestamp": 1471266234.118531, "traceback": "Traceback: <class 'OpenSSL.SSL.Error'>: [('SSL routines', 'SSL3_READ_BYTES', 'sslv3 alert certificate unknown'), ('SSL routines', 'SSL3_WRITE_BYTES', 'ssl handshake failure')]\n/opt/flocker/lib/python2.7/site-packages/twisted/internet/posixbase.py:597:_doReadOrWrite\n/opt/flocker/lib/python2.7/site-packages/twisted/internet/tcp.py:209:doRead\n/opt/flocker/lib/python2.7/site-packages/twisted/internet/tcp.py:215:_dataReceived\n/opt/flocker/lib/python2.7/site-packages/twisted/protocols/tls.py:421:dataReceived\n--- <exception caught here> ---\n/opt/flocker/lib/python2.7/site-packages/twisted/protocols/tls.py:569:_write\n/opt/flocker/lib/python2.7/site-packages/OpenSSL/SSL.py:1271:send\n/opt/flocker/lib/python2.7/site-packages/OpenSSL/SSL.py:1191:_raise_ssl_error\n/opt/flocker/lib/python2.7/site-packages/OpenSSL/_util.py:48:exception_from_error_queue\n", "message_type": "eliot:traceback", "task_uuid": "a49e6ac7-0aa4-43b4-a129-111b66af8f31", "task_level": [1]}
Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"fsm_identifier": "<flocker.node._loop.ClusterStatus object at 0x45f4110>", "fsm_input": "<ClusterStatusInputs=DISCONNECTED_FROM_CONTROL_SERVICE>", "timestamp": 1471266234.12051, "fsm_rich_input": null, "action_status": "started", "task_uuid": "b670b5d0-86d3-4899-9f09-89bc2a8b0fd0", "action_type": "fsm:transition", "fsm_state": "<ClusterStatusStates=IGNORANT>", "task_level": [1]}
Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"fsm_next_state": "<ClusterStatusStates=DISCONNECTED>", "task_level": [2], "action_type": "fsm:transition", "timestamp": 1471266234.121409, "fsm_output": [], "task_uuid": "b670b5d0-86d3-4899-9f09-89bc2a8b0fd0", "action_status": "succeeded"}
Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"task_uuid": "6793744e-5b23-434b-a18e-2752789c0acc", "error": false, "timestamp": 1471266234.122539, "message": "<twisted.internet.tcp.Connector instance at 0x45ef7e8> will retry in 2 seconds", "message_type": "twisted:log", "task_level": [1]}
Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"task_uuid": "31b1fe22-151f-419c-9b2f-7dce6c19772f", "error": false, "timestamp": 1471266234.123821, "message": "Stopping factory <twisted.internet.protocol.ReconnectingClientFactory instance at 0x45f6518>", "message_type": "twisted:log", "task_level": [1]}
Aug 15 09:03:54 server15 flocker-container-agent[29760]: {"task_uuid": "6d217799-8e1f-4b2f-8319-5ce55e26b5f0", "error": true, "timestamp": 1471266234.176803, "message": "Unhandled Error\nTraceback (most recent call last):\n  File \"/opt/flocker/lib/python2.7/site-packages/flocker/common/script.py\", line 295, in main\n    self._react(run_and_log, [], _reactor=self._reactor)\n  File \"/opt/flocker/lib/python2.7/site-packages/twisted/internet/task.py\", line 936, in react\n    _reactor.run()\n  File \"/opt/flocker/lib/python2.7/site-packages/twisted/internet/base.py\", line 1194, in run\n    self.mainLoop()\n  File \"/opt/flocker/lib/python2.7/site-packages/twisted/internet/base.py\", line 1203, in mainLoop\n    self.runUntilCurrent()\n--- <exception caught here> ---\n  File \"/opt/flocker/lib/python2.7/site-packages/twisted/internet/base.py\", line 825, in runUntilCurrent\n    call.func(*call.args, **call.kw)\n  File \"/opt/flocker/lib/python2.7/site-packages/flocker/control/_protocol.py\", line 455, in <lambda>\n    lambda: protocol.transport.abortConnection())\nexceptions.AttributeError: 'NoneType' object has no attribute 'abortConnection'\n", "message_type": "twisted:log", "task_level": [1]}

##############################

● flocker-dataset-agent.service - Flocker Dataset Agent
   Loaded: loaded (/usr/lib/systemd/system/flocker-dataset-agent.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Mon 2016-08-15 08:58:11 EDT; 5min ago
  Process: 29869 ExecStart=/usr/sbin/flocker-dataset-agent --journald (code=exited, status=1/FAILURE)
 Main PID: 29869 (code=exited, status=1/FAILURE)

Aug 15 08:58:11 server15 systemd[1]: Unit flocker-dataset-agent.service entered failed state.
Aug 15 08:58:11 server15 systemd[1]: flocker-dataset-agent.service failed.
Aug 15 08:58:11 server15 systemd[1]: flocker-dataset-agent.service holdoff time over, scheduling restart.
Aug 15 08:58:11 server15 systemd[1]: start request repeated too quickly for flocker-dataset-agent.service
Aug 15 08:58:11 server15 systemd[1]: Failed to start Flocker Dataset Agent.
Aug 15 08:58:11 server15 systemd[1]: Unit flocker-dataset-agent.service entered failed state.
Aug 15 08:58:11 server15 systemd[1]: flocker-dataset-agent.service failed.

##############################

● flocker-docker-plugin.service - Flocker Docker Plugin
   Loaded: loaded (/usr/lib/systemd/system/flocker-docker-plugin.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2016-08-15 08:58:04 EDT; 5min ago
 Main PID: 29781 (flocker-docker-)
   Memory: 62.4M
   CGroup: /system.slice/flocker-docker-plugin.service
           └─29781 /opt/flocker/bin/python /usr/sbin/flocker-docker-plugin --journald

Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"exception": "flocker.apiclient._client.NotFound", "task_level": [1592, 4], "action_type": "flocker:apiclient:http_request", "reason": "{\"description\": \"No node found with given era.\"}", "timestamp": 1471266234.313487, "task_uuid": "2dd0b28b-cdc9-434a-9dff-a592c304b836", "action_status": "failed"}
Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"task_uuid": "c731df94-e1a3-4d88-a911-c3c1d85d971e", "error": false, "timestamp": 1471266234.316437, "message": "Stopping factory <twisted.web.client._HTTP11ClientFactory instance at 0x5b134d0>", "message_type": "twisted:log", "task_level": [1]}
Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"request_body": null, "url": "https://mmaster1:4523/v1/state/nodes/by_era/61181a3e-978a-4e8c-a2b1-7f5e4ae1bf54", "timestamp": 1471266234.415257, "action_status": "started", "task_uuid": "2dd0b28b-cdc9-434a-9dff-a592c304b836", "action_type": "flocker:apiclient:http_request", "method": "GET", "task_level": [1593, 1]}
Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"task_uuid": "2dd0b28b-cdc9-434a-9dff-a592c304b836", "error": false, "timestamp": 1471266234.428725, "message": "Starting factory <twisted.web.client._HTTP11ClientFactory instance at 0x5a2e128>", "message_type": "twisted:log", "task_level": [1593, 3]}
Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"exception": "flocker.apiclient._client.NotFound", "task_level": [1593, 4], "action_type": "flocker:apiclient:http_request", "reason": "{\"description\": \"No node found with given era.\"}", "timestamp": 1471266234.531891, "task_uuid": "2dd0b28b-cdc9-434a-9dff-a592c304b836", "action_status": "failed"}
Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"task_uuid": "37ffc3fe-c4ee-4bf0-b30a-ef0b1a900942", "error": false, "timestamp": 1471266234.534984, "message": "Stopping factory <twisted.web.client._HTTP11ClientFactory instance at 0x5a2e128>", "message_type": "twisted:log", "task_level": [1]}
Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"request_body": null, "url": "https://mmaster1:4523/v1/state/nodes/by_era/61181a3e-978a-4e8c-a2b1-7f5e4ae1bf54", "timestamp": 1471266234.633406, "action_status": "started", "task_uuid": "2dd0b28b-cdc9-434a-9dff-a592c304b836", "action_type": "flocker:apiclient:http_request", "method": "GET", "task_level": [1594, 1]}
Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"task_uuid": "2dd0b28b-cdc9-434a-9dff-a592c304b836", "error": false, "timestamp": 1471266234.644697, "message": "Starting factory <twisted.web.client._HTTP11ClientFactory instance at 0x5b114d0>", "message_type": "twisted:log", "task_level": [1594, 3]}
Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"exception": "flocker.apiclient._client.NotFound", "task_level": [1594, 4], "action_type": "flocker:apiclient:http_request", "reason": "{\"description\": \"No node found with given era.\"}", "timestamp": 1471266234.732458, "task_uuid": "2dd0b28b-cdc9-434a-9dff-a592c304b836", "action_status": "failed"}
Aug 15 09:03:54 server15 flocker-docker-plugin[29781]: {"task_uuid": "4dceda0b-7b13-47d3-b17f-1a7fbeed1a50", "error": false, "timestamp": 1471266234.733952, "message": "Stopping factory <twisted.web.client._HTTP11ClientFactory instance at 0x5b114d0>", "message_type": "twisted:log", "task_level": [1]}

Trying to get more information on flocker-dataset-agent service:

[root@server1 ~]# journalctl -u flocker-dataset-agent.service -l -r
-- Logs begin at Sat 2016-08-13 23:16:11 EDT, end at Mon 2016-08-15 09:06:33 EDT. --
Aug 15 08:58:11 server1 systemd[1]: flocker-dataset-agent.service failed.
Aug 15 08:58:11 server1 systemd[1]: Unit flocker-dataset-agent.service entered failed state.
Aug 15 08:58:11 server1 systemd[1]: Failed to start Flocker Dataset Agent.
Aug 15 08:58:11 server1 systemd[1]: start request repeated too quickly for flocker-dataset-agent.service
Aug 15 08:58:11 server1 systemd[1]: flocker-dataset-agent.service holdoff time over, scheduling restart.
Aug 15 08:58:11 server1 systemd[1]: flocker-dataset-agent.service failed.
Aug 15 08:58:11 server1 systemd[1]: Unit flocker-dataset-agent.service entered failed state.
Aug 15 08:58:11 server1 systemd[1]: flocker-dataset-agent.service: main process exited, code=exited, status=1/FAILURE
Aug 15 08:58:11 server1 flocker-dataset-agent[29869]: {"task_uuid": "d10519ee-b417-4ccd-a90c-7c780a4d5b14", "error": false, "timestamp": 1471265891.600116, "message": "Main loop terminated.", "message_type": "twisted:log", "task_level": [1]}
Aug 15 08:58:11 server1 flocker-dataset-agent[29869]: {"task_uuid": "8a2f9dff-07d1-4aab-b8c4-717536adf44e", "error": true, "timestamp": 1471265891.599722, "message": "main function encountered error\nTraceback (most recent call last):\n  File \"/opt/floc
Aug 15 08:58:11 server1 flocker-dataset-agent[29869]: {"task_uuid": "f7b0c1cf-eefd-4007-8742-156177a62b0a", "error": true, "timestamp": 1471265891.598659, "message": "Unhandled Error\nTraceback (most recent call last):\n  File \"/opt/flocker/lib/python2.
Aug 15 08:58:11 server1 flocker-dataset-agent[29869]: {"task_uuid": "1a59b3b6-386d-4c61-a938-ba687b38f8a0", "error": false, "timestamp": 1471265891.587703, "message": "Log opened.", "message_type": "twisted:log", "task_level": [1]}
Aug 15 08:58:10 server1 systemd[1]: Starting Flocker Dataset Agent...
Aug 15 08:58:10 server1 systemd[1]: Started Flocker Dataset Agent.
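
The two error entries above are cut off mid-traceback, so the actual exception is not visible. Since each agent log line carries a JSON payload after the journald prefix, a small filter can pull out the full error messages. This is a minimal sketch, not part of Flocker: the function name `extract_errors` is hypothetical, and it only looks at entries with `"error": true` (the `eliot:traceback` entries seen earlier use different fields and are not covered).

```python
import json


def extract_errors(journal_lines):
    """Return the "message" field of every JSON log entry flagged as an error."""
    messages = []
    for line in journal_lines:
        # The JSON payload starts at the first "{" after the journald prefix.
        start = line.find("{")
        if start == -1:
            continue  # plain systemd line, no JSON payload
        try:
            entry = json.loads(line[start:])
        except ValueError:
            continue  # truncated or non-JSON line, skip it
        if entry.get("error") is True:
            messages.append(entry.get("message", ""))
    return messages
```

To feed it untruncated input, the journal can be read without the pager, for example `journalctl -u flocker-dataset-agent.service --no-pager`, and the resulting lines passed to `extract_errors`.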

Tested PureStorage driver manually:

[root@server1 purestorage-flocker-driver]# python
Python 3.4.2 (default, Apr 19 2016, 08:30:31) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import purestorage
>>> fa=purestorage.FlashArray('192.168.128.157',api_token='50cac451-fefe-b635-9692-5870aada9c49')
/opt/mesosphere/packages/python--e3169ded66609d3cb4055a3f9f8f0b1113a557a6/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:821: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
/opt/mesosphere/packages/python--e3169ded66609d3cb4055a3f9f8f0b1113a557a6/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:821: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
>>> fa.get()
/opt/mesosphere/packages/python--e3169ded66609d3cb4055a3f9f8f0b1113a557a6/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:821: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
{'revision': '201605132143+b186ed4', 'id': '7e4d75d0-ffc1-4e7c-880a-c044849e1993', 'version': '4.6.8', 'array_name': 'PURTPC0027'}
>>> 

As you can see above, connectivity is there. We were able to connect, authenticate using a token, and retrieve basic PureStorage info (version, name, etc.).

Verified the sanity of /etc/flocker/node.crt:

[root@server1 ~]# openssl x509 -text -in /etc/flocker/node.crt 
Certificate:
    Data:
        Version: 1 (0x0)
        Serial Number:
            fb:69:42:2e:a2:2e:52:1d:28:b9:59:d7:1d:06:21:13
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: OU=06dc4747-5c8d-4878-8943-b7d34cbf08c4, CN=flocker
        Validity
            Not Before: Jun 29 15:15:36 2016 GMT
            Not After : Jun 24 15:15:36 2036 GMT
        Subject: OU=06dc4747-5c8d-4878-8943-b7d34cbf08c4, CN=node-f810470c-7140-4017-b61d-4d9d326e4a3f
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (4096 bit)
                Modulus:
                    00:98:f7:e4:d0:4d:84:a8:43:a7:97:f4:95:09:19:
                    6f:d0:71:0b:76:56:3e:ca:14:86:3b:6f:03:be:ad:
                    4a:fb:1a:54:5c:51:e7:80:67:3d:e1:1a:92:8a:f5:
                    13:87:b1:e5:bd:16:72:de:53:85:31:f7:d3:c5:63:
                    dd:51:90:bb:03:68:b4:9a:ed:bc:82:44:18:6c:4a:
                    9d:21:32:c0:d1:bf:51:84:81:d4:ba:d6:ee:8d:6a:
                    f2:07:f6:8c:ac:87:61:6e:92:8e:2f:62:6a:5a:18:
                    38:ed:40:14:7f:87:e7:f0:cf:af:af:a7:94:54:b5:
                    83:c8:4c:38:6f:72:27:13:ee:50:53:1d:f2:04:0d:
                    70:d3:7f:a0:06:53:19:b9:dd:d7:85:4b:56:18:c7:
                    f9:29:2a:23:7e:0a:fb:9f:8c:f0:4a:b9:fe:c8:1f:
                    c6:97:3c:d9:a4:19:3a:5b:95:4b:87:d8:2a:71:55:
                    01:92:f3:ed:85:b9:12:8b:b1:4c:ab:14:51:1f:48:
                    47:bf:b2:36:ca:fd:4a:64:12:2f:c5:eb:bb:49:9f:
                    12:03:84:d4:02:dc:90:13:6a:c7:f8:7e:ff:03:2c:
                    16:c8:06:3d:a0:6a:59:e2:00:7b:1a:07:9f:9a:b8:
                    12:95:5a:a2:28:ce:84:2d:d7:e8:03:ae:c6:9b:5c:
                    60:c4:2c:c6:8b:31:7c:e7:02:aa:9d:85:1e:f5:b0:
                    64:de:d9:78:29:f9:35:d2:db:64:28:01:6a:c4:7e:
                    21:63:7d:09:0d:44:50:02:7c:23:ff:32:23:4e:5d:
                    51:be:28:05:b2:8c:7c:26:d0:01:9e:4f:ca:bc:69:
                    c1:a0:31:43:75:1a:f8:1e:7b:b7:f8:c0:88:2d:9b:
                    2f:09:00:8f:9e:95:42:39:a6:21:3e:cf:f8:0a:27:
                    f9:85:0d:fd:9a:14:3c:6e:2e:35:57:bc:b5:fa:ec:
                    06:04:c3:cb:5a:e4:a1:cf:ca:05:d8:67:5a:70:27:
                    39:0a:6d:9a:70:67:6a:a7:50:a2:25:01:d2:85:d5:
                    40:ce:12:29:75:d2:98:c1:88:14:56:1e:6a:a6:26:
                    f8:a2:3a:cc:36:57:e1:75:d5:00:48:cf:a4:bb:16:
                    41:59:a6:39:79:0d:52:63:cf:9f:6a:cc:bf:fd:90:
                    56:1c:de:eb:55:e8:a4:ed:62:4d:af:08:97:2b:a9:
                    cc:14:53:02:51:84:8a:bb:c7:8b:cb:c9:32:b2:cc:
                    7c:5d:8a:83:0c:97:06:28:bf:9f:c7:62:97:7b:e1:
                    c5:9f:24:a6:c8:73:f4:a5:34:2a:3b:7c:a2:b1:6b:
                    af:15:5a:99:72:27:19:49:b8:85:15:c8:39:48:72:
                    ba:05:1b
                Exponent: 65537 (0x10001)
    Signature Algorithm: sha256WithRSAEncryption
         9d:bc:37:b9:43:cd:89:ac:ad:50:c0:4e:83:03:26:0e:ce:42:
         ed:1f:1f:d5:3d:06:63:b3:6c:d4:d0:7d:f8:26:99:32:11:4f:
         ce:27:9f:ac:80:06:90:71:72:ef:7e:5f:a8:0a:86:ed:4d:65:
         d5:4f:55:5a:26:eb:94:27:0b:23:62:39:4f:0a:b3:07:a4:c8:
         dd:7c:b6:27:77:62:8c:f7:28:61:c0:56:bc:9c:b2:5e:08:36:
         78:52:34:a5:9f:25:54:e0:0d:0c:76:5b:27:a9:fc:5c:72:82:
         6d:82:65:11:fd:43:fa:29:d5:be:85:37:37:1b:4e:87:db:f5:
         d0:5e:92:48:ba:c0:ef:8f:b1:d4:04:11:00:53:4d:b4:e8:74:
         41:2e:6a:49:8a:3b:6a:b9:74:98:96:1a:74:f6:2f:de:5c:7c:
         98:4b:7a:3a:1a:55:98:99:23:d8:8e:c7:1f:32:c1:10:4c:4b:
         34:01:9c:bc:44:8b:eb:59:0e:c7:1a:0f:17:81:35:42:38:42:
         72:4a:06:d9:f4:9c:94:9b:66:f9:fb:ae:04:b2:32:cc:3c:d6:
         48:06:5b:c1:83:f6:ea:36:66:73:fc:1b:d0:46:91:8a:bf:12:
         d5:0e:71:6a:e9:18:5e:37:45:97:d5:d3:8d:3d:28:33:ed:83:
         9e:ca:79:d0:b2:b4:98:a7:ab:f9:da:48:b9:98:85:f7:5b:97:
         09:af:08:04:ef:f9:33:fb:31:d1:b5:8f:e4:69:d8:a5:12:53:
         92:cd:4f:17:d1:2d:ce:c7:c3:14:db:ac:dc:6c:4f:4c:79:2b:
         1b:ed:a0:c9:40:29:bd:93:48:3e:2d:96:4b:ab:6c:09:fe:b3:
         08:cd:4e:87:01:72:fb:d4:db:85:ed:5c:60:83:20:c2:c1:0f:
         c4:ce:fb:bd:aa:80:bc:eb:49:7d:30:ee:49:f1:11:7c:6b:8d:
         5c:aa:b0:9f:5d:ad:12:fa:fa:7f:aa:00:88:52:a9:a7:41:fc:
         79:5d:9f:2e:12:81:95:07:f3:64:e5:b7:22:49:95:42:4e:50:
         03:72:c4:ae:11:cf:ae:99:b5:47:5a:2c:a0:09:1e:98:0f:c7:
         a0:8a:64:e3:4b:e0:49:70:db:6b:4e:bd:e3:14:3a:ca:1f:c7:
         59:29:bc:28:41:fd:45:ea:be:fc:ed:c5:2e:ff:4a:57:ec:59:
         42:44:08:33:18:50:fc:78:00:99:a7:50:2a:73:c6:3a:1c:63:
         08:ce:53:45:0d:97:5a:50:08:ed:07:7f:90:35:56:a0:c4:c1:
         5b:28:96:d0:1c:7a:7a:54:b6:98:d6:e4:ed:08:a7:94:91:a7:
         86:e3:6f:36:e7:7f:6b:99
-----BEGIN CERTIFICATE-----
MIIFLDCCAxQCEQD7aUIuoi5SHSi5WdcdBiETMA0GCSqGSIb3DQEBCwUAMEExLTAr
BgNVBAsMJDA2ZGM0NzQ3LTVjOGQtNDg3OC04OTQzLWI3ZDM0Y2JmMDhjNDEQMA4G
A1UEAwwHZmxvY2tlcjAiGA8yMDE2MDYyOTE1MTUzNloYDzIwMzYwNjI0MTUxNTM2
WjBjMS0wKwYDVQQLDCQwNmRjNDc0Ny01YzhkLTQ4NzgtODk0My1iN2QzNGNiZjA4
YzQxMjAwBgNVBAMMKW5vZGUtZjgxMDQ3MGMtNzE0MC00MDE3LWI2MWQtNGQ5ZDMy
NmU0YTNmMIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAmPfk0E2EqEOn
l/SVCRlv0HELdlY+yhSGO28Dvq1K+xpUXFHngGc94RqSivUTh7HlvRZy3lOFMffT
xWPdUZC7A2i0mu28gkQYbEqdITLA0b9RhIHUutbujWryB/aMrIdhbpKOL2JqWhg4
7UAUf4fn8M+vr6eUVLWDyEw4b3InE+5QUx3yBA1w03+gBlMZud3XhUtWGMf5KSoj
fgr7n4zwSrn+yB/GlzzZpBk6W5VLh9gqcVUBkvPthbkSi7FMqxRRH0hHv7I2yv1K
ZBIvxeu7SZ8SA4TUAtyQE2rH+H7/AywWyAY9oGpZ4gB7GgefmrgSlVqiKM6ELdfo
A67Gm1xgxCzGizF85wKqnYUe9bBk3tl4Kfk10ttkKAFqxH4hY30JDURQAnwj/zIj
Tl1RvigFsox8JtABnk/KvGnBoDFDdRr4Hnu3+MCILZsvCQCPnpVCOaYhPs/4Cif5
hQ39mhQ8bi41V7y1+uwGBMPLWuShz8oF2GdacCc5Cm2acGdqp1CiJQHShdVAzhIp
ddKYwYgUVh5qpib4ojrMNlfhddUASM+kuxZBWaY5eQ1SY8+fasy//ZBWHN7rVeik
7WJNrwiXK6nMFFMCUYSKu8eLy8kyssx8XYqDDJcGKL+fx2KXe+HFnySmyHP0pTQq
O3yisWuvFVqZcicZSbiFFcg5SHK6BRsCAwEAATANBgkqhkiG9w0BAQsFAAOCAgEA
nbw3uUPNiaytUMBOgwMmDs5C7R8f1T0GY7Ns1NB9+CaZMhFPziefrIAGkHFy735f
qAqG7U1l1U9VWibrlCcLI2I5TwqzB6TI3Xy2J3dijPcoYcBWvJyyXgg2eFI0pZ8l
VOANDHZbJ6n8XHKCbYJlEf1D+inVvoU3NxtOh9v10F6SSLrA74+x1AQRAFNNtOh0
QS5qSYo7arl0mJYadPYv3lx8mEt6OhpVmJkj2I7HHzLBEExLNAGcvESL61kOxxoP
F4E1QjhCckoG2fSclJtm+fuuBLIyzDzWSAZbwYP26jZmc/wb0EaRir8S1Q5xaukY
XjdFl9XTjT0oM+2Dnsp50LK0mKer+dpIuZiF91uXCa8IBO/5M/sx0bWP5GnYpRJT
ks1PF9EtzsfDFNus3GxPTHkrG+2gyUApvZNIPi2WS6tsCf6zCM1OhwFy+9Tbhe1c
YIMgwsEPxM77vaqAvOtJfTDuSfERfGuNXKqwn12tEvr6f6oAiFKpp0H8eV2fLhKB
lQfzZOW3IkmVQk5QA3LErhHPrpm1R1osoAkemA/HoIpk40vgSXDba0694xQ6yh/H
WSm8KEH9Req+/O3FLv9KV+xZQkQIMxhQ/HgAmadQKnPGOhxjCM5TRQ2XWlAI7Qd/
kDVWoMTBWyiW0Bx6elS2mNbk7QinlJGnhuNvNud/a5k=
-----END CERTIFICATE-----
[root@server1 ~]# 

Tried to run flocker-diagnostics, but it exited:

[root@server1 ~]# flocker-diagnostics 
Traceback (most recent call last):
  File "/usr/sbin/flocker-diagnostics", line 7, in <module>
    from flocker.node.script import flocker_diagnostics_main
  File "/opt/flocker/lib/python2.7/site-packages/flocker/node/__init__.py", line 20, in <module>
    from .script import DeployerType
  File "/opt/flocker/lib/python2.7/site-packages/flocker/node/script.py", line 16, in <module>
    import yaml
  File "/opt/mesosphere/lib/python3.4/site-packages/yaml/__init__.py", line 284
    class YAMLObject(metaclass=YAMLObjectMetaclass):
                              ^
SyntaxError: invalid syntax
[root@server1 ~]# 
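
The traceback above shows Flocker's Python 2.7 code importing `yaml` from a Python 3.4 `site-packages` directory under /opt/mesosphere, which is why the Python 3 `metaclass=` syntax fails to parse. One possible cause, which this thread does not confirm, is an environment setting such as PYTHONPATH leaking the Mesosphere packages into Flocker's interpreter. A minimal sketch for spotting such entries follows; the helper name `foreign_site_paths` is hypothetical.

```python
import re
import sys


def foreign_site_paths(paths, running_version):
    """Return entries of `paths` that name a different pythonX.Y than `running_version`."""
    pattern = re.compile(r"python(\d+\.\d+)")
    foreign = []
    for entry in paths:
        match = pattern.search(entry)
        if match and match.group(1) != running_version:
            foreign.append(entry)
    return foreign


# Report import-path entries that belong to another Python version.
running = "%d.%d" % sys.version_info[:2]
for entry in foreign_site_paths(sys.path, running):
    print("unexpected for Python %s: %s" % (running, entry))
```

Running it with the interpreter Flocker itself uses (/opt/flocker/bin/python in the service status output above) would list any Python 3.4 directories on that interpreter's import path.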

I am trying to get at least one node configured correctly. Once it is configured and visible from the flockerctl command, I will proceed with the rest of the nodes (i.e., using node 1 as a template). Any pointers or suggestions?

Thanks so much!!

wallnerryan commented 8 years ago

@akamalov thanks, strange, testing this out myself today.

wallnerryan commented 8 years ago

@akamalov i got this working without issues.

/usr/local/bin/flockerctl --control-service ip-10-167-144-195 --user plugin list-nodes
SERVER     ADDRESS     
5bf2fa29   10.63.55.92 

   34  mkdir /etc/flocker
   35  cd /etc/flocker/
   36  flocker-ca initialize my-cluster
   37  ls
   38  flocker-ca create-control-certificate ip-10-167-144-195
   39  ls
   40  mv control-ip-10-167-144-195.crt control-service.crt
   41  mv control-ip-10-167-144-195.key control-service.key
   42  ls
   43  chmod 0600 control-service.*
   44  chmod 0700 /etc/flocker
   45  flocker-ca create-node-certificate
   46  ls
   47  mv 5bf2fa29-2cc0-402b-8e4b-981a706d43cf.crt node.crt
   48  mv 5bf2fa29-2cc0-402b-8e4b-981a706d43cf.key node.key
   49  flocker-ca create-api-certificate plugin
   50  ls
   52  cd /etc/flocker/
   55  scp  node.* ec2-user@ip-10-63-55-92:/home/ec2-user/
   56  scp  cluster.crt* ec2-user@ip-10-63-55-92:/home/ec2-user/
   57  scp  plugin.* ec2-user@ip-10-63-55-92:/home/ec2-user/
   58  systemctl enable flocker-control
   59  systemctl start flocker-control
   79  /usr/local/bin/flockerctl --control-service ip-10-167-144-195 --user plugin list-nodes

Let's compare envs.

yum list installed | grep ssl
openssl.x86_64                   1:1.0.1e-42.el7_1.9        @anaconda/7.2       
openssl-libs.x86_64              1:1.0.1e-42.el7_1.9        @anaconda/7.2       
cat /etc/os-release 
NAME="Red Hat Enterprise Linux Server"
VERSION="7.2 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.2"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.2 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.2:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.2
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.2"

akamalov commented 8 years ago

Here you go:

SSL:

openssl.x86_64                    1:1.0.1e-51.el7_2.5      @rhel-x86_64-server-7
openssl-devel.x86_64              1:1.0.1e-51.el7_2.5      @rhel-x86_64-server-7
openssl-libs.x86_64               1:1.0.1e-51.el7_2.5      @rhel-x86_64-server-7

OS:

NAME="Red Hat Enterprise Linux Server"
VERSION="7.2 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.2"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.2 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.2:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.2
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.2"

wallnerryan commented 8 years ago

$ rpm -Uvh openssl-libs-1.0.1e-51.el7_2.5.x86_64.rpm openssl-1.0.1e-51.el7_2.5.x86_64.rpm
warning: openssl-libs-1.0.1e-51.el7_2.5.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 192a7d7d: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:openssl-libs-1:1.0.1e-51.el7_2.5 ################################# [ 25%]
   2:openssl-1:1.0.1e-51.el7_2.5      ################################# [ 50%]
Cleaning up / removing...
   3:openssl-1:1.0.1e-42.el7_1.9      ################################# [ 75%]
   4:openssl-libs-1:1.0.1e-42.el7_1.9 ################################# [100%]

Installed the same versions. I can't seem to reproduce your issues.

akamalov commented 8 years ago

Ok, then all of my systems have been possessed with demons...

wallnerryan commented 8 years ago

@akamalov I just noticed your agent.yml uses port 4523.

My agent.yml and control service use 4524:

control-service:
  hostname: ip-10-167-144-195
  port: 4524

wallnerryan commented 8 years ago

> Ok, then all of my systems have been possessed with demons...

hahaha, try the above port change in your agent.yml

UPDATE: if I change it to 4523 I can reproduce your issues. Looks like a misconfig; I should have noticed earlier :|
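
The misconfiguration comes down to a single value in /etc/flocker/agent.yml. The flocker-docker-plugin log earlier in the thread shows REST requests going to port 4523, which suggests that port serves the API rather than agent connections, while the working agent.yml here uses 4524. A minimal sketch of a check for this, assuming 4524 is the port the agents should use as in the working setup above; the function names are hypothetical and not part of Flocker.

```python
AGENT_PORT = 4524  # port used by the working agent.yml in this thread


def control_service_port_problem(config, expected_port=AGENT_PORT):
    """Return a description of the port mismatch, or None if none is found."""
    control = config.get("control-service") or {}
    port = control.get("port")
    if port is None:
        return None  # no explicit port; this sketch only flags an explicit mismatch
    if port != expected_port:
        return "control-service port is %s, expected %s" % (port, expected_port)
    return None


def load_agent_config(path="/etc/flocker/agent.yml"):
    """Parse agent.yml into a dict (needs the third-party PyYAML package)."""
    import yaml  # imported lazily so the check above works without PyYAML

    with open(path) as handle:
        return yaml.safe_load(handle)
```

On a node, `control_service_port_problem(load_agent_config())` would return a message for the agent.yml from the original report and `None` once the port is changed to 4524.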

akamalov commented 8 years ago

Wow!! that did the trick!!

[root@mmaster1 new.cert]#  /usr/local/bin/flockerctl --control-service 192.168.120.156 --user user list-nodes
SERVER     ADDRESS       
1b8bd056   192.168.120.162 
4d35f998   192.168.120.163 
06978551   192.168.120.161 
701fdaad   192.168.120.164 
fb2b98cd   192.168.120.165 

[root@mmaster1 new.cert]# 

akamalov commented 8 years ago

@wallnerryan ...just wondering what will happen to ClusterHQ if...you...quit ?

wallnerryan commented 8 years ago

:sadface: on my part, our logs don't say what port it's trying and I totally did not pay attention to that

akamalov commented 8 years ago

@wallnerryan Thank you very, very much!!!

wallnerryan commented 8 years ago

> @wallnerryan ...just wondering what will happen to ClusterHQ if...you...quit ?

hopefully we won't have to find out :)

wallnerryan commented 8 years ago

Closing. Open new issues for anything else you run into; hopefully we can resolve them quicker!