ClusterHQ / flocker

Container data volume manager for your Dockerized application
https://clusterhq.com
Apache License 2.0

Flocker Volume creation command results in pending state. #2890

Closed akamalov closed 8 years ago

akamalov commented 8 years ago

Operating System:

NAME="Red Hat Enterprise Linux Server"
VERSION="7.2 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="7.2"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.2 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.2:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.2
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.2"

Flocker environment:

clusterhq-python-flocker-1.13.0-1.x86_64
clusterhq-flocker-docker-plugin-1.13.0-1.noarch
clusterhq-flocker-node-1.13.0-1.noarch

Flocker /etc/flocker/agent.yml file

"version": 1
"control-service": 
  "hostname": 192.168.120.156 
  "port": 4524
"dataset": 
    "backend": "purestorage_flasharray_flocker_driver"
    "pure_ip": 172.16.128.157 
    "pure_api_token": "50cac451-fefe-b635-9692-5870aada9c49"
    "pure_chap_host_user": "mslave1"
    "pure_chap_host_password": "XXXXX"

Problem:

Volume creation command results in pending state.

Step-by-step recreation:

Display agent nodes:

[root@mmaster1 new.cert]#  /usr/local/bin/flockerctl --control-service 192.168.120.156 --user user list-nodes
SERVER     ADDRESS       
1b8bd056   192.168.120.162 
4d35f998   192.168.120.163 
06978551   192.168.120.161 
701fdaad   192.168.120.164 
fb2b98cd   192.168.120.165 

[root@mmaster1 new.cert]# 

Create 5GB volume:

[root@mmaster1 new.cert]# flockerctl --control-service=192.168.120.156 create --node  1b8bd056 --size 5Gb --metadata "name=apples,size=small"
created dataset in configuration, manually poll state with 'flocker-volumes list' to see it show up.

[root@mmaster1 new.cert]#

List volumes and status:

[root@mmaster1 new.cert]# /usr/local/bin/flockerctl --control-service 192.168.120.156 list
DATASET                                SIZE    METADATA                 STATUS        SERVER                   
b5afba76-949c-44ac-876f-08330f6c5bee   5.00G   name=apples,size=small   pending ⌛   1b8bd056 (192.168.120.162) 

The Pure Flocker driver is currently installed (https://github.com/PureStorage-OpenConnect/purestorage-flocker-driver).

I have manually verified that the credentials used by the Pure Flocker driver authenticate correctly against the array:

[root@server1 purestorage-flocker-driver]# python
Python 3.4.2 (default, Apr 19 2016, 08:30:31) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import purestorage
>>> fa=purestorage.FlashArray('172.16.128.157',api_token='50cac451-fefe-b635-9692-5870aada9c49')
/opt/mesosphere/packages/python--e3169ded66609d3cb4055a3f9f8f0b1113a557a6/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:821: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
/opt/mesosphere/packages/python--e3169ded66609d3cb4055a3f9f8f0b1113a557a6/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:821: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
>>> fa.get()
/opt/mesosphere/packages/python--e3169ded66609d3cb4055a3f9f8f0b1113a557a6/lib/python3.4/site-packages/requests/packages/urllib3/connectionpool.py:821: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
  InsecureRequestWarning)
{'revision': '201605132143+b186ed4', 'id': '7e4d75d0-ffc1-4e7c-880a-c044849e1993', 'version': '4.6.8', 'array_name': 'PURTPC0027'}
>>> 

Question(s):

  1. Where does flocker-dataset-agent log, and is it possible to make it log to a separate file?
  2. Are there any troubleshooting steps for Flocker-to-storage problems, such as:
    • authentication
    • authorization
    • volume allocation

wallnerryan commented 8 years ago

  1. flocker-dataset-agent logs to journald. The debugging tools we provide are documented here: https://docs.clusterhq.com/en/latest/administering/debugging.html#debugging-flocker
  2. See above.
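
As a rough illustration of working with those logs: Flocker writes its log messages as Eliot JSON records, one per line, so failures can be picked out with a few lines of Python. The sketch below assumes the log has first been dumped to a file, for example with `journalctl -u flocker-dataset-agent -o cat > agent.log` (the unit name is the usual one on RHEL 7 installs, but check it on your system). The script name `eliot_failures.py` and the function `failed_records` are hypothetical names used only for this example.

```python
import json
import sys


def failed_records(lines):
    """Yield Eliot records that report a failed action or carry an exception."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # not a JSON object, e.g. a plain journal line
        try:
            record = json.loads(line)
        except ValueError:
            continue  # truncated or malformed JSON, skip it
        if record.get("action_status") == "failed" or "exception" in record:
            yield record


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file names): python eliot_failures.py agent.log
    with open(sys.argv[1]) as log_file:
        for record in failed_records(log_file):
            name = record.get("action_type") or record.get("message_type")
            print(name, record.get("exception"), record.get("reason"))
```

The filter relies only on the standard Eliot fields `action_status`, `action_type`, `message_type`, `exception` and `reason`. It is meant to narrow down where to look, not to replace the tools described in the debugging documentation.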

On a separate note, I received an email from someone else using the Pure driver who has the same issue. Patrick from Pure seems to have responded, but it is worth opening a ticket on the Pure driver's issue tracker as well, and linking it here if there are any non-driver, Flocker-specific issues. https://github.com/PureStorage-OpenConnect/purestorage-flocker-driver/issues

Original Message below.

Hi,

Apologies for the delay. I'm currently traveling, so I will be a little slow with responses for the next few days.

A quick look through the dataset-agent.log shows it is hitting an error in the device discovery, not sure yet why that might be happening. The error is coming from https://github.com/PureStorage-OpenConnect/purestorage-flocker-driver/blob/master/purestorage_flasharray_flocker_driver/purestorage_blockdevice.py#L669 which is raised when the driver is unable to find the block device on the system.

Can you verify that there are online iSCSI sessions and a valid multipath device on the system? The output of "iscsiadm -m session", "multipath -l", as well as "ls -l /dev/disk/by-uuid" and "ls -l /dev/disk/by-path" would be very helpful. The syslog/messages for the system may also help, to make sure there are no errors with the multipathing. Please also verify that the Flash Array has a host with a volume attachment on it.
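
One possible way to gather that output in a single pass is sketched below. It runs the four commands named above and concatenates their output into one report that can be pasted into a ticket. The helper names `run` and `collect` are hypothetical. The script assumes it is run as root on the agent node, and it simply records an error for any command that is not installed.

```python
import subprocess

# The four read-only commands requested above.
COMMANDS = [
    ["iscsiadm", "-m", "session"],
    ["multipath", "-l"],
    ["ls", "-l", "/dev/disk/by-uuid"],
    ["ls", "-l", "/dev/disk/by-path"],
]


def run(command):
    """Run one command and return (exit_code, combined stdout and stderr)."""
    try:
        proc = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError as exc:
        # Raised when the binary is missing or cannot be executed.
        return None, "could not run: {}".format(exc)
    output, _ = proc.communicate()
    return proc.returncode, output


def collect(commands=COMMANDS):
    """Return one text report with a section per command."""
    sections = []
    for command in commands:
        code, output = run(command)
        sections.append(
            "$ {}\n(exit code: {})\n{}".format(" ".join(command), code, output)
        )
    return "\n".join(sections)


if __name__ == "__main__":
    print(collect())
```

A non-zero exit code for a command is kept in the report rather than treated as fatal, since a failing `iscsiadm` or `multipath` call is itself useful diagnostic information here.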

If possible can you create an issue on https://github.com/PureStorage-OpenConnect/purestorage-flocker-driver/issues ? It is helpful to track these issues and speed up response time.

Thanks!

-Patrick

On Mon, Aug 22, 2016 at 3:00 PM, harpreet singh baath  wrote:
hello,

I am having a problem using the Pure Storage driver with Flocker. My environment is RHEL 7.2 with two nodes: one for the Flocker controller, and the other acting as an agent node with the Flocker client and plugin installed. When I check the status of the Flocker cluster using the flockerctl command, the status shows as pending. I am having a problem attaching the volume.
I have tested the Pure Storage driver manually and it's working.

akamalov commented 8 years ago

Thanks so much @wallnerryan

akamalov commented 8 years ago

Created ticket with Pure: https://github.com/PureStorage-OpenConnect/purestorage-flocker-driver/issues/3

wallrj commented 8 years ago

I'll close this as it looks like it has been narrowed down to a problem with the PureStorage backend or the driver.