scality / metalk8s

An opinionated Kubernetes distribution with a focus on long-term on-prem deployments
Apache License 2.0

Getting undefined variable vg_name #408

Open rhugga opened 6 years ago

rhugga commented 6 years ago

Error:

TASK [setup_lvm_lv : LVM Setup: Create filesystem on each LVM LVs] *********************************************************************************************************************************************************************************************
Wednesday 26 September 2018  08:31:06 -0700 (0:00:40.614)       0:03:03.321 *** 
fatal: [st11p01if-ztds24083901.example.com]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'vg_name' is undefined\n\nThe error appears to have been in '~/devel/metalk8s/roles/setup_lvm_lv/tasks/main.yml': line 102, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: 'LVM Setup: Create filesystem on each LVM LVs'\n  ^ here\nThis one looks easy to fix.  It seems that there is a value started\nwith a quote, and the YAML parser is expecting to see the line ended\nwith the same kind of quote.  For instance:\n\n    when: \"ok\" in result.stdout\n\nCould be written as:\n\n   when: '\"ok\" in result.stdout'\n\nOr equivalently:\n\n   when: \"'ok' in result.stdout\"\n"}
fatal: [st14p01if-ztds13161301.example.com]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'vg_name' is undefined\n\nThe error appears to have been in '~/devel/metalk8s/roles/setup_lvm_lv/tasks/main.yml': line 102, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: 'LVM Setup: Create filesystem on each LVM LVs'\n  ^ here\nThis one looks easy to fix.  It seems that there is a value started\nwith a quote, and the YAML parser is expecting to see the line ended\nwith the same kind of quote.  For instance:\n\n    when: \"ok\" in result.stdout\n\nCould be written as:\n\n   when: '\"ok\" in result.stdout'\n\nOr equivalently:\n\n   when: \"'ok' in result.stdout\"\n"}
ballot-scality commented 6 years ago

Hello @rhugga

Could you please give us (sanitized if needed) the MetalK8s version you deployed, your inventory file, and your group_vars?

You seem to have an undefined variable; vg_name should be computed from metalk8s_lvm_vgs in the 1.0.0 release.

Thank you !

rhugga commented 6 years ago

Version: 1.0.0

Inventory file:

[kube-master]
st11p01if-ztds24294201.example.com 
st13p01if-ztds07221901.example.com 
st14p01if-ztds09351001.example.com 

[etcd]
st13p01if-ztds19214201.example.com
st13p01if-ztds19323701.example.com
st14p01if-ztds11084701.example.com

[kube-node]
st11p01if-ztds24083901.example.com
st14p01if-ztds13161301.example.com
st11p01if-ztds24301701.example.com
st13p01if-ztds07220901.example.com
st13p01if-ztds07221301.example.com

[k8s-cluster:children]
kube-node
kube-master

group_vars:

(ansible) crashkid:~/devel/metalk8s/inventory/st-cobbler $cat group_vars/kube-node.yml 

# For persistent storage / glusterfs
metalk8s_lvm_drives_vg_metalk8s: ['/dev/sdf', '/dev/sdg', '/dev/sdh' ]
rhugga commented 6 years ago

Any idea why this isn't working? I ran another Ansible playbook of mine that creates RAID 10 devices, and it works fine; it also calls blkid.

We've looked at everything 5 times over.

Here is the first host's metalk8s_lvm_all_lvs variable:

TASK [setup_lvm_lv : debug] ************************************************************************************************************************************************************************************************************************************
Wednesday 26 September 2018  14:16:34 -0700 (0:00:40.823)       0:03:01.791 *** 
ok: [st11p01if-ztds24083901.example.com] => {
    "msg": {
        "/dev/vg_metalk8s/metalk8s_lv01": {
            "force": false, 
            "fs_opts": "-m 0", 
            "fstype": "ext4", 
            "host": "st11p01if-ztds24083901.example.com", 
            "labels": {
                "scality.com/metalk8s_fstype": "ext4", 
                "scality.com/metalk8s_node": "st11p01if-ztds24083901.example.com", 
                "scality.com/metalk8s_vg": "vg_metalk8s"
            }, 
            "lv_name": "metalk8s_lv01", 
            "mount_opts": "defaults,noatime", 
            "size": "52G", 
            "vg_prop": {
                "drives": [
                    "/dev/sdf", 
                    "/dev/sdg", 
                    "/dev/sdh"
                ], 
                "host_path": "/mnt/vg_metalk8s", 
                "pv_dict": {
                    "metalk8s_lv01": {
                        "size": "52G"
                    }, 
                    "metalk8s_lv02": {
                        "size": "5G"
                    }, 
                    "metalk8s_lv03": {
                        "size": "11G"
                    }
                }, 
                "storageclass": "local-lvm", 
                "vg_name": "vg_metalk8s"
            }
        }, 
        "/dev/vg_metalk8s/metalk8s_lv02": {
            "force": false, 
            "fs_opts": "-m 0", 
            "fstype": "ext4", 
            "host": "st11p01if-ztds24083901.example.com", 
            "labels": {
                "scality.com/metalk8s_fstype": "ext4", 
                "scality.com/metalk8s_node": "st11p01if-ztds24083901.example.com", 
                "scality.com/metalk8s_vg": "vg_metalk8s"
            }, 
            "lv_name": "metalk8s_lv02", 
            "mount_opts": "defaults,noatime", 
            "size": "5G", 
            "vg_prop": {
                "drives": [
                    "/dev/sdf", 
                    "/dev/sdg", 
                    "/dev/sdh"
                ], 
                "host_path": "/mnt/vg_metalk8s", 
                "pv_dict": {
                    "metalk8s_lv01": {
                        "size": "52G"
                    }, 
                    "metalk8s_lv02": {
                        "size": "5G"
                    }, 
                    "metalk8s_lv03": {
                        "size": "11G"
                    }
                }, 
                "storageclass": "local-lvm", 
                "vg_name": "vg_metalk8s"
            }
        }, 
        "/dev/vg_metalk8s/metalk8s_lv03": {
            "force": false, 
            "fs_opts": "-m 0", 
            "fstype": "ext4", 
            "host": "st11p01if-ztds24083901.example.com", 
            "labels": {
                "scality.com/metalk8s_fstype": "ext4", 
                "scality.com/metalk8s_node": "st11p01if-ztds24083901.example.com", 
                "scality.com/metalk8s_vg": "vg_metalk8s"
            }, 
            "lv_name": "metalk8s_lv03", 
            "mount_opts": "defaults,noatime", 
            "size": "11G", 
            "vg_prop": {
                "drives": [
                    "/dev/sdf", 
                    "/dev/sdg", 
                    "/dev/sdh"
                ], 
                "host_path": "/mnt/vg_metalk8s", 
                "pv_dict": {
                    "metalk8s_lv01": {
                        "size": "52G"
                    }, 
                    "metalk8s_lv02": {
                        "size": "5G"
                    }, 
                    "metalk8s_lv03": {
                        "size": "11G"
                    }
                }, 
                "storageclass": "local-lvm", 
                "vg_name": "vg_metalk8s"
            }
        }
    }
}
rhugga commented 6 years ago

Here I am running one of the failing blkid commands as the same user Ansible runs as:

-0-wcarson@st11p01if-ztds24083901:~ $ /sbin/blkid -s UUID -o value /dev/vg_metalk8s/metalk8s_lv01 aa113f44-1a1e-4686-bb8d-28615133348e

I believe this might be because blkid is in /sbin, which is not in my user's PATH. So I changed the task to use the full path:

# roles/setup_lvm_lv/tasks/main.yml
- name: 'Setup LVM: Get UUIDs of LVM LVs'
  command: '/sbin/blkid -s UUID -o value {{ item.key }}'
  check_mode: False
  changed_when: False
  register: metalk8s_lvm_lvs_uuids
  with_dict: '{{ metalk8s_lvm_all_lvs }}'

However, it appears that /sbin is getting stripped somewhere (I can't find out where):

2018-09-26 11:43:19,203 p=5873 u=wcarson |  TASK [setup_lvm_lv : Setup LVM: Get UUIDs of LVM LVs] **********************************************************************************************************************************************************************************************************
2018-09-26 11:43:19,203 p=5873 u=wcarson |  Wednesday 26 September 2018  11:43:19 -0700 (0:00:00.268)       0:07:59.985 *** 
2018-09-26 11:43:40,715 p=5873 u=wcarson |  failed: [st11p01if-ztds24083901.example.com] (item={'value': {'force': False, 'lv_name': u'metalk8s_lv01', 'labels': {'scality.com/metalk8s_node': u'st11p01if-ztds24083901.example.com', 'scality.com/metalk8s_fstype': u'ext4', 'scality.com/metalk8s_vg': u'vg_metalk8s'}, 'fstype': u'ext4', 'fs_opts': u'-m 0', 'host': u'st11p01if-ztds24083901.example.com', 'mount_opts': u'defaults,noatime', 'vg_prop': {'host_path': u'/mnt/vg_metalk8s', 'pv_dict': {u'metalk8s_lv01': {u'size': u'52G'}, u'metalk8s_lv03': {u'size': u'11G'}, u'metalk8s_lv02': {u'size': u'5G'}}, 'drives': [u'/dev/sdf', u'/dev/sdg', u'/dev/sdh'], 'vg_name': u'vg_metalk8s', 'storageclass': u'local-lvm'}, u'size': u'52G'}, 'key': '/dev/vg_metalk8s/metalk8s_lv01'}) => {"changed": false, "cmd": "blkid -s UUID -o value /dev/vg_metalk8s/metalk8s_lv01", "item": {"key": "/dev/vg_metalk8s/metalk8s_lv01", "value": {"force": false, "fs_opts": "-m 0", "fstype": "ext4", "host": "st11p01if-ztds24083901.example.com", "labels": {"scality.com/metalk8s_fstype": "ext4", "scality.com/metalk8s_node": "st11p01if-ztds24083901.example.com", "scality.com/metalk8s_vg": "vg_metalk8s"}, "lv_name": "metalk8s_lv01", "mount_opts": "defaults,noatime", "size": "52G", "vg_prop": {"drives": ["/dev/sdf", "/dev/sdg", "/dev/sdh"], "host_path": "/mnt/vg_metalk8s", "pv_dict": {"metalk8s_lv01": {"size": "52G"}, "metalk8s_lv02": {"size": "5G"}, "metalk8s_lv03": {"size": "11G"}}, "storageclass": "local-lvm", "vg_name": "vg_metalk8s"}}}, "msg": "[Errno 2] No such file or directory", "rc": 2}

It also appears to have ignored 2 of the 3 disks I specified in the config for local persistent storage.

NicolasT commented 6 years ago

Simply to rule it out: I assume you're running ansible-playbook with the -b flag, or have ansible_become set as vars on all hosts?

Other than that, nothing stands out to me, but I'm not the most knowledgeable when it comes to this part of the project :smiley: @ballot-scality can likely weigh in tomorrow (the team is based in Paris).

Meanwhile, I added a (very basic) CI test for multi-disk PVs.

rhugga commented 6 years ago

I'm using the -b flag; I'm not sure what you mean by ansible_become, as I haven't specified that anywhere.

NicolasT commented 6 years ago

Thanks for getting back. The -b flag tells Ansible to 'become' (root) for every task. Setting ansible_become to true as a var has the same effect, so there's no need to specify it when using -b.
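For reference, a minimal sketch of the equivalent inventory variable (the file path here is an assumption, not something from this thread):

```yaml
# inventory/group_vars/all.yml (hypothetical location)
# Equivalent to passing -b / --become on the ansible-playbook command line.
ansible_become: true
```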

rhugga commented 6 years ago

Do you happen to know why it appears to strip the /sbin prefix from the blkid command? I'm kinda new to Ansible and I've never seen it do that before. I have a lot of playbooks that hardcode command paths, typically commands under /sbin, and I've never had this issue.

I think this is a PATH thing, and the "file not found" error is complaining about not finding blkid, not the device.

rhugga commented 6 years ago

Yep, it's a PATH issue. I just symlinked /sbin/blkid to /bin/blkid and it got past that roadblock.
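The workaround above could also be expressed as an Ansible task, rather than a manual symlink on each node (this is a hypothetical helper, not part of the metalk8s role):

```yaml
# Hypothetical workaround task: expose blkid in /bin so a bare `blkid`
# invocation resolves even when /sbin is not on the search PATH.
- name: 'Workaround: symlink /sbin/blkid into /bin'
  file:
    src: /sbin/blkid
    dest: /bin/blkid
    state: link
  become: true
```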

NicolasT commented 6 years ago

Reopening this one for tracking; we should ensure /sbin is in $PATH when invoking blkid. Thanks for debugging and reporting!
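One way to guarantee that, assuming the role's task quoted earlier in this thread, is Ansible's task-level `environment` keyword (the exact PATH value below is an assumption, not the fix the project shipped):

```yaml
# roles/setup_lvm_lv/tasks/main.yml (sketch)
- name: 'Setup LVM: Get UUIDs of LVM LVs'
  command: 'blkid -s UUID -o value {{ item.key }}'
  environment:
    # Prepend the sbin directories so blkid resolves regardless of the
    # remote user's default PATH.
    PATH: '/usr/sbin:/sbin:{{ ansible_env.PATH }}'
  check_mode: False
  changed_when: False
  register: metalk8s_lvm_lvs_uuids
  with_dict: '{{ metalk8s_lvm_all_lvs }}'
```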

rhugga commented 6 years ago

It probably works on CentOS or Red Hat; this might just be an OEL-ism.