openshift / openshift-ansible


GlusterFS playbook refers to nodes by internal hostname #6832

Closed by miminar 6 years ago

miminar commented 6 years ago

Description

I am trying to deploy OCP 3.7 on Amazon EC2 with GlusterFS as the storage backend, and I hit the following error:

TASK [openshift_storage_glusterfs : Label GlusterFS nodes] **********************************************************************************************************************************
changed: [ec2-54-237-234-66.compute-1.amazonaws.com] => (item=ec2-54-237-234-66.compute-1.amazonaws.com)
failed: [ec2-54-237-234-66.compute-1.amazonaws.com] (item=ec2-54-165-111-221.compute-1.amazonaws.com) => {"changed": false, "item": "ec2-54-165-111-221.compute-1.amazonaws.com", "msg": {"cmd": "/bin/oc label node ip-172-18-12-254.ec2.internal glusterfs=storage-host --overwrite", "results": {}, "returncode": 1, "stderr": "Error from server (NotFound): nodes \"ip-172-18-12-254.ec2.internal\" not found\n", "stdout": ""}}
failed: [ec2-54-237-234-66.compute-1.amazonaws.com] (item=ec2-54-173-169-202.compute-1.amazonaws.com) => {"changed": false, "item": "ec2-54-173-169-202.compute-1.amazonaws.com", "msg": {"cmd": "/bin/oc label node ip-172-18-7-227.ec2.internal glusterfs=storage-host --overwrite", "results": {}, "returncode": 1, "stderr": "Error from server (NotFound): nodes \"ip-172-18-7-227.ec2.internal\" not found\n", "stdout": ""}}
        to retry, use: --limit @/home/ec2-user/sapvora/openshift-ansible/playbooks/byo/config.retry

As the inventory file below shows, I refer to the nodes by their public names (e.g. ec2-54-237-234-66.compute-1.amazonaws.com). The GlusterFS installation, however, refers to them by their internal hostnames, which of course fails.
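To make the mismatch explicit, the failing lines can be reduced to pairs of (inventory item, node name passed to `oc label node`). The following is only a minimal Python sketch: the sample lines are trimmed copies of the output above, and `label_targets` is a hypothetical helper name, not part of openshift-ansible.

```python
import re

# Trimmed copies of the two "failed:" lines from the log above; only the
# inventory item and the oc command are kept.
FAILED_LINES = [
    'failed: [ec2-54-237-234-66.compute-1.amazonaws.com] '
    '(item=ec2-54-165-111-221.compute-1.amazonaws.com) => '
    '{"cmd": "/bin/oc label node ip-172-18-12-254.ec2.internal '
    'glusterfs=storage-host --overwrite"}',
    'failed: [ec2-54-237-234-66.compute-1.amazonaws.com] '
    '(item=ec2-54-173-169-202.compute-1.amazonaws.com) => '
    '{"cmd": "/bin/oc label node ip-172-18-7-227.ec2.internal '
    'glusterfs=storage-host --overwrite"}',
]

# Capture the inventory item and the node name given to "oc label node".
PATTERN = re.compile(r"\(item=(?P<item>[^)]+)\).*?oc label node (?P<node>\S+)")


def label_targets(lines):
    """Return (inventory_item, labelled_node) pairs found in the log lines."""
    pairs = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            pairs.append((match.group("item"), match.group("node")))
    return pairs


if __name__ == "__main__":
    for item, node in label_targets(FAILED_LINES):
        print(f"{item} -> {node}")
```

Each pair shows a public inventory name on the left and the internal name the task tried to label on the right.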

Version
$ git describe
openshift-ansible-3.7.22-1-9-g56970a0

# ansible version:
ansible 2.4.2.0

Steps To Reproduce

Using the following inventory file:

```
[OSEv3:children]
masters
nodes
etcd
glusterfs

[OSEv3:vars]
ansible_user=ec2-user
ansible_become=yes
debug_level=2
openshift_deployment_type=openshift-enterprise
openshift_release=v3.7
openshift_clusterid="vora1"
system_images_registry="registry.access.redhat.com"
openshift_disable_check=disk_availability
docker_version="1.12.6"
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
openshift_master_htpasswd_users=*********
openshift_cloudprovider_kind=aws
openshift_cloudprovider_aws_access_key="{{ lookup('env','AWS_ACCESS_KEY_ID') }}"
openshift_cloudprovider_aws_secret_key="{{ lookup('env','AWS_SECRET_ACCESS_KEY') }}"
osm_use_cockpit=true
openshift_master_cluster_hostname=ip-172-18-11-220.ec2.internal
openshift_master_cluster_public_hostname=ec2-54-237-234-66.compute-1.amazonaws.com
openshift_master_default_subdomain=apps.54.237.234.66.nip.io
osm_default_node_selector='role=app'
osn_storage_plugin_deps=['glusterfs']
openshift_hosted_router_selector='role=infra'
openshift_hosted_router_replicas=1
openshift_hosted_registry_selector='region=infra'
openshift_hosted_registry_replicas=1
openshift_hosted_registry_enforcequota=false
openshift_metrics_install_metrics=true
openshift_enable_service_catalog=true
template_service_broker_install=true
openshift_storage_glusterfs_namespace=default

[glusterfs]
ec2-54-237-234-66.compute-1.amazonaws.com glusterfs_ip=172.18.11.220 glusterfs_devices='[ "/dev/xvdc" ]'
ec2-54-165-111-221.compute-1.amazonaws.com glusterfs_ip=172.18.12.254 glusterfs_devices='[ "/dev/xvdc" ]'
ec2-54-173-169-202.compute-1.amazonaws.com glusterfs_ip=172.18.7.227 glusterfs_devices='[ "/dev/xvdc" ]'

[masters]
ec2-54-237-234-66.compute-1.amazonaws.com

[etcd]
ec2-54-237-234-66.compute-1.amazonaws.com

[nodes]
ec2-54-237-234-66.compute-1.amazonaws.com openshift_schedulable=True openshift_node_labels="{'role': 'infra', 'region': 'primary', 'zone': 'default'}"
ec2-54-165-111-221.compute-1.amazonaws.com openshift_node_labels="{'role': 'app', 'region': 'primary', 'zone': 'default'}"
ec2-54-173-169-202.compute-1.amazonaws.com openshift_node_labels="{'role': 'app', 'region': 'primary', 'zone': 'default'}"
```

Expected Results

Installation succeeds.

Observed Results

Seeing the error above.

Additional information

Running the hostname command on one of the nodes gives me:

[root@ip-172-18-11-220 etc]# hostname
ip-172-18-11-220.ec2.internal
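
The internal name follows the usual EC2 private DNS pattern (`ip-` plus the private IPv4 address with dashes, under `ec2.internal` in us-east-1), so the node names in the error can be derived from the `glusterfs_ip` values in the inventory. The following is a rough Python sketch under that assumption; `ec2_internal_hostname` is a hypothetical helper, and other regions or custom DHCP option sets would use a different domain.

```python
import ipaddress

# glusterfs_ip values copied from the [glusterfs] section of the inventory.
GLUSTERFS_IPS = {
    "ec2-54-237-234-66.compute-1.amazonaws.com": "172.18.11.220",
    "ec2-54-165-111-221.compute-1.amazonaws.com": "172.18.12.254",
    "ec2-54-173-169-202.compute-1.amazonaws.com": "172.18.7.227",
}


def ec2_internal_hostname(private_ip, domain="ec2.internal"):
    """Build the default private DNS name for an IPv4 address.

    Assumes the default us-east-1 naming scheme; the domain is a parameter
    because it differs elsewhere.
    """
    address = ipaddress.IPv4Address(private_ip)  # raises ValueError if malformed
    return "ip-" + str(address).replace(".", "-") + "." + domain


if __name__ == "__main__":
    for public_name, private_ip in GLUSTERFS_IPS.items():
        print(f"{public_name} -> {ec2_internal_hostname(private_ip)}")
```

For the three hosts in the inventory this yields the same names that appear in the `oc label node` commands and in the `hostname` output above.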

miminar commented 6 years ago

I'm not sure what the error was, but restarting the installation worked. Apparently, using the internal hostname is not a problem.

If I can reproduce it, I'll reopen.