openshift / openshift-ansible

Install and config an OpenShift 3.x cluster
https://try.openshift.com
Apache License 2.0

Cannot create cluster on aws with bin/cluster #989

Closed wiza closed 7 years ago

wiza commented 8 years ago

An older version (from about a month ago) worked. I am running on CentOS 7.1; this is the log:

$ bin/cluster -vvvvv create aws test
RUN [time ansible-playbook -vvvvv -i inventory/aws/hosts -e 'num_masters=1 num_nodes=2 cluster_id=test num_etcd=0 num_infra=1 deployment_type=origin' playbooks/aws/openshift-cluster/launch.yml]

PLAY [Launch instance(s)] *****
localhost: not importing file: /home/centos/openshift/openshift-ansible/playbooks/aws/openshift-cluster/vars.origin.test.yml

TASK: [fail ] *****
skipping: [localhost]

TASK: [set_fact k8s_type="etcd"] ******
ok: [localhost] => {"ansible_facts": {"k8s_type": "etcd"}}

TASK: [Generate etcd instance names(s)] ***
skipping: [localhost]

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"etcd_names": []}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"created_by": "centos", "docker_vol_ephemeral": "False", "env": "test", "env_host_type": "test-openshift-etcd", "host_type": "etcd", "sub_host_type": "default"}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"ec2_region": "us-east-1"}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"ec2_image_name": ""}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"ec2_image": "ami-96a818fe"}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"ec2_keypair": "libra"}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"ec2_vpc_subnet": ""}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"ec2_assign_public_ip": ""}}

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"ec2_instance_type": "", "ec2_security_groups": ["public"]}}

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [Find amis for deployment_type] *****

REMOTE_MODULE ec2_ami_find ami_id=ami-96a818fe region=us-east-1
EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1448490369.25-89169338951004 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1448490369.25-89169338951004 && echo $HOME/.ansible/tmp/ansible-tmp-1448490369.25-89169338951004']
PUT /tmp/tmpq3OKoZ TO /home/centos/.ansible/tmp/ansible-tmp-1448490369.25-89169338951004/ec2_ami_find
EXEC ['/bin/sh', '-c', u'LANG=C LC_CTYPE=C /usr/bin/env python2 /home/centos/.ansible/tmp/ansible-tmp-1448490369.25-89169338951004/ec2_ami_find; rm -rf /home/centos/.ansible/tmp/ansible-tmp-1448490369.25-89169338951004/ >/dev/null 2>&1']
ok: [localhost] => {"changed": false, "results": [{"ami_id": "ami-96a818fe", "architecture": "x86_64", "description": "CentOS 7 x86_64 (2014_09_29) EBS HVM", "is_public": false, "name": "CentOS 7 x86_64 (2014_09_29) EBS HVM-b7ee8a69-ee97-4a49-9e68-afaee216db2e-ami-d2a117ba.2", "owner_id": "679593333241", "platform": null, "root_device_name": "/dev/sda1", "root_device_type": "ebs", "state": "available", "tags": {}, "virtualization_type": "hvm"}]}

TASK: [fail msg="Could not find requested ami"] *****
skipping: [localhost]

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"latest_ami": "ami-96a818fe", "volume_defs": {"etcd": {"etcd": {"device_type": "gp2", "iops": "500", "volume_size": "32"}, "root": {"device_type": "gp2", "iops": "500", "volume_size": "25"}}, "master": {"docker": {"device_type": "gp2", "iops": "500", "volume_size": "10"}, "root": {"device_type": "gp2", "iops": "500", "volume_size": "25"}}, "node": {"docker": {"device_type": "gp2", "iops": "500", "volume_size": "32"}, "root": {"device_type": "gp2", "iops": "500", "volume_size": "85"}}}}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"volumes": [{"delete_on_termination": true, "device_name": "/dev/sda1", "device_type": "gp2", "volume_size": "25"}, {"delete_on_termination": true, "device_name": "/dev/xvdb", "device_type": "gp2", "volume_size": "32"}]}}

TASK: [Launch instance(s)] *****
REMOTE_MODULE ec2 image=ami-96a818fe keypair=libra state=present instance_type='' user_data='#cloud-config cloud_config_modules: - disk_setup - mounts mounts: - [ xvdb, /var/lib/etcd, xfs, "defaults" ] disk_setup: xvdb: table_type: mbr layout: True fs_setup: - label: etcd_storage filesystem: xfs device: /dev/xvdb partition: auto - path: /etc/sudoers.d/99-centos-cloud-init-requiretty permissions: 440 content: | Defaults:centos !requiretty ' region=us-east-1 count=0
EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1448490370.25-137856401355345 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1448490370.25-137856401355345 && echo $HOME/.ansible/tmp/ansible-tmp-1448490370.25-137856401355345']
PUT /tmp/tmpSYHp4i TO /home/centos/.ansible/tmp/ansible-tmp-1448490370.25-137856401355345/ec2
EXEC ['/bin/sh', '-c', u'LANG=C LC_CTYPE=C /usr/bin/env python2 /home/centos/.ansible/tmp/ansible-tmp-1448490370.25-137856401355345/ec2; rm -rf /home/centos/.ansible/tmp/ansible-tmp-1448490370.25-137856401355345/ >/dev/null 2>&1']
ok: [localhost] => {"changed": false, "instance_ids": [], "instances": [], "tagged_instances": []}

TASK: [Add Name tag to instances] *****
skipping: [localhost]

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"instance_groups": "tag_created-by_centos, tag_env_test, tag_host-type_etcd, tag_env-host-type_test-openshift-etcd, tag_sub-host-type_default"}}

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"node_label": {"region": "us-east-1", "type": "etcd"}}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"logrotate": [{"name": "syslog", "options": ["daily", "rotate 7", "compress", "sharedscripts", "missingok"], "path": "/var/log/cron \n/var/log/maillog \n/var/log/messages \n/var/log/secure \n/var/log/spooler \n", "scripts": {"postrotate": "/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true"}}]}}

TASK: [Add new instances groups and variables] *****
skipping: [localhost]

TASK: [Add new instances to nodes_to_add group if needed] *****
skipping: [localhost]

TASK: [Wait for ssh] *****
skipping: [localhost]

TASK: [Wait for user setup] *****
skipping: [localhost]

TASK: [set_fact k8s_type="master"] *****
ok: [localhost] => {"ansible_facts": {"k8s_type": "master"}}

TASK: [Generate master instance names(s)] *****
ok: [localhost] => (item=1) => {"ansible_facts": {"scratch_name": "test-master-4605c"}, "item": "1"}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"master_names": ["test-master-4605c"]}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"created_by": "centos", "docker_vol_ephemeral": "False", "env": "test", "env_host_type": "test-openshift-master", "host_type": "master", "sub_host_type": "default"}}

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"ec2_instance_type": "", "ec2_security_groups": ["public"]}}

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [set_fact ] *****
skipping: [localhost]

TASK: [Find amis for deployment_type] *****
REMOTE_MODULE ec2_ami_find ami_id=ami-96a818fe region=us-east-1
EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1448490371.0-137364645760956 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1448490371.0-137364645760956 && echo $HOME/.ansible/tmp/ansible-tmp-1448490371.0-137364645760956']
PUT /tmp/tmpH1qQZH TO /home/centos/.ansible/tmp/ansible-tmp-1448490371.0-137364645760956/ec2_ami_find
EXEC ['/bin/sh', '-c', u'LANG=C LC_CTYPE=C /usr/bin/env python2 /home/centos/.ansible/tmp/ansible-tmp-1448490371.0-137364645760956/ec2_ami_find; rm -rf /home/centos/.ansible/tmp/ansible-tmp-1448490371.0-137364645760956/ >/dev/null 2>&1']
ok: [localhost] => {"changed": false, "results": [{"ami_id": "ami-96a818fe", "architecture": "x86_64", "description": "CentOS 7 x86_64 (2014_09_29) EBS HVM", "is_public": false, "name": "CentOS 7 x86_64 (2014_09_29) EBS HVM-b7ee8a69-ee97-4a49-9e68-afaee216db2e-ami-d2a117ba.2", "owner_id": "679593333241", "platform": null, "root_device_name": "/dev/sda1", "root_device_type": "ebs", "state": "available", "tags": {}, "virtualization_type": "hvm"}]}

TASK: [fail msg="Could not find requested ami"] *****
skipping: [localhost]

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"latest_ami": "ami-96a818fe", "volume_defs": {"etcd": {"etcd": {"device_type": "gp2", "iops": "500", "volume_size": "32"}, "root": {"device_type": "gp2", "iops": "500", "volume_size": "25"}}, "master": {"docker": {"device_type": "gp2", "iops": "500", "volume_size": "10"}, "root": {"device_type": "gp2", "iops": "500", "volume_size": "25"}}, "node": {"docker": {"device_type": "gp2", "iops": "500", "volume_size": "32"}, "root": {"device_type": "gp2", "iops": "500", "volume_size": "85"}}}}}

TASK: [set_fact ] *****
ok: [localhost] => {"ansible_facts": {"volumes": [{"delete_on_termination": true, "device_name": "/dev/sda1", "device_type": "gp2", "volume_size": "25"}, {"delete_on_termination": true, "device_name": "/dev/xvdb", "device_type": "gp2", "volume_size": "10"}]}}

TASK: [Launch instance(s)] *****
REMOTE_MODULE ec2 image=ami-96a818fe keypair=libra state=present instance_type='' user_data='#cloud-config mounts: - [ xvdb ] - [ ephemeral0 ] write_files: - content: | DEVS=/dev/xvdb VG=docker_vg path: /etc/sysconfig/docker-storage-setup owner: root:root permissions: '"'"'0644'"'"' - path: /etc/sudoers.d/99-centos-cloud-init-requiretty permissions: 440 content: | Defaults:centos !requiretty ' region=us-east-1 count=1
EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1448490371.6-123589846258557 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1448490371.6-123589846258557 && echo $HOME/.ansible/tmp/ansible-tmp-1448490371.6-123589846258557']
PUT /tmp/tmpqGZssW TO /home/centos/.ansible/tmp/ansible-tmp-1448490371.6-123589846258557/ec2
EXEC ['/bin/sh', '-c', u'LANG=C LC_CTYPE=C /usr/bin/env python2 /home/centos/.ansible/tmp/ansible-tmp-1448490371.6-123589846258557/ec2; rm -rf /home/centos/.ansible/tmp/ansible-tmp-1448490371.6-123589846258557/ >/dev/null 2>&1']
failed: [localhost] => {"failed": true}
msg: Instance creation failed => InvalidParameterCombination: Non-Windows instances with a virtualization type of 'hvm' are currently not supported for this instance type.

FATAL: all hosts have already failed -- aborting

PLAY RECAP *****
to retry, use: --limit @/home/centos/launch.retry

localhost : ok=25 changed=0 unreachable=0 failed=1

real 0m4.507s
user 0m2.050s
sys 0m0.237s

Traceback (most recent call last):
  File "bin/cluster", line 366, in <module>
    args.func(args)
  File "bin/cluster", line 68, in create
    self.action(args, inventory, env, playbook)
  File "bin/cluster", line 233, in action
    .format(args.action, exc))
ActionFailed: ACTION [create] failed: Command 'time ansible-playbook -vvvvv -i inventory/aws/hosts -e 'num_masters=1 num_nodes=2 cluster_id=test num_etcd=0 num_infra=1 deployment_type=origin' playbooks/aws/openshift-cluster/launch.yml' returned non-zero exit status 2

benbarclay commented 8 years ago

Also seeing this behaviour where AWS instances cannot be created with the error: "msg: Instance creation failed => InvalidParameterCombination: Non-Windows instances with a virtualization type of 'hvm' are currently not supported for this instance type."

detiber commented 8 years ago

Try exporting the following env variable:

export ec2_vpc_subnet=<valid VPC subnet id>

I believe the issue is related to ec2 limiting access to certain instance sizes in classic mode.

benbarclay commented 8 years ago

I already export the subnet (as we have quite a few, and need to select the correct one)

detiber commented 8 years ago

@benbarclay are you overriding any of the default instance types through env variables as well?

ec2_instance_type will set a default override
ec2_master_instance_type will override master hosts only
ec2_node_instance_type will override node hosts only
ec2_infra_instance_type will override infra node hosts only
ec2_etcd_instance_type will override etcd hosts only
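
The override variables listed above can be pictured with a small Python sketch. This is only an illustration, not the playbook's actual logic: the precedence order (host-specific variable first, then `ec2_instance_type`, then a built-in default) is an assumption, `resolve_instance_type` is a hypothetical helper, and `m4.large` as the built-in default is inferred from this thread rather than confirmed.

```python
# Hypothetical built-in default; m4.large is inferred from the thread, not confirmed.
DEFAULT_INSTANCE_TYPE = "m4.large"


def resolve_instance_type(host_type, env):
    """Pick an instance type for one host type (master, node, infra, etcd).

    Assumed precedence: ec2_<host_type>_instance_type, then
    ec2_instance_type, then the built-in default. Empty strings are
    treated the same as unset variables.
    """
    specific = env.get("ec2_{}_instance_type".format(host_type))
    general = env.get("ec2_instance_type")
    return specific or general or DEFAULT_INSTANCE_TYPE


# Example environment: a general override plus a master-only override.
example_env = {
    "ec2_instance_type": "m3.large",
    "ec2_master_instance_type": "m3.xlarge",
}

master_type = resolve_instance_type("master", example_env)
node_type = resolve_instance_type("node", example_env)
etcd_type_no_overrides = resolve_instance_type("etcd", {})
```

In practice `env` would be `os.environ`; a plain dict is used here so the sketch does not depend on the caller's shell.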

It could be an issue with m4.large instances in your region/az.

benbarclay commented 8 years ago

ec2_region='ap-southeast-2' \
ec2_keypair='REDACTED' \
ec2_image='ami-d38dc6e9' \
ec2_vpc_subnet="REDACTED" \
ec2_assign_public_ip='true' \
bin/cluster -v create aws --deployment-type=origin openshift

It was working for me a few days ago, then I started getting the above error after a git pull brought me up to date with master.

benbarclay commented 8 years ago

Having said that, I checked out the previous version I was on and can't deploy from it either now.

detiber commented 8 years ago

@benbarclay have you tried reverting to your previous checkout? I saw the same issue a couple of weeks ago, and resolved it by setting ec2_vpc_subnet, which then provisioned the hosts inside of a vpc instead of ec2 classic, which is the default for my account.

Do you see the same error when attempting to spin up an m4.large instance using the same image in the same VPC subnet within the same region? The error indicates that EC2 isn't allowing that combination for your account at this time.

Switching one or more of the following: region, vpc_subnet, instance_type should clear up the issue.

wiza commented 8 years ago

I changed the region (I was deploying to eu-west-1 before), so I unset the variables and tried to run with the defaults. It has worked before. Should I remove all VPCs, keypairs, etc.? Did these scripts previously launch instances inside a VPC by default?

detiber commented 8 years ago

@wiza it will use the default VPC if configured for accounts that don't have access to ec2 classic.

Have you tried using m3.large instances instead?

export ec2_instance_type=m3.large

wiza commented 8 years ago

@detiber changing instance_type did not help, and neither did specifying an existing VPC subnet or changing the region.

detiber commented 8 years ago

@wiza it looks like the error message I was looking at was the wrong one.

TASK: [fail msg="Could not find requested ami"]

This appears to tell me that the centos image we are referencing is no longer available.

Could you verify the ami id for the latest CentOS image?

wiza commented 8 years ago

ami-33734044 @ eu-west-1

wiza commented 8 years ago

The log has ami-96a818fe @ us-east-1, which are the defaults in the script. But it's the same error, whatever the variables are set to.

detiber commented 8 years ago

I'll attempt to replicate today.

detiber commented 8 years ago

It looks like the env lookups we were doing to set the defaults for the instance types are the issue. I submitted https://github.com/openshift/openshift-ansible/pull/1003 to address it.

The main problem was that the env lookup plugin returns an empty string for a non-existent environment variable. That prevented the template from falling back to the right instance type, so the empty string was passed on as the instance type instead. I updated the uses of the default filter to treat the values as booleans for the existence test, and my testing shows that it works as expected with that change.
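
For readers unfamiliar with this failure mode, here is a rough plain-Python analogue. It is not the playbook's Jinja2 template or the actual change in the pull request: `env_lookup`, `default_strict`, and `default_falsy` are hypothetical stand-ins, and `m4.large` as the fallback is assumed from this thread. The sketch shows why a default that only fires for undefined values lets an empty string through, while a default that tests truthiness does not.

```python
def env_lookup(name, environ):
    """Stand-in for an env lookup that yields '' when the variable is missing."""
    return environ.get(name, "")


def default_strict(value, fallback):
    """Fallback only for undefined values (modelled here as None)."""
    return fallback if value is None else value


def default_falsy(value, fallback):
    """Fallback for any falsy value, including the empty string."""
    return value if value else fallback


# ec2_instance_type is not set in this example environment.
environ = {}
looked_up = env_lookup("ec2_instance_type", environ)

# The empty string counts as "defined", so the strict default never fires
# and '' would be passed on as the instance type.
buggy_instance_type = default_strict(looked_up, "m4.large")

# Testing truthiness instead lets the fallback take effect.
fixed_instance_type = default_falsy(looked_up, "m4.large")

print(repr(buggy_instance_type), repr(fixed_instance_type))
```

The empty `instance_type=''` in the log above is consistent with the first behaviour: the ec2 module received an empty instance type rather than a default.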

tbielawa commented 7 years ago

This issue has been inactive for quite some time. Please update and reopen this issue if this is still a priority you would like to see action on.