fusor / catasb

Ansible scripts to setup an 'oc cluster up' environment for testing the Service Catalog and Ansible Service Broker on EC2 or Local

Adding 'iptables -F' to local environment run to avoid network issues with pods unable to communicate to routes. #43

Open arunneoz opened 7 years ago

arunneoz commented 7 years ago

Hello,

I was able to set up the local instance and everything came up, but I am not seeing AWS in the catalog. How can I enable the service broker for AWS services?

Regards

Arun

jwmatthews commented 7 years ago

When I see this behavior the root cause is generally that the Service Catalog failed to retrieve a list of Service Classes from the Ansible Service Broker.

My issue is normally resolved by running "sudo iptables -F". When I hit this, it appears to be a firewall issue that is blocking some of the network traffic between services, i.e. the Service Catalog is unable to hit the route for the Ansible Service Broker.

arunneoz commented 7 years ago

I will check, but the common_vars under the local setup doesn't have any AWS details, which might be the cause. Where do I set those to include AWS in my local instance catalog?

jwmatthews commented 7 years ago

@arunneoz Note that for the local setup we are not creating the secret in aws_demo to launch the RDS, i.e. no secret is created with the AWS credentials needed to provision RDS from the webui.

Still, the local setup should populate the APBs in the Service Catalog, which are the Amazon services along with ManageIQ and postgres-demo.

The flow is:

To debug this, I would check to see what ServiceClasses are available in the Service Catalog. Run "catctl" or kubectl with the correct .kube/config file: https://github.com/fusor/catasb/blob/master/ansible/roles/service_catalog_setup/tasks/main.yml#L125

Below is an example of what I would expect to see if things are working.

    $ catctl get serviceclasses
    NAME                  KIND
    cloudfront-apb        ServiceClass.v1alpha1.servicecatalog.k8s.io
    elasticache-apb       ServiceClass.v1alpha1.servicecatalog.k8s.io
    elb-apb               ServiceClass.v1alpha1.servicecatalog.k8s.io
    emr-apb               ServiceClass.v1alpha1.servicecatalog.k8s.io
    manageiq-apb          ServiceClass.v1alpha1.servicecatalog.k8s.io
    postgresql-demo-apb   ServiceClass.v1alpha1.servicecatalog.k8s.io
    rds-aurora-apb        ServiceClass.v1alpha1.servicecatalog.k8s.io
    rds-mysql-apb         ServiceClass.v1alpha1.servicecatalog.k8s.io
    rds-postgres-apb      ServiceClass.v1alpha1.servicecatalog.k8s.io
    redshift-apb          ServiceClass.v1alpha1.servicecatalog.k8s.io
    route53-apb           ServiceClass.v1alpha1.servicecatalog.k8s.io
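A quick way to compare a run against the expected listing is to count the rows. The sketch below is only an illustration: the helper name `count_serviceclasses` is hypothetical, and it assumes the output has a single "NAME KIND" header line followed by one service class per line, as in the example above.

```shell
# Count ServiceClass rows in `catctl get serviceclasses` output read from stdin.
# Assumption: the first line is the "NAME KIND" header and is skipped;
# blank lines are ignored.
count_serviceclasses() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Hypothetical usage (needs a working catctl and the correct .kube/config):
#   catctl get serviceclasses | count_serviceclasses
```

A healthy local setup as described in this thread would report 11; a count of 0 suggests the Service Catalog never fetched the catalog from the broker.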

Assuming you don't see serviceclasses, the next thing is to check:

Logs from ansible-service-broker below when things look good.

    $ oc project ansible-service-broker
    $ oc logs asb-1-0vlf0

    172.17.0.1 - - [09/May/2017:17:56:42 +0000] "POST /v2/bootstrap HTTP/1.1" 200 22
    [2017-05-09T17:56:43.017Z] [INFO] AnsibleBroker::Catalog
    [2017-05-09T17:56:43.017Z] [DEBUG] Dao::BatchGetRaw
    [2017-05-09T17:56:43.02Z] [DEBUG] Successfully loaded [ 11 ] objects from etcd dir [ /spec ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 0 ] -> [ 7f317094-45b7-47de-99cb-a8973840f4f3 ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 1 ] -> [ 64297e88-657c-46ab-bb50-2af7f84e5ec5 ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 2 ] -> [ 202a2432-1d1a-4182-aa35-00402ffc5ee3 ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 3 ] -> [ c9f8f8e1-81ff-4825-ae26-94fb12696f20 ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 4 ] -> [ 0aaafc10-132b-41a8-a58c-73268ff1006a ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 5 ] -> [ 9e8772a5-73ac-4cab-a089-a167a0d972a5 ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 6 ] -> [ b4325aea-0479-4f8d-b502-1922bbaee13e ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 7 ] -> [ 650fc7c3-12e3-4cce-b36c-2984e076768d ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 8 ] -> [ 4daf53ae-460d-4319-8698-212743a1a330 ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 9 ] -> [ 334c6ee4-ed6a-4520-9c1b-b068d8d4a7a7 ]
    [2017-05-09T17:56:43.02Z] [DEBUG] Batch idx [ 10 ] -> [ 9ce41143-1c76-4fc1-9828-4fbe2648ee24 ]
    172.17.0.1 - - [09/May/2017:17:56:43 +0000] "GET /v2/catalog HTTP/1.1" 200 20317

Next Service Catalog:

    $ oc project service-catalog
    $ oc logs controller-manager-420467496-mq475

    ....
    I0509 17:56:42.983490 1 controller.go:236] Processing Broker ansible-service-broker
    I0509 17:56:42.983512 1 controller.go:254] Creating client for Broker ansible-service-broker, URL: http://asb-1338-ansible-service-broker.172.17.0.1.nip.io
    I0509 17:56:42.983522 1 controller.go:258] Adding/Updating Broker ansible-service-broker
    I0509 17:56:43.022614 1 utils.go:67] { "services": [ { "name": "cloudfront-apb",

    I0509 17:56:43.023428 1 controller.go:268] Successfully fetched 11 catalog entries for Broker ansible-service-broker

I've run into issues in my environment where the Service Catalog couldn't talk to the route for the Ansible Service Broker. In my case I ran "sudo iptables -F" and reran "reset_environment.sh" from the local directory, which resolved the issue. I typically have to run "sudo iptables -F" after I reboot my Linux laptop.

jwmatthews commented 7 years ago

@arunneoz as to getting the RDS provision example to run from the local environment through the WebUI, it requires this secret to be created in the project: https://github.com/fusor/catasb/blob/master/ansible/roles/demo_prep/templates/demo-secret.j2

The issue is that we didn't have a good way to programmatically determine {{ my_security_group_id }} for the local case.

For the ec2 case we look at the running instance and determine the security group id from that. https://github.com/fusor/catasb/blob/master/ansible/setup_environment.yml#L22

I assume you could set {{ my_security_group_id }} then include the 'demo_prep' role in the local environment.

arunneoz commented 7 years ago

When I looked at the Controller Manager logs, it says it is unable to get the catalog from the following URL:

    E0509 19:11:07.701059 1 open_service_broker_client.go:128] Failed to fetch catalog "ansible-service-broker" from http://asb-1338-ansible-service-broker.172.17.0.1.nip.io/v2/catalog: response: error: &url.Error{Op:"Get", URL:"http://asb-1338-ansible-service-broker.172.17.0.1.nip.io/v2/catalog", Err:(*net.OpError)(0xc42032a050)}
    W0509 19:11:07.701161 1 controller.go:262] Error getting broker catalog for broker "ansible-service-broker": Get http://asb-1338-ansible-service-broker.172.17.0.1.nip.io/v2/catalog: dial tcp 172.17.0.1:80: getsockopt: no route to host

When I hit the URL from a browser, I am able to see the response.

Any thoughts? I did run sudo iptables -F, rebooted, and ran reset_environment.

jwmatthews commented 7 years ago

This is the exact error I see when I need to run "sudo iptables -F"
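To spot this symptom without reading the whole log, the controller-manager output can be searched for the dial error. This is a minimal sketch; the helper name `has_no_route_error` is hypothetical, and it matches only the literal "no route to host" text from the log quoted above.

```shell
# Succeed (exit 0) if the log text on stdin contains the
# "no route to host" dial error; fail (exit 1) otherwise.
has_no_route_error() {
  grep -q 'no route to host'
}

# Hypothetical usage; substitute the controller-manager pod name from
# your own cluster:
#   oc logs <controller-manager-pod> | has_no_route_error \
#     && echo "Service Catalog cannot reach the broker route"
```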

My workflow would be:

1) sudo iptables -F
2) reset_environment.sh
3) service catalog can now see ansible service broker

Whenever I reboot, I need to re-run sudo iptables -F after the reboot.
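The workflow above can be sketched as a small wrapper. The function names and the `DRY_RUN` and `RESET_SCRIPT` variables are hypothetical, and the default script path assumes you run it from the catasb local directory. Note that `iptables -F` flushes every rule in the filter table, so this is only reasonable on a development machine.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Flush the iptables filter rules, then rebuild the local environment.
# RESET_SCRIPT is an assumed override for the path to reset_environment.sh.
flush_and_reset() {
  run sudo iptables -F || return 1
  run "${RESET_SCRIPT:-./reset_environment.sh}"
}

# Preview the commands without executing them:
#   DRY_RUN=1; flush_and_reset
```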

It's possible that another networking issue is causing this to fail on local; conflicts with the docker network could be another culprit.

arunneoz commented 7 years ago

That did the trick, thanks. I also configured the AWS secret myself and provisioned the instance. Thanks for all your help.

jwmatthews commented 7 years ago

The resolution to this issue was running "sudo iptables -F" prior to running oc cluster up. I've hit this issue several times on my own runs, typically after I have restarted my Linux machine (running Fedora 25).

To avoid others hitting this problem we could add the iptables -F prior to running oc cluster up, or we could also experiment with modifying firewall rules such as recommended in this comment: https://github.com/openshift/origin/issues/10139#issuecomment-270503837
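As a rough illustration of the firewall-rule alternative, the sketch below prints firewalld commands that open a dedicated zone for the Docker bridge instead of flushing everything. Everything specific here is an assumption, not taken from the linked comment: the zone name `dockerc`, the subnet 172.17.0.0/16, and ports 8443/tcp, 53/udp and 8053/udp follow the firewalld guidance commonly given for `oc cluster up`, and 80/tcp is added only because the failing dial in this thread was to 172.17.0.1:80. The function name is hypothetical, and it only prints the commands so they can be reviewed before being run with sudo.

```shell
# Print firewall-cmd commands for a dedicated Docker zone.
# $1: Docker bridge subnet (assumed default 172.17.0.0/16)
# $2: zone name (assumed default "dockerc")
print_firewalld_rules() {
  subnet="${1:-172.17.0.0/16}"
  zone="${2:-dockerc}"
  echo "firewall-cmd --permanent --new-zone=$zone"
  echo "firewall-cmd --permanent --zone=$zone --add-source=$subnet"
  # The port list is an assumption; adjust it to what your setup needs.
  for port in 8443/tcp 53/udp 8053/udp 80/tcp; do
    echo "firewall-cmd --permanent --zone=$zone --add-port=$port"
  done
  echo "firewall-cmd --reload"
}

# Review the output first, then run each line with sudo:
#   print_firewalld_rules
```

Check the actual bridge subnet first (for example with `docker network inspect bridge`), since it may differ from the default assumed here.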