submariner-io / submariner

Networking component for interconnecting Pods and Services across Kubernetes clusters.
https://submariner.io
Apache License 2.0

The submariner-gateway gives up too early waiting on a network loadbalancer to be ready #1484

Closed: mangelajo closed this issue 2 years ago

mangelajo commented 3 years ago

What happened:

+ exec submariner-gateway -v=2 -alsologtostderr
I0714 08:10:06.223429       1 main.go:92] Starting the submariner gateway engine
W0714 08:10:06.223764       1 client_config.go:608] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0714 08:10:06.226963       1 main.go:120] Creating the cable engine
I0714 08:10:06.244225       1 public_ip.go:124] Waiting for LoadBalancer to be ready: error resolving DNS hostname "ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com" for public IP: lookup ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com on 10.0.0.2:53: no such host
I0714 08:10:07.249069       1 public_ip.go:124] Waiting for LoadBalancer to be ready: error resolving DNS hostname "ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com" for public IP: lookup ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com on 10.0.0.2:53: no such host
I0714 08:10:09.279598       1 public_ip.go:124] Waiting for LoadBalancer to be ready: error resolving DNS hostname "ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com" for public IP: lookup ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com on 10.0.0.2:53: no such host
I0714 08:10:13.289026       1 public_ip.go:124] Waiting for LoadBalancer to be ready: error resolving DNS hostname "ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com" for public IP: lookup ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com on 10.0.0.2:53: no such host
E0714 08:10:13.289054       1 public_ip.go:80] Error resolving public IP with resolver lb:submariner-gateway : error resolving DNS hostname "ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com" for public IP: lookup ad9858e3e23784720b6e2595b1824bd4-d594a91fa7b17798.elb.us-east-2.amazonaws.com on 10.0.0.2:53: no such host
F0714 08:10:13.289085       1 main.go:133] Error creating local endpoint object from types.SubmarinerSpecification{ClusterCidr:[]string{"10.132.0.0/14"}, ColorCodes:[]string{"blue"}, GlobalCidr:[]string{}, ServiceCidr:[]string{"172.31.0.0/16"}, Broker:"k8s", CableDriver:"libreswan", ClusterID:"majopela-b--1626247420", Namespace:"submariner-operator", PublicIP:"lb:submariner-gateway", Token:"", Debug:false, NATEnabled:true, HealthCheckEnabled:true, HealthCheckInterval:0x1, HealthCheckMaxPacketLossCount:0x5}: could not determine public IP: Unable to resolve public IP by any of the resolver methods: [lb:submariner-gateway]

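A LoadBalancer on AWS needs more time to set up than the gateway allows: in the log above, the gateway gives up after four DNS lookups in about 7 seconds. Because the backing endpoint (submariner-gateway) then keeps crashing and restarting, the AWS LoadBalancer controller has trouble configuring the LoadBalancer.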

What you expected to happen:

The submariner-gateway should wait long enough for the LoadBalancer's DNS hostname to become resolvable, instead of exiting after a few seconds.

How to reproduce it (as minimally and precisely as possible):

Deploy with the --load-balancer mode on AWS

Anything else we need to know?:

Environment:

dfarrell07 commented 2 years ago

We think this was fixed by https://github.com/submariner-io/submariner/pull/1483; @aswinsuryan hasn't seen it in his recent testing.