kubeadm fails with no ipv4 address on the network interface with defaultroute

scheuk commented 6 years ago

Is this a BUG REPORT or FEATURE REQUEST?

Choose one: BUG REPORT

Versions

kubeadm version (use kubeadm version): kubeadm version: &version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:14:39Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

Environment:

Kubernetes version (use kubectl version): Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:17:28Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Cloud provider or hardware configuration: Hardware

OS (e.g. from /etc/os-release):


VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"


- **Kernel** (e.g. `uname -a`):
`Linux node1-lab-a1-01 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux`

- **Others**:
Our hosts are set up using IPv4 BGP over IPv6 (RFC 5549: https://tools.ietf.org/html/rfc5549).
The host IP address is attached to the loopback interface, and FRR BGP announces that IP to the connected TOR switches (spine-and-leaf fabric). There is no IPv4 address on the connected interfaces, but I do have a default route that allows access to the world:

```
# ip route
default proto bgp metric 20
    nexthop via 169.254.0.1 dev em1 weight 1 onlink
    nexthop via 169.254.0.1 dev em2 weight 1 onlink
10.101.155.0/24 proto bgp metric 20
    nexthop via 169.254.0.1 dev em1 weight 1 onlink
    nexthop via 169.254.0.1 dev em2 weight 1 onlink
10.101.246.0/24 dev em3 proto kernel scope link src 10.101.246.11
169.254.0.0/16 dev em3 scope link metric 1002
169.254.0.0/16 dev em1 scope link metric 1003
169.254.0.0/16 dev em2 scope link metric 1005
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
```

What happened?

Having a similar problem as #982

running

# kubeadm config images pull
unable to select an IP from default routes.

# kubeadm config images pull -v 10
I1003 20:56:35.880474   87226 interface.go:360] Looking for default routes with IPv4 addresses
I1003 20:56:35.880550   87226 interface.go:365] Default route transits interface "em1"
I1003 20:56:35.881831   87226 interface.go:174] Interface em1 is up
I1003 20:56:35.881925   87226 interface.go:222] Interface "em1" has 1 addresses :[fe80::266e:96ff:fe5f:7b48/64].
I1003 20:56:35.881957   87226 interface.go:189] Checking addr  fe80::266e:96ff:fe5f:7b48/64.
I1003 20:56:35.881989   87226 interface.go:202] fe80::266e:96ff:fe5f:7b48 is not an IPv4 address
I1003 20:56:35.882027   87226 interface.go:360] Looking for default routes with IPv6 addresses
I1003 20:56:35.882051   87226 interface.go:376] No active IP found by looking at default routes
unable to select an IP from default routes.

The same error happens when running kubeadm init.

What you expected to happen?

I expect kubeadm to work and pull the images or perform an init. Maybe there could be a way to specify my host's IP address or interface.

How to reproduce it (as minimally and precisely as possible)?

Anything else we need to know?

timothysc commented 6 years ago

@bart0sh @kad - Do you have any network setups that are similar to this?

timothysc commented 6 years ago

@scheuk what's the local resolvable IPv4 address for that host?
Can you add it to /etc/hosts, and specify to kubeadm config?

/cc @kubernetes/sig-network-bugs

rosskukulinski commented 6 years ago

@timothysc do we know if the api-server will work with this kind of environment? I'm just wondering if we get kubeadm to work in this network setup, are we going to run into more problems down the road?

timothysc commented 6 years ago

If you have a specific ethernet adaptor that wraps the details and you can bind to, or /etc/hosts override I would think it should "just work".

scheuk commented 6 years ago

@timothysc my local host IP sits on lo0:

# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.101.228.11/32 brd 10.101.228.11 scope global lo:0
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever

It is the 10.101.228.11 IP address.

I've spent the morning making kubeadm init work. I was able to add this to the config and init worked like a charm.

api:
  advertiseAddress: 10.101.228.11
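
For reference, a minimal sketch of how a config file with this setting could be generated. The `apiVersion` and `kind` values are an assumption based on the v1alpha2 format used by kubeadm v1.11; other releases use different values, and `write_kubeadm_config` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: write a minimal kubeadm config that pins the advertise address.
# Assumption: kubeadm v1.11 reads the v1alpha2 "MasterConfiguration" kind;
# other releases use a different apiVersion/kind and field layout.
write_kubeadm_config() {
    config_path=$1
    advertise_ip=$2
    cat > "$config_path" <<EOF
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
api:
  advertiseAddress: ${advertise_ip}
EOF
}

# Example (not run here):
#   write_kubeadm_config /etc/kubernetes/kubeadm.conf 10.101.228.11
#   kubeadm init --config /etc/kubernetes/kubeadm.conf
```

The same file can then be passed with --config to the subcommands that accept it.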

However I'm still stuck needing a route to perform both the kubeadm config images pull command as well as kubeadm token create --print-join-command to get the join command to add worker nodes to the cluster.

scheuk commented 6 years ago

@timothysc Can you explain the /etc/hosts override?

I just attempted to add 10.101.228.11 hostname to /etc/hosts, but kubeadm is still looking at the interface with the route.

timothysc commented 6 years ago

So this has to do with the specifics of your networking configuration, coupled with the default behavior of the system.

I'm fairly certain if you were to make a bridged adapter to bind the IPv4 address to, vs. being bound to loopback, it would detect properly.

I'm digging right now to see if there is a global override for the nic that works across *.

timothysc commented 6 years ago

So it would be a combination of --api-advertise-addresses and --hostname-override , I'm not certain all those options percolate to all the subcommands, still digging.

timothysc commented 6 years ago

@scheuk could you run

kubeadm config images pull --config yourconfig.yaml

with your config file specifying: advertiseAddress

timothysc commented 6 years ago

You may need to pass in --config for every sub-command because of the way your network is set up.

mauilion commented 6 years ago

you can run kubeadm token create --print-join-command from any host or a pipeline with the --kubeconfig flag.

A possible workaround for the images pull issue, given the config, would be to run something like this prior to init: kubeadm config images list | xargs -n1 -I {} docker pull {}

scheuk commented 6 years ago

@timothysc @mauilion

No go on --kubeconfig:

# kubeadm token create --print-join-command --kubeconfig /etc/kubernetes/admin.conf
unable to select an IP from default routes.

also if I try with --config it says you can't combine those two:

# kubeadm token create --print-join-command --config /etc/kubernetes/kubeadm.conf
can not mix '--config' with arguments [print-join-command]

This worked, I'll update my deployment to do this

# kubeadm config images pull --config /etc/kubernetes/kubeadm.conf

mauilion commented 6 years ago

I mean that you can use kubeadm token create from a machine with a default route, like your laptop, as long as it has access to the running apiserver and an admin-level kubeconfig.

timothysc commented 6 years ago

@scheuk

# kubeadm token create --print-join-command --config /etc/kubernetes/kubeadm.conf 
can not mix '--config' with arguments [print-join-command]

is a minor bug that we can fix in 1.13

Are you still blocked?

scheuk commented 6 years ago

@timothysc

I am still blocked. We use ansible to perform all these steps and I currently execute kubeadm token create --print-join-command on the first master to get the command to join the worker nodes to the cluster. However, I may be able to temporarily unblock myself by doing what @mauilion says and setting up kubeadm locally (where ansible is run from) to perform that action.

Thanks for all the help so far!

timothysc commented 6 years ago

We use ansible to perform all these steps and I currently execute kubeadm token create --print-join-command on the first master to get the command to join the worker nodes to the cluster.

The output of init contains the command you should execute on the other nodes.

mauilion commented 6 years ago

This is normally done in ansible, as the token is short-lived and it's easier to capture the join command output from token create than from init.

scheuk commented 6 years ago

Also it's a little bit harder to parse with all the other output and spacing ;)

timothysc commented 6 years ago

Also it's a little bit harder to parse with all the other output and spacing ;)

I've done so much sed & awk in my life I'm probably too desensitized ;-)

scheuk commented 6 years ago

So my localhost where I would run kubeadm from ansible is a mac. From the kubeadm install page, it doesn't support mac os x.

mauilion commented 6 years ago

docker run --net=host --rm -v /path/to/kubeconfig:/kubeconfig quay.io/mauilion/kubeadm:v1.11.3  kubeadm token create --print-join-command --kubeconfig=/kubeconfig

output:
kubeadm join 10.192.0.2:6443 --token iwikby.5u4wc05jnbdldq5e --discovery-token-ca-cert-hash sha256:f19311dfe7034d14c48002fd4f29e285270a573b9e9066735d5749ca89b9c89f

:)

scheuk commented 6 years ago

@mauilion duh! But that's awesome you have a kubeadm container :)

scheuk commented 6 years ago

I just updated my ansible to execute: docker run --rm -v /etc/kubernetes/admin.conf:/kubeconfig quay.io/mauilion/kubeadm:v1.11.3 kubeadm token create --print-join-command --kubeconfig=/kubeconfig 2>/dev/null on the first master. This lets the container use the docker networking instead of host networking, which masks my network setup.
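
For automation like this, it can help to filter the captured output down to the join line before handing it to the worker nodes. The following is only a sketch; `extract_join_command` is a hypothetical helper name, and it assumes the join command is printed on a single line beginning with "kubeadm join", as in the output shown earlier in this thread.

```shell
#!/bin/sh
# Sketch: keep only the "kubeadm join ..." line from captured command output.
# Reads text on stdin and prints the first matching line without leading whitespace.
extract_join_command() {
    sed -n 's/^[[:space:]]*\(kubeadm join .*\)$/\1/p' | head -n 1
}

# Example (not run here), using the container invocation from this thread:
#   docker run --rm -v /etc/kubernetes/admin.conf:/kubeconfig \
#       quay.io/mauilion/kubeadm:v1.11.3 \
#       kubeadm token create --print-join-command --kubeconfig=/kubeconfig 2>/dev/null \
#       | extract_join_command
```

If nothing matches, the function prints nothing, so the caller can treat an empty result as a failure.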

mauilion commented 6 years ago

nice

mauilion commented 6 years ago

Are you unblocked, @scheuk?

scheuk commented 6 years ago

Yes I am unblocked now! Thanks for all the help!

kad commented 6 years ago

@bart0sh @kad - Do you have any network setups that are similar to this?

We don't have anything exactly like that, but I think we can simulate a simpler setup that is close to it.

kad commented 6 years ago

@scheuk can you show examples of what you have in your ipv6 routing table and what type of ipv4/ipv6 addresses you have on your em* interfaces? (No need to show real IPs, you can obfuscate them. I would just like to see what kind of unicast addresses might be present in your setup besides link-local addresses.)

scheuk commented 6 years ago

@kad our network team hasn't quite moved to ipv6 yet, but they have bought into cumulus linux routing on the host using the ipv6 link-local addresses and the RFC mentioned above.
Here is a link to how it works: https://docs.cumulusnetworks.com/display/ROH/Routing+on+the+Host under the BGP and OSPF Unnumbered Interfaces section.

Attached here is the output of my ipv6 routing table and interfaces:

# ip -6 route
unreachable ::/96 dev lo metric 1024 error -113 pref medium
unreachable ::ffff:0.0.0.0/96 dev lo metric 1024 error -113 pref medium
unreachable 2002:a00::/24 dev lo metric 1024 error -113 pref medium
unreachable 2002:7f00::/24 dev lo metric 1024 error -113 pref medium
unreachable 2002:a9fe::/32 dev lo metric 1024 error -113 pref medium
unreachable 2002:ac10::/28 dev lo metric 1024 error -113 pref medium
unreachable 2002:c0a8::/32 dev lo metric 1024 error -113 pref medium
unreachable 2002:e000::/19 dev lo metric 1024 error -113 pref medium
unreachable 3ffe:ffff::/32 dev lo metric 1024 error -113 pref medium
fe80::/64 dev fabric0 proto kernel metric 256 mtu 9000 pref medium
fe80::/64 dev em1 proto kernel metric 256 pref medium
fe80::/64 dev em2 proto kernel metric 256 pref medium
fe80::/64 dev em3 proto kernel metric 256 pref medium
fe80::/64 dev docker0 proto kernel metric 256 pref medium

# ip addr show em1
4: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 24:6e:96:5f:7b:48 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::266e:96ff:fe5f:7b48/64 scope link
       valid_lft forever preferred_lft forever

# ip addr show em2
5: em2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 24:6e:96:5f:7b:4a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::266e:96ff:fe5f:7b4a/64 scope link
       valid_lft forever preferred_lft forever

kad commented 6 years ago

@kad our network team hasn't quite moved to ipv6 yet, but they have bought into cumulus linux routing on the host using the ipv6 link-local addresses and the RFC mentioned above. Here is a link to how it works: https://docs.cumulusnetworks.com/display/ROH/Routing+on+the+Host under the BGP and OSPF Unnumbered Interfaces section.

@scheuk thanks for the clarifications. So, to summarize the whole picture (to see if I understand your setup completely):

- your host has one (or more?) /32 ipv4 addresses on the lo interface.
- you don't have unicast ipv6 /128 addresses on lo (the way you do for ipv4).
- network interfaces use link-locals for both ipv4/ipv6.
- ospf/bgp is used to announce real routes to the host, and the host's /32 or /128 addresses are announced back to the TORs.

Is that correct?

If the above is correct, can you share one more output, that of ip ro get 8.8.8.8? (Instead of 8.8.8.8 you can use any unicast IP. I'm trying to understand what the kernel will use as the outgoing source address for the default route and for routes that you got over ospf/bgp, outside of your cluster IP range.)

kad commented 6 years ago

@scheuk you can try the patch from #69578 to see if it works in your setup. If you need some help, I can provide a built binary with that patch applied.

scheuk commented 6 years ago

@kad your understanding of our setup is correct. BGP does announce other routes as well, it's configured to pick up local blackhole routes and announce them, but that is for POD connectivity vs host connectivity.

Here's the output of ip ro get on the host:

# ip ro get 8.8.8.8
8.8.8.8 via 169.254.0.1 dev em2 src 10.101.228.11
    cache
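
The source address the kernel reports here is the same address that worked as advertiseAddress earlier in the thread. As a sketch, it could be extracted from this output with a small helper; `route_src` is a hypothetical name, and the parsing assumes the `src` keyword appears in the output as it does above.

```shell
#!/bin/sh
# Sketch: print the address that follows "src" in `ip route get` output.
# Reads the command output on stdin and stops at the first match.
route_src() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Example (not run here):
#   ip route get 8.8.8.8 | route_src
```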

I'll attempt to test the patch from #69578 as well and let you know

scheuk commented 6 years ago

@kad can you send me a binary? It might take less time than me setting up go and figuring out how to apply a patch :)

kad commented 6 years ago

@kad can you send me a binary? It might take less time than me setting up go and figuring out how to apply a patch :)

Try http://orava.kad.name/kubeadm/kubeadm-69578. This kubeadm is built from the master branch, but it should at least be OK for trying in your setup.

scheuk commented 6 years ago

@kad looking good:

# ./kubeadm-69578 config images pull -v 10
I1009 21:53:05.336396   47234 interface.go:384] Looking for default routes with IPv4 addresses
I1009 21:53:05.336485   47234 interface.go:389] Default route transits interface "em1"
I1009 21:53:05.337591   47234 interface.go:196] Interface em1 is up
I1009 21:53:05.337687   47234 interface.go:244] Interface "em1" has 1 addresses :[fe80::266e:96ff:fe5f:7b48/64].
I1009 21:53:05.337721   47234 interface.go:211] Checking addr  fe80::266e:96ff:fe5f:7b48/64.
I1009 21:53:05.337742   47234 interface.go:224] fe80::266e:96ff:fe5f:7b48 is not an IPv4 address
I1009 21:53:05.337768   47234 interface.go:398] Default route exists for IPv4, but interface "em1" does not have unicast addresses. Checking loopback interface
I1009 21:53:05.338779   47234 interface.go:196] Interface lo is up
I1009 21:53:05.338884   47234 interface.go:244] Interface "lo" has 4 addresses :[127.0.0.1/8 10.101.228.11/32 192.0.2.1/24 ::1/128].
I1009 21:53:05.338918   47234 interface.go:211] Checking addr  127.0.0.1/8.
I1009 21:53:05.338958   47234 interface.go:221] Non-global unicast address found 127.0.0.1
I1009 21:53:05.338977   47234 interface.go:211] Checking addr  10.101.228.11/32.
I1009 21:53:05.338995   47234 interface.go:218] IP found 10.101.228.11
I1009 21:53:05.339025   47234 interface.go:250] Found valid IPv4 address 10.101.228.11 for interface "lo".
I1009 21:53:05.339044   47234 interface.go:404] Found active IP 10.101.228.11 on loopback interface
I1009 21:53:05.339186   47234 version.go:156] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.txt
I1009 21:53:05.687024   47234 feature_gate.go:206] feature gates: &{map[]}
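
The fallback order visible in this log (first the default-route interface, then loopback) can be approximated in shell for diagnosing a host. This is only a sketch of the order the log shows, not kubeadm's Go implementation; `pick_ipv4` is a hypothetical helper, and it assumes the one-line format printed by `ip -o -4 addr show`.

```shell
#!/bin/sh
# Sketch of the fallback order seen in the log; this is not kubeadm's Go code.
# Reads `ip -o -4 addr show` output on stdin; $1 is the default-route interface.
# Prints the first global-scope IPv4 address on that interface, or else the
# first global-scope IPv4 address on lo. Exits non-zero if neither exists.
pick_ipv4() {
    awk -v dev="$1" '
        # Field 2 is the interface, field 4 is ADDR/PREFIX; only global-scope addresses count.
        $3 == "inet" && / scope global / {
            split($4, parts, "/")
            if ($2 == dev && primary == "") primary = parts[1]
            if ($2 == "lo" && loopback == "") loopback = parts[1]
        }
        END {
            if (primary != "") print primary
            else if (loopback != "") print loopback
            else exit 1
        }
    '
}

# Example (not run here):
#   ip -o -4 addr show | pick_ipv4 em1
```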

kad commented 6 years ago

good. so, please comment on PR :)

timothysc commented 6 years ago

/cc @rdodev - regarding cli-arg issue(s).

neolit123 commented 5 years ago

related PR for this is in flight by @kad but reviews are pending: https://github.com/kubernetes/kubernetes/pull/69578

timothysc commented 5 years ago

/assign @rdodev

Let's chat in the morning on this one.

rdodev commented 5 years ago

# kubeadm token create --print-join-command --config /etc/kubernetes/kubeadm.conf
can not mix '--config' with arguments [print-join-command]

In terms of cli this has already been taken care of @timothysc

https://github.com/kubernetes-csi/driver-registrar/blob/87d0059110a8b4a90a6d2b5a8702dd7f3f270b80/vendor/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/validation/validation.go#L375

timothysc commented 5 years ago

gr8!

dancarneiro commented 3 years ago

Just put the master's IP in place of $(hostname -i). For example: kubeadm init --apiserver-advertise-address 192.168.1.2

rijuchatterjee commented 1 year ago

kubeadm init --apiserver-advertise-address does not work for me, and even kubeadm join does not work. Has anyone been able to join a cluster using --apiserver-advertise-address?

asher-lab commented 1 year ago

I fixed mine by not using 127.0.x.x for --apiserver-advertise-address and --apiserver-cert-extra-sans=$IPADDR.

Try using 10.0.0.10 instead.

Example:

IPADDR="10.0.0.10"
NODENAME=$(hostname -s)
POD_CIDR="192.168.0.0/16"

sudo kubeadm init --apiserver-advertise-address=$IPADDR  --apiserver-cert-extra-sans=$IPADDR  --pod-network-cidr=$POD_CIDR --node-name $NODENAME --ignore-preflight-errors Swap
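
A small guard can catch the loopback case before init runs, for example when the address comes from `hostname -i`, which on some distributions resolves to a 127.x.x.x entry in /etc/hosts. This is a sketch; `is_loopback_ipv4` is a hypothetical helper, and it only checks the textual 127. prefix.

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) when the given IPv4 address is in 127.0.0.0/8.
is_loopback_ipv4() {
    case $1 in
        127.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (not run here):
#   if is_loopback_ipv4 "$IPADDR"; then
#       echo "refusing to advertise loopback address $IPADDR" >&2
#       exit 1
#   fi
```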
