rancher / os

Tiny Linux distro that runs the entire OS as Docker containers
https://rancher.com/docs/os/v1.x/en/
Apache License 2.0

Can't fetch cloud-config file #3075

Open tanshaolong opened 2 years ago

tanshaolong commented 2 years ago

RancherOS Version: (ros os version)
DISTRIB_ID=RancherOS
DISTRIB_RELEASE=v1.5.4
DISTRIB_DESCRIPTION="RancherOS v1.5.4"

Where are you running RancherOS? (docker-machine, AWS, GCE, baremetal, etc.)
baremetal

Hi everybody:

I booted RancherOS, but it failed to fetch the cloud-config file. After logging in to RancherOS, however, I can download the same file with wget. Does anyone know how to resolve this? Thank you. I also checked the RancherOS DNS settings: they are set to 8.8.8.8 and 8.8.4.4. On this bare-metal network, those DNS servers are not reachable. Could that be the root cause of the issue?

Below is part of cloud-init-save.log. If you need more detail, please let me know.

'''
time="2022-04-01T08:21:10Z" level=debug msg="runCmds(on ): []"
time="2022-04-01T08:21:10Z" level=debug msg="applyOuter(false, false), link: eth0"
time="2022-04-01T08:21:10Z" level=debug msg="Config(eth0): netconf.InterfaceConfig{Match:"eth0", DHCP:true, DHCPArgs:"", Address:"", Addresses:[]string(nil), IPV4LL:false, Gateway:"", GatewayIpv6:"", MTU:0, Bridge:"", Bond:"", BondOpts:map[string]string(nil), PostUp:[]string(nil), PreUp:[]string(nil), Vlans:"", WifiNetwork:""}"
time="2022-04-01T08:21:10Z" level=debug msg="runCmds(on eth0): []"
time="2022-04-01T08:21:10Z" level=debug msg="runCmds(on eth0): []"
time="2022-04-01T08:21:10Z" level=debug msg="applyOuter(false, false), link: eth1"
time="2022-04-01T08:21:10Z" level=debug msg="applyOuter(false, false), link: eth2"
time="2022-04-01T08:21:10Z" level=debug msg="applyOuter(false, false), link: eth3"
time="2022-04-01T08:21:10Z" level=info msg="Running DHCP on eth0: dhcpcd -MA4 -e force_hostname=true --timeout 10 -w --debug eth0"
time="2022-04-01T08:21:20Z" level=info msg="Checking to see if DNS was set by DHCP"
time="2022-04-01T08:21:20Z" level=info msg="dns testing eth0"
time="2022-04-01T08:21:20Z" level=debug msg="Running cmd: [dhcpcd -MA4 -U eth0], output: "
time="2022-04-01T08:21:20Z" level=warning msg="Failed to run cmd: [dhcpcd -MA4 -U eth0], error: exit status 1"
time="2022-04-01T08:21:20Z" level=debug msg="getDhcpLease eth0: "
time="2022-04-01T08:21:20Z" level=debug msg="line: []"
time="2022-04-01T08:21:20Z" level=info msg="dns testing eth1"
time="2022-04-01T08:21:20Z" level=debug msg="Running cmd: [dhcpcd -MA4 -U eth1], output: "
time="2022-04-01T08:21:20Z" level=warning msg="Failed to run cmd: [dhcpcd -MA4 -U eth1], error: exit status 1"
time="2022-04-01T08:21:20Z" level=debug msg="getDhcpLease eth1: "
time="2022-04-01T08:21:20Z" level=debug msg="line: []"
time="2022-04-01T08:21:20Z" level=info msg="dns testing eth2"
time="2022-04-01T08:21:20Z" level=debug msg="Running cmd: [dhcpcd -MA4 -U eth2], output: "
time="2022-04-01T08:21:20Z" level=warning msg="Failed to run cmd: [dhcpcd -MA4 -U eth2], error: exit status 1"
time="2022-04-01T08:21:20Z" level=debug msg="getDhcpLease eth2: "
time="2022-04-01T08:21:20Z" level=debug msg="line: []"
time="2022-04-01T08:21:20Z" level=info msg="dns testing eth3"
time="2022-04-01T08:21:20Z" level=debug msg="Running cmd: [dhcpcd -MA4 -U eth3], output: "
time="2022-04-01T08:21:20Z" level=warning msg="Failed to run cmd: [dhcpcd -MA4 -U eth3], error: exit status 1"
time="2022-04-01T08:21:20Z" level=debug msg="getDhcpLease eth3: "
time="2022-04-01T08:21:20Z" level=debug msg="line: []"
time="2022-04-01T08:21:20Z" level=debug msg="runCmds(on ): []"
time="2022-04-01T08:21:20Z" level=info msg="Writing default resolv.conf - no user setting, and no DHCP setting"
time="2022-04-01T08:21:20Z" level=debug msg="Resolve.conf == [nameserver 8.8.8.8 nameserver 8.8.4.4 ], "
time="2022-04-01T08:21:20Z" level=info msg="Apply Network Config SyncHostname"
time="2022-04-01T08:21:20Z" level=info msg="datasources that will be considered: []string{"url:http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af"}"
time="2022-04-01T08:21:20Z" level=info msg="cloud-init: Checking availability of "url""
time="2022-04-01T08:21:20Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #1"
time="2022-04-01T08:21:20Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:20Z" level=debug msg="Sleeping for 100ms..."
time="2022-04-01T08:21:20Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #2"
time="2022-04-01T08:21:20Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:20Z" level=debug msg="Sleeping for 200ms..."
time="2022-04-01T08:21:20Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #3"
time="2022-04-01T08:21:20Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:20Z" level=debug msg="Sleeping for 400ms..."
time="2022-04-01T08:21:21Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #4"
time="2022-04-01T08:21:21Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:21Z" level=debug msg="Sleeping for 800ms..."
time="2022-04-01T08:21:22Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #5"
time="2022-04-01T08:21:22Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:22Z" level=debug msg="Sleeping for 1.6s..."
time="2022-04-01T08:21:23Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #6"
time="2022-04-01T08:21:23Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:23Z" level=debug msg="Sleeping for 3.2s..."
time="2022-04-01T08:21:26Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #7"
time="2022-04-01T08:21:26Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:26Z" level=debug msg="Sleeping for 5s..."
time="2022-04-01T08:21:31Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #8"
time="2022-04-01T08:21:31Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:31Z" level=debug msg="Sleeping for 5s..."
time="2022-04-01T08:21:36Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #9"
time="2022-04-01T08:21:36Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:36Z" level=debug msg="Sleeping for 5s..."
time="2022-04-01T08:21:41Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #10"
time="2022-04-01T08:21:41Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:41Z" level=debug msg="Sleeping for 5s..."
time="2022-04-01T08:21:46Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #11"
time="2022-04-01T08:21:46Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:46Z" level=debug msg="Sleeping for 5s..."
time="2022-04-01T08:21:51Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #12"
time="2022-04-01T08:21:51Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:51Z" level=debug msg="Sleeping for 5s..."
time="2022-04-01T08:21:56Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #13"
time="2022-04-01T08:21:56Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:21:56Z" level=debug msg="Sleeping for 5s..."
time="2022-04-01T08:22:01Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #14"
time="2022-04-01T08:22:01Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:22:01Z" level=debug msg="Sleeping for 5s..."
time="2022-04-01T08:22:06Z" level=debug msg="Fetching data from http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af. Attempt #15"
time="2022-04-01T08:22:06Z" level=debug msg="Unable to fetch data: Get http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af: dial tcp 10.10.105.97:9030: connect: network is unreachable"
time="2022-04-01T08:22:06Z" level=debug msg="Sleeping for 5s..."
time="2022-04-01T08:22:11Z" level=info msg="cloud-init: Datasource unavailable, skipping: url: http://10.10.105.97:9030/api/current/templates/cloud-config.yaml?nodeId=6246b4c687617faa219d56af (lastError: Unable to fetch data. Maximum retries reached: 15)"
'''

tanshaolong commented 2 years ago

Adding the iPXE script content:

kernel <%=kernelUri%>
initrd <%=initrdUri%>
imgargs <%=kernelFile%> initrd=<%=initrdFile%> rancher.network.interfaces.eth0.dhcp=true console=tty0 netconsole=+@/,514@<%=server%>/ rancher.password=monorail rancher.cloud_init.datasources=['url:http://<%=server%>:<%=port%>/api/current/templates/cloud-config.yaml?nodeId=<%=nodeId%>'] rancher.debug=true
boot || prompt --key 0x197e --timeout 2000 Press F12 to investigate || exit shell

tanshaolong commented 2 years ago

My guess is that the network is not yet stable when RancherOS fetches the cloud-config, so a longer wait might avoid the issue. Could you please suggest how to increase the number of retries? Thanks.