clearlinux / micro-config-drive

An alternative and small cloud-init implementation in C

Wait for active network #25

Closed ahkok closed 5 years ago

ahkok commented 6 years ago

packages and package_update require an active network connection before they can function. Either these keywords must pause execution of the cloud-config statements until a network is available, or the generic systemd unit needs Requires=/After= on network-online.target.
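For illustration, the second option (ordering the unit after network-online.target) could be sketched as a systemd drop-in written from a shell helper. The unit name ucd.service, the drop-in file name, and the helper name write_network_dropin are assumptions, not taken from the project. The sketch uses Wants= plus After=; Requires= would be the stricter variant mentioned above.

```shell
#!/bin/sh
# Sketch only: write a systemd drop-in that orders a unit after
# network-online.target. The unit name passed in (e.g. "ucd.service")
# and the drop-in file name are assumptions for illustration.
write_network_dropin() {
    unit=$1                          # unit to order, e.g. ucd.service (assumed)
    root=${2:-/etc/systemd/system}   # base directory that holds drop-ins
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    # Wants= pulls the target in, After= delays the unit until it is reached.
    cat > "$dir/10-wait-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}
```

Calling write_network_dropin ucd.service followed by systemctl daemon-reload would apply it. Note that network-online.target only delays anything when a wait-online service (for example systemd-networkd-wait-online.service or NetworkManager-wait-online.service) is enabled, and that this approach slows down every boot, not just the ones that install packages.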

obedmr commented 5 years ago

@ahkok this is an issue I ran into this morning. I basically need to install the python bundle so that my Ansible configuration management can get in. The thing is that the network is not ready when the user-data is run.

So my commands, swupd update ; swupd bundle-add python3, hang because there is no network. In the end they fail and the rest continues, but this held up the ssh service initialization, which means I couldn't log in until that swupd command finished.

ahkok commented 5 years ago

Can you post the exact user-data file? Ordering is important; you may be able to reorder things to avoid the wait for ssh to be active.

obedmr commented 5 years ago

My user-data is:

#!/bin/bash

swupd update
swupd bundle-add python3-basic

In other cloud images, it just works at the moment that cloud-init runs the user-data. Maybe ucd should make sure the network is ready before running the user-data.

ahkok commented 5 years ago

Missing ! in your user-data, it's therefore invalid.

obedmr commented 5 years ago

@ahkok it's a typo, I'm sending the right script.

ahkok commented 5 years ago

btw you shouldn't use this type of user-data. Preferably, use something like this:

#cloud-config
package_upgrade: yes
packages: python3

obedmr commented 5 years ago

@ahkok, ok thanks, that solves the current issue of installing python3 in my cloud image. But it doesn't fix the real issue: it will fail whenever your user-data requires something from the internet.

ahkok commented 5 years ago

no disagreement there.

obedmr commented 5 years ago

I just tested the #cloud-config approach and it's still failing with the same network issue.

Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335879] userdata: Shebang found #cloud-config
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335883] Parsing user data file /var/lib/cloud/userdata-3lscBf
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335923] Loaded handler for block "package_upgrade"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335928] Loaded handler for block "write_files"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335931] Loaded handler for block "packages"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335933] Loaded handler for block "groups"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335934] Loaded handler for block "users"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335936] Loaded handler for block "ssh_authorized_keys"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335938] Loaded handler for block "service"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335940] Loaded handler for block "hostname"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335942] Loaded handler for block "runcmd"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335944] Loaded handler for block "envar"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335946] Loaded handler for block "fbootcmd"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335949] Executing handler for block "package_upgrade"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335951] package_upgrade: System Software Update Handler running...
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335953] package_upgrade: Performing system software update.
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.335959] lib: Executing: /bin/sh -c "/usr/bin/swupd update"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 dbus-daemon[159]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.13' (uid=0 pid=220 comm="/usr/bin/hostnamectl set-hostname k8s-clear-pmem-m")
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 systemd[1]: Starting Hostname Service...
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.351749] lib: Command failed
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.351760] lib: STD Error: Curl error: (6) Couldn't resolve host name
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: Curl error: (6) Couldn't resolve host name
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: Updater failed to initialize, exiting now.
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.351763] lib: STD output:
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.351768] Executing handler for block "packages"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.351770] packages: Packages Handler running...
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.351773] packages: Installing python3..
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.351782] lib: Executing: /bin/sh -c "/usr/bin/swupd bundle-add python3"
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.360618] lib: Command failed
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.360632] lib: STD Error: Curl error: (6) Couldn't resolve host name
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: Curl error: (6) Couldn't resolve host name
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: Failed updater initialization, exiting now.
Dec 19 17:47:03 clr-488c062a95d44c2ba65401ad148b2f20 ucd[157]: [1.360635] lib: STD output:

I'm using this user-data:

#cloud-config
package_upgrade: yes
packages: python3

obedmr commented 5 years ago

I'm actually seeing that there's no python3 bundle, but it's the same if I use the python3-basic bundle name. It also fails earlier, at the first swupd update command.

ahkok commented 5 years ago

Obviously, yes. ucd is meant to execute as fast as possible, so we never envisioned waiting for the network. I think if this is a requirement, it'll have to be done through some sort of optional wait-for-network: yes setting that forces the wait.

ahkok commented 5 years ago

actually, packages and package_upgrade could enable this option of course. With proper docs...

ahkok commented 5 years ago

Now the question is how we reliably determine that the network is active, since we should attempt to make a neutral method that supports both NM and systemd-networkd.

We could probe DNS for download.clearlinux.org? This would be independent of any network config stack.
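The DNS probe idea can be sketched as a small polling loop in shell. This is only an illustration of the approach, not the code that went into ucd; the function name wait_for_dns and its attempt and delay parameters are made up for the example. It relies on getent hosts, which goes through the system resolver and so does not care whether NM or systemd-networkd brought the link up.

```shell
#!/bin/sh
# Sketch only: poll the system resolver until a host name resolves or
# the attempts run out. wait_for_dns and its parameters are hypothetical.
# Usage: wait_for_dns HOST [ATTEMPTS] [DELAY_SECONDS]
wait_for_dns() {
    host=$1
    attempts=${2:-30}   # how many lookups to try before giving up
    delay=${3:-1}       # seconds to sleep after each failed lookup
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # getent exits non-zero when the name cannot be resolved.
        if getent hosts "$host" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

A user-data script could then guard its network commands with something like wait_for_dns download.clearlinux.org 60 1 && swupd update. A successful lookup only shows that DNS answers, not that the download server is reachable, so the commands that follow can still fail.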

ahkok commented 5 years ago

The option will be named wait_for_network, since - isn't usable in yaml.

ahkok commented 5 years ago

I made a PoC yesterday, so this is in progress...

ahkok commented 5 years ago

v41 closes this - 4364c94..85747cb

obedmr commented 5 years ago

@ahkok thanks, @chuyd can you take a look?

ahkok commented 5 years ago

Found a bug in this. v42 may be needed.