canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

cloud-init doesn't start when no data source is present #3772

Open ubuntu-server-builder opened 1 year ago

ubuntu-server-builder commented 1 year ago

This bug was originally filed in Launchpad as LP: #1892171

Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = None
date_created = 2020-08-19T10:16:42.752125+00:00
date_fix_committed = None
date_fix_released = None
id = 1892171
importance = wishlist
is_complete = False
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1892171
milestone = None
owner = hsunda3
owner_name = Hari Sundararajan
private = False
status = triaged
submitter = hsunda3
submitter_name = Hari Sundararajan
tags = []
duplicates = []

Launchpad user Hari Sundararajan(hsunda3) wrote on 2020-08-19T10:16:42.752125+00:00

I am using KVM (via libvirt)

  1. I take bionic-server-cloudimg-amd64.img from cloud-images.ubuntu.com

  2. I create a simple user-data file

    #cloud-config
    users:

  3. I create the seed ISO.

    cloud-localds --hostname TEMPLATE --verbose seed.iso user-data

  4. I boot the VM. Within the VM I run some provisioning ...

    cloud-init status --wait

    # Some provisioning here
    apt-get install some devel libraries
    truncate -s 0 /etc/machine-id
    rm -rf /etc/netplan/
    rm -rf /etc/ssh/sshhost
    cloud-init clean --seed --logs

  5. I shut down the VM. This, in theory, should become my template from which I should be able to clone new VMs, correct?

However, when I clone and boot it, cloud-init does not run on the clones at all.

In /run/cloud-init/cloud.cfg I see

    di_report:
      datasource_list: [ ]
      # reporting not found result. notfound=disabled


In /run/cloud-init/cloud-init-generator.log I see (among other things)

    ds-identify rc=1
    ds-identify_RET=notfound
    cloud-init is enabled but no datasource found. disabling

Finally, in /run/cloud-init/ds-identify.log I see (among other things)

    DSNAME=
    DSLIST=NoCloud ConfigDrive OpenNebula DigitalOcean Azure AltCloud OVF MAAS GCE OpenStack CloudSigma SmartOS Bigstep Scaleway AliYun Ec2 CloudStack Hetzner IBMCloud Oracle Exoscale None
    Mode=search
    is_container=false
    is_ds_enabled(IBMCloud) = true
    ec2 platform is 'Unknown'
    No ds found [mode=search,notfound=disabled]. Disabled cloud-init [1]

DSLIST in the above matches what is in /etc/cloud/cloud.cfg.d/90_dpkg.cfg but it is not recognized at all.
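For anyone reproducing this, the decision lines can be pulled out of the log with a small helper. This is only a minimal sketch: the function name `summarize_ds_identify` is hypothetical, the default path is the one quoted above, and the patterns are based solely on the excerpt in this report, so other cloud-init versions may log differently.

```shell
#!/bin/sh
# Print the lines of a ds-identify log that explain the final decision.
# $1: path to the log (defaults to the location quoted in this report).
summarize_ds_identify() {
    log="${1:-/run/cloud-init/ds-identify.log}"
    # Match the datasource name/list, the search mode, and the final verdict.
    grep -iE '^(dsname|dslist|mode)=|no ds found' "$log"
}
```

Calling `summarize_ds_identify` with no argument on an affected clone should show the `DSLIST=` line next to the `No ds found` verdict, which makes it easier to compare against `90_dpkg.cfg`.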

However, as mentioned in https://bugs.launchpad.net/cloud-init/+bug/1876375 , if I do this instead


echo 'datasource_list: [ NoCloud, None ]' > /etc/cloud/cloud.cfg.d/90_dpkg.cfg

Then everything works just as I want. The clones boot properly, there is no need to attach a seed ISO, and we are able to log in.

Why is it that when 90_dpkg.cfg contains [ NoCloud, None ] the cloning process works and cloud-init starts, but when 90_dpkg.cfg has a long list of entries (all of which are irrelevant), it chooses not to start at all?

(NOTE: This is on Ubuntu 18.04, /usr/bin/cloud-init 20.2-45-g5f7825e2-0ubuntu1~18.04.1 from some daily build)

ubuntu-server-builder commented 1 year ago

Launchpad user Ryan Harper(raharper) wrote on 2020-08-19T17:19:29.179217+00:00

Hi Hari,

Thanks for filing a bug.

> I boot the VM. Within the VM I run some provisioning ...

Can you provide more details on how you're booting the VM? I ask because cloud-init is designed to not run unless it detects a datasource. The NoCloud datasource is detected in a few ways:

1) filesystem label found with 'cidata' or 'CIDATA'
2) DMI Product Serial includes ds=nocloud
3) /var/lib/cloud/seed/nocloud- directory exists
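The three checks listed above can be approximated by hand when debugging an image. The following is a rough sketch, not what ds-identify itself runs: the function names are hypothetical, the DMI path `/sys/class/dmi/id/product_serial` is the usual Linux sysfs location, and matching any directory named `nocloud*` under the seed root is an assumption, since the path in the list above is cut off.

```shell
#!/bin/sh
# Hand-rolled approximations of the three NoCloud detection checks.

# 1) A filesystem labelled cidata or CIDATA (blkid is part of util-linux).
has_cidata_label() {
    blkid -L cidata >/dev/null 2>&1 || blkid -L CIDATA >/dev/null 2>&1
}

# 2) DMI product serial containing ds=nocloud.
# $1: file holding the serial (defaults to the sysfs path).
serial_has_nocloud() {
    serial_file="${1:-/sys/class/dmi/id/product_serial}"
    grep -q 'ds=nocloud' "$serial_file" 2>/dev/null
}

# 3) A NoCloud seed directory under the seed root.
# $1: seed root (defaults to /var/lib/cloud/seed).
has_seed_dir() {
    seed_root="${1:-/var/lib/cloud/seed}"
    for d in "$seed_root"/nocloud*; do
        [ -d "$d" ] && return 0
    done
    return 1
}
```

Each function returns 0 when its condition holds, so `has_cidata_label || serial_has_nocloud || has_seed_dir` gives a quick answer to whether any of the three triggers is present. Reading the DMI serial and probing block devices usually needs root.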

> cloud-init clean --seed --logs

This operation will remove any seeds from /var/lib/cloud/seed/* in your image. If this was how you were telling cloud-init to run, you've now removed it, and future boots of this image will not run cloud-init because you've not provided a datasource that will activate it.

> Why is it that when 90_dpkg.cfg contains [ NoCloud, None] the cloning process works and cloud-init starts, but when 90_dpkg.cfg has a long list of entries (all of which are irrelevant) , it chooses to not even start?

cloud-init reads /etc/cloud/cloud.cfg and /etc/cloud/cloud.cfg.d/*.cfg. If something in those config files sets the datasource_list to a single datasource (like you did), then cloud-init assumes that someone has configured a specific datasource and will always activate.
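The rule described above can be illustrated with a few lines of shell. This is a sketch of the logic only, with a hypothetical function name. Treating a single datasource followed by `None` the same as a single datasource is an assumption that matches the behaviour reported in this thread (`[ NoCloud, None ]` activates, a longer list does not).

```shell
#!/bin/sh
# Return 0 when a datasource list looks like an explicit, single choice.
# $1: space-separated datasource names, e.g. "NoCloud None".
is_explicit_choice() {
    # Split the list into positional parameters so $# is the entry count.
    set -- $1
    if [ $# -eq 1 ]; then
        return 0
    fi
    # Assumption: one datasource plus the None fallback also counts.
    if [ $# -eq 2 ] && [ "$2" = "None" ]; then
        return 0
    fi
    return 1
}
```

With this sketch, `is_explicit_choice "NoCloud None"` succeeds, while `is_explicit_choice "NoCloud OpenStack None"` fails, so the longer list falls back to normal detection and cloud-init stays disabled when nothing is found.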

The goal for cloud-init is to allow an image to be re-used on any number of platforms. If you've customized an image using NoCloud and then boot that image on Ec2 or Azure, it should work there (but will use the correct platform datasource rather than NoCloud).

I'm going to mark this bug invalid, as it appears that cloud-init is working as designed, but please change it back to New if you believe that it is not.

ubuntu-server-builder commented 1 year ago

Launchpad user Hari Sundararajan(hsunda3) wrote on 2020-08-20T05:43:32.430572+00:00

> Can you provide more details on how you're booting the VM? I ask because cloud-init is designed to not run unless it detects a datasource

I create an image that is expected to boot in 3 locations.

Location A: OpenStack
Location B: An environment with CIDATA (NoCloud)
Location C: An environment where there is no data source whatsoever. No kernel command line references to cloud-init, no CIDATA filesystem, no SMBIOS modification... nothing.

In all 3 cases, I want cloud-init to activate, because I want some cloud-init functionality (ssh key generation, DHCP request on first network interface, file system increase / growpart and so on).

Obviously, in the OpenStack environment (Location A) I want it to honor the OpenStack data source. In Location B, I want cloud-init to honor CIDATA. In Location C, I want it to get triggered and run its functionality, but without a data source.

Is there a way to achieve this? I thought baking /etc/cloud/cloud.cfg.d/90_dpkg.cfg with

datasource_list: [ NoCloud, OpenStack, None ]

would do this. It doesn't. Is that expected behavior?

ubuntu-server-builder commented 1 year ago

Launchpad user Ryan Harper(raharper) wrote on 2020-08-20T14:38:51.299409+00:00

> I create an image that is expected to boot in 3 locations. ... Is there a way to achieve this?

Yes. You may want to add ConfigDrive to your ds list (OpenStack may use ConfigDrive as well).

And if you always want cloud-init to run then you create:

/etc/cloud/ds-identify.cfg with content

policy: enabled

Which will enable cloud-init always. Your changes to datasource_list will tell cloud-init to only look for those specific datasources.
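When baking this into a template, the file can be written with a short helper. This is a minimal sketch under stated assumptions: the function name and its optional root-prefix argument are hypothetical conveniences for targeting a mounted image, and only the path and the `policy: enabled` content come from the comment above. Note that a later comment in this thread describes ds-identify.cfg as undocumented and not a stable configuration format.

```shell
#!/bin/sh
# Write the ds-identify policy override described above.
# $1: optional root prefix (e.g. a mounted image); empty means the live system.
enable_cloud_init_always() {
    root="${1:-}"
    mkdir -p "$root/etc/cloud"
    # Single-line policy that tells ds-identify to always enable cloud-init.
    printf 'policy: enabled\n' > "$root/etc/cloud/ds-identify.cfg"
}
```

For example, `enable_cloud_init_always /mnt/image` would write `/mnt/image/etc/cloud/ds-identify.cfg`; running it without an argument writes to the live system and needs root.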

ubuntu-server-builder commented 1 year ago

Launchpad user Scott Moser(smoser) wrote on 2020-08-20T16:24:24.147228+00:00

ds-identify.cfg is undocumented by design.

I think that the None datasource was really a bad idea. Thanks to ds-identify, it doesn't come into play except in error cases now.

If we needed to support something like what Hari is after, I'd suggest a new datasource that didn't spew warnings about "something must have gone wrong". And I would not enable it by default in cloud-init. Then, if enabled explicitly, it could be last in the order.

Even then, though, it would be difficult, because in order to generate networking configuration ("dhcp on eth0") it would need to run at the local stage. I think to accomplish it, we'd have to have all datasources moved to local.

ubuntu-server-builder commented 1 year ago

Launchpad user Hari Sundararajan(hsunda3) wrote on 2020-08-20T16:35:57.325810+00:00

Perfect, thank you for the explanation. Apologies for the churn, my understanding of the documentation was wrong.

policy: enabled indeed lets cloud-init run everywhere. In my environment with no data source it spins for a while trying to access AWS endpoints, but it still runs, so that satisfies my requirements.

Thank you!

ubuntu-server-builder commented 1 year ago

Launchpad user Paride Legovini(paride) wrote on 2020-08-21T10:44:52.724343+00:00

Thanks Hari, Scott and Ryan. I'm happy that Hari has a working solution, but the discussion above has useful pointers on things we can improve (see Scott's last comment), and I think the documentation can be improved too. I'm marking this bug as Triaged with low importance, so hopefully we won't lose track of it.

ubuntu-server-builder commented 1 year ago

Launchpad user Scott Moser(smoser) wrote on 2020-08-21T13:50:54+00:00

I'd just like to explicitly say that documenting ds-identify.cfg is not desirable. ds-identify does not have stable and user-modifiable configuration.


** Changed in: cloud-init Status: Incomplete => Triaged

** Changed in: cloud-init Importance: Undecided => Wishlist


ubuntu-server-builder commented 1 year ago

Launchpad user Hari Sundararajan(hsunda3) wrote on 2020-08-21T14:32:01.709826+00:00

> I'd suggest a new datasource that didn't spew warnings about "something must have gone wrong". And I would not enable it by default in cloud-init. Then, if enabled explicitly it could be last in the order.

To be honest, this was my original thought process. I thought that by including "None" in /etc/cloud/cloud.cfg.d/90_dpkg.cfg, the "None" data source gets "enabled" and runs upon detecting nothing else. Also, https://cloudinit.readthedocs.io/en/latest/topics/datasources/fallback.html says "This is the fallback datasource when no other datasource can be selected".

I am not familiar with the inner workings, so I cannot comment on the challenges involved with this (or address your comments like "having to have all datasources moved to local"), but given all the useful things cloud-init does, I would definitely appreciate a data source that acts as though an empty string was given as user data in the absence of anything