Open rjschwei opened 7 months ago
I think the issue occurs here https://github.com/canonical/cloud-init/blob/main/cloudinit/sources/DataSourceOpenStack.py#L159 but the call to log_time is inside a try-except block, and the except clause is supposed to handle InvalidMetaDataException, which is raised by _crawl_metadata with the message "No active metadata service found", which is in the log. So I do not understand why we would still end up with the traceback.
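To make the expectation concrete, here is a minimal sketch of the pattern I would expect to hold. Everything in it is a stand-in written for illustration: the log_time helper, crawl_metadata, and get_data below are hypothetical simplifications and not the actual cloud-init code, and the exception class is redefined locally so the snippet runs on its own.

```python
import logging

LOG = logging.getLogger("sketch")


class InvalidMetaDataException(Exception):
    """Local stand-in for the exception type named above."""


def log_time(func):
    # Hypothetical stand-in for a timing wrapper: it only calls func,
    # so anything func raises propagates to the caller unchanged.
    return func()


def crawl_metadata():
    # Simulates a crawl that finds no metadata service.
    raise InvalidMetaDataException("No active metadata service found")


def get_data():
    try:
        log_time(crawl_metadata)
    except InvalidMetaDataException as e:
        # Expected path: record the message and report "no data"
        # without letting the exception reach the top level.
        LOG.debug("%s", e)
        return False
    return True
```

With this shape, get_data() returns False and nothing propagates. If a traceback still shows up in practice, one possibility is that something raises or logs the exception outside the try block, but that is a guess and not something the sketch demonstrates.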
Ugh, what a mess. This should probably not be the default behavior.
This then triggers the enablement of cloud-init services and as such the execution of the Python code.
Auto-enabling on all non-x86 is really not good. This bug is an example of why this permissive optimism was a bad default.
The only other users of DS_MAYBE in cloud-init, AltCloud and Ec2, only occur in much more limited environments: after positive match on a DMI value or when it is explicitly enabled by a configuration value, respectively.
The more that I think about this the more I think that it was a mistake to try to "just work" for openstack on other architectures without a positive signal.
I'm talking with some openstack folks in the meantime to try to get better openstack support for cloud-init on a few architectures, but I think we should consider making this non-default in an upcoming cloud-init release. Users who want to use cloud-init on non-x86 can always select openstack in cloud.cfg or on their kernel commandline. Breaking users on some arches just so that other users on those same arches don't have to set a configuration value seems like a poor tradeoff, especially when the tradeoff is caused by a shortcoming of the cloud. Perhaps we should just try to fix openstack instead of depending on broken hacks like this. I think we can get openstack to pass DMI data on a few more arches than it already does, or alternatively it could probably even set the datasource on the kernel commandline.
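Roughly, the difference between the current permissive behavior and requiring a positive signal could be sketched like this. It is a Python illustration of the policy only, not the ds-identify shell code: the function name, the permissive flag, the result strings, and the DMI product names in the set are all assumptions made for the example.

```python
DS_FOUND = "found"
DS_MAYBE = "maybe"
DS_NOT_FOUND = "not found"

# Assumption: illustrative product names only; the authoritative
# list is whatever ds-identify actually checks.
OPENSTACK_DMI_NAMES = {"OpenStack Nova", "OpenStack Compute"}


def check_openstack(arch, dmi_product_name, explicitly_enabled, permissive):
    """Return a detection result for a hypothetical OpenStack check."""
    if explicitly_enabled:
        # The user selected the datasource in config or on the kernel
        # commandline, which counts as a positive signal.
        return DS_FOUND
    if dmi_product_name in OPENSTACK_DMI_NAMES:
        # Positive match on a DMI value.
        return DS_FOUND
    if permissive and arch not in ("x86_64", "i686"):
        # Permissive optimism: no signal at all, but guess "maybe"
        # on non-x86 architectures.
        return DS_MAYBE
    return DS_NOT_FOUND
```

With permissive=True an aarch64 VM with no DMI match gets "maybe" and the services are enabled; with permissive=False the same VM gets "not found" unless the user opts in explicitly.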
Related bug report:
Bug report
The ds-identify script guesses that we may be in an OpenStack environment on non-x86_64 architectures, in this case aarch64 [1]. This then triggers the enablement of cloud-init services and as such the execution of the Python code. When no data source is found by the OpenStack data source implementation, an exception trickles to the top, causing a traceback.
and
The exception should be handled and no traceback should be generated.
[1] https://github.com/canonical/cloud-init/blob/main/tools/ds-identify#L1370
Steps to reproduce the problem
Run a VM on aarch64 with cloud-init default config and no config drive.
Environment details
cloud-init logs
cloud-init.tar.gz