Open ubuntu-server-builder opened 1 year ago
Launchpad user Brett Holman(holmanb) wrote on 2022-07-19T15:26:12.089830+00:00
Thanks for reporting! Unfortunately our Gentoo test coverage is not very comprehensive.
I think this was likely broken in 81299de5fe3b6e491a965a6ebef66c6b8bf2c037.
That commit removed _write_network_config(), which changed the behavior. Previously all distros implemented _write_network_config(), which in Gentoo raised NotImplementedError (the default behavior from the base class). This was then caught in cloudinit/distros/__init__.py:Distro.apply_network_config(), which caused the following call path:
apply_network_config() -> _apply_network_from_network_config() -> apply_network() -> which would have called Gentoo's networking code.
Currently _write_network_state() calls renderers.select(), which raises RendererNotFoundError and causes the error you mentioned.
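To make the difference concrete, here is a minimal toy model of the two behaviours described above. It is only a sketch: the class names OldStyleDistro and NewStyleDistro, the return strings, and the local RendererNotFoundError class are stand-ins invented for illustration, not cloud-init's real code.

```python
class RendererNotFoundError(RuntimeError):
    """Stand-in for the error raised when no renderer can be selected."""


class OldStyleDistro:
    """Toy model of the behaviour before the commit (hypothetical class)."""

    def _write_network_config(self, netconfig):
        # Base-class default: this distro has no renderer-based writer.
        raise NotImplementedError

    def apply_network(self, settings):
        # Stand-in for the distro's own (legacy) networking code.
        return "legacy:" + settings

    def _apply_network_from_network_config(self, netconfig):
        return self.apply_network(netconfig)

    def apply_network_config(self, netconfig):
        try:
            self._write_network_config(netconfig)
            return "rendered"
        except NotImplementedError:
            # Fallback path: hand the config to the legacy code.
            return self._apply_network_from_network_config(netconfig)


class NewStyleDistro(OldStyleDistro):
    """Toy model of the behaviour after the commit (hypothetical class)."""

    def _write_network_state(self, netconfig):
        # Renderer selection fails with a different exception type.
        raise RendererNotFoundError("No available network renderers found")

    def apply_network_config(self, netconfig):
        try:
            self._write_network_state(netconfig)
            return "rendered"
        except NotImplementedError:
            # Never reached: RendererNotFoundError is not caught here.
            return self._apply_network_from_network_config(netconfig)
```

In this model OldStyleDistro().apply_network_config("dhcp") reaches the legacy code, while the same call on NewStyleDistro propagates RendererNotFoundError because only NotImplementedError is caught.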
Unfortunately this commit can't just be reverted in upstream. I'll share a proposed patch and if someone can try it that would be really helpful.
Launchpad user Brett Holman(holmanb) wrote on 2022-07-19T15:47:07.478603+00:00
This is ugly, but if someone tests it (which will confirm/deny my analysis above) then we can try to push this fix (or a less ugly equivalent) into upstream.
The thrown NotImplementedError will be caught in apply_network_config(), which should put Gentoo back on the fallback path it was on before.
Launchpad user Brett Holman(holmanb) wrote on 2022-07-19T16:02:42.512230+00:00
diff --git a/cloudinit/distros/__init__.py b/cloudinit/distros/__init__.py
index e27a3f93..1e69709d 100644
--- a/cloudinit/distros/__init__.py
+++ b/cloudinit/distros/__init__.py
@@ -81,6 +81,7 @@ class Distro(persistence.CloudInitPickleMixin, metaclass=abc.ABCMeta):
     renderer_configs: Mapping[str, Mapping[str, Any]] = {}
     _preferred_ntp_clients = None
     networking_cls: Type[Networking] = LinuxNetworking
+    uses_network_renderer: bool = True
     shutdown_options_map = {"halt": "-H", "poweroff": "-P", "reboot": "-r"}
@@ -123,7 +124,13 @@ class Distro(persistence.CloudInitPickleMixin, metaclass=abc.ABCMeta):
             self._cfg, ("network", "renderers"), None
         )
-        name, render_cls = renderers.select(priority=priority)
+        try:
+            name, render_cls = renderers.select(priority=priority)
+        except Exception as e:
+            if not self.uses_network_renderer:
+                raise NotImplementedError
+            else:
+                raise e
         LOG.debug(
             "Selected renderer '%s' from priority list: %s", name, priority
         )
diff --git a/cloudinit/distros/gentoo.py b/cloudinit/distros/gentoo.py
index 37217fe4..ffcb9525 100644
--- a/cloudinit/distros/gentoo.py
+++ b/cloudinit/distros/gentoo.py
@@ -23,6 +23,7 @@ class Distro(distros.Distro):
     hostname_conf_fn = "/etc/conf.d/hostname"
     init_cmd = ["rc-service"]  # init scripts
     default_locale = "en_US.UTF-8"
+    uses_network_renderer: bool = False
Launchpad user Rob Tongue(robtongue) wrote on 2022-07-20T02:25:09.660702+00:00
This got it further. It created the configuration in /etc/conf.d/net.eth0, but it was non-working. It didn't like the mac_eth0="None" that got thrown in there. I do know the configuration that is being fed to cloud-init has the proper mac, so it has to be an error in the code.
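One plausible way a literal "None" can end up in the file is string-formatting a missing MAC value without checking it first. The sketch below only illustrates that failure mode and a guard against it; naive_mac_line and safe_mac_line are hypothetical helpers, not functions from cloud-init, and whether the Gentoo code actually fails this way is an assumption.

```python
def naive_mac_line(dev, mac):
    """Format a mac_<dev> line without checking the value (hypothetical)."""
    return 'mac_{}="{}"'.format(dev, mac)


def safe_mac_line(dev, mac):
    """Return a mac_<dev> line, or None when no MAC address is known."""
    if not mac:
        # Skip the line entirely instead of writing the string "None".
        return None
    return 'mac_{}="{}"'.format(dev, mac)
```

With mac=None the naive helper yields mac_eth0="None", matching the symptom reported here, while the guarded helper emits nothing for that device.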
I am confirming this issue.
I am working on a tool to build cloud-init images for Gentoo and I can build both MBR and EFI ones. MBR works perfectly fine, EFI has broken cloud-init networking.
The first error I get is something like: no available network renderers found, unable to render networking (from stages.py).
Then I get errors like: Calling 'None' failed, request error, HTTPConnectionPool, Max retries exceeded with url /2009-04-04/meta-data/instance-id. There are multiple such entries, and it takes a while before reaching the login prompt.
Please let me know if there's any information I can provide for you. If you want to reproduce my methodology, check out my repo https://github.com/NucleaPeon/gentooimgr and follow the EFI portions of the readme.
I will try applying your patch and report back.
I applied the patch but as expected, it still encounters the issues I mentioned above.
I did notice that it didn't detect my dhcpcd install when I looked at the logs:
2024-02-02 19:32:16,709 - dhcp.py[WARNING]: DHCP client not found: dhcpcd
localhost ~ # eix dhcpcd
* acct-group/dhcpcd
Available versions: 0-r2
Description: System group: dhcpcd
* acct-user/dhcpcd
Available versions: 0-r2
Description: user for dhcpcd client
[I] net-misc/dhcpcd
Available versions: 9.5.1 10.0.3 10.0.5-r1 ~10.0.6 ~10.0.6-r1 **9999*l {debug +embedded ipv6 privsep +udev}
Installed versions: 10.0.5-r1(07:41:04 AM 02/02/2024)(embedded ipv6 udev -debug -privsep)
Homepage: https://github.com/NetworkConfiguration/dhcpcd/ https://roy.marples.name/projects/dhcpcd/
Description: A fully featured, yet light weight RFC2131 compliant DHCP client
cloud-init-output.log cloud-init.log
I installed dhclient and removed dhcpcd, but it still fails to make a connection or recognize it's installed, so I switched back to dhcpcd. If I add dhcpcd to the boot runlevel, it puts more timeout messages into the logs and takes longer to reach login than when it's at the default runlevel.
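A quick way to narrow down a "DHCP client not found" warning is to check whether the binaries resolve on the search path at all. The sketch below uses Python's shutil.which for that. The function name find_dhcp_clients and the candidate list are assumptions made for illustration; the thread does not show how cloud-init itself performs this lookup, and the PATH seen by cloud-init during early boot may differ from the one in an interactive shell.

```python
import shutil


def find_dhcp_clients(candidates=("dhcpcd", "dhclient", "udhcpc"), path=None):
    """Map each candidate client name to its resolved path, or None.

    path is an optional PATH-style string; when None, shutil.which falls
    back to the PATH of the current process.
    """
    return {name: shutil.which(name, path=path) for name in candidates}


if __name__ == "__main__":
    for name, location in find_dhcp_clients().items():
        print(name, "->", location or "not found on PATH")
```

Running it once from a login shell and once with a restricted path (for example path="/usr/bin:/bin") may show whether the client lives in a directory that is missing from the search path.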
I installed networkmanager (-modemmanager -wext -wifi -bluetooth) and added it to runlevel default. I also removed dhcpcd and dhcpd services. It gives me an exception in the log file:
2024-02-02 22:31:19,714 - log.py[DEPRECATED]: DataSourceDigitalOcean is deprecated in 23.2 and scheduled to be removed in 28.2. Deprecated in favour of DataSourceConfigDrive.
2024-02-02 22:31:21,051 - DataSourceGCE.py[WARNING]: Did not find a fallback interface on gce.
2024-02-02 22:31:21,102 - DataSourceVMware.py[ERROR]: failed to find a valid data access method
2024-02-02 22:31:21,167 - networking.py[WARNING]: Not all expected physical devices present: {'00:00:00:00'}
2024-02-02 22:31:21,168 - util.py[WARNING]: failed stage init-local
failed run of stage init-local
------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python3.11/site-packages/cloudinit/cmd/main.py", line 394, in main_init
init.fetch(existing=existing)
File "/usr/lib/python3.11/site-packages/cloudinit/stages.py", line 493, in fetch
return self._get_data_source(existing=existing)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/cloudinit/stages.py", line 360, in _get_data_source
(ds, dsname) = sources.find_source(
^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/cloudinit/sources/__init__.py", line 1028, in find_source
raise DataSourceNotFoundException(msg)
cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes: (DataSourceNoCloud, DataSourceConfigDrive, DataSourceLXD, DataSourceOpenNebula, DataSourceDigitalOcean, DataSourceAzure, DataSourceOVF, DataSourceMAAS, DataSourceGCELocal, DataSourceOpenStackLocal, DataSourceAliYunLocal, DataSourceVultr, DataSourceEc2Local, DataSourceCloudSigma, DataSourceSmartOS, DataSourceScaleway, DataSourceHetzner, DataSourceIBMCloud, DataSourceOracle, DataSourceRbxCloud, DataSourceUpCloudLocal, DataSourceVMware, DataSourceNWCS, DataSourceAkamaiLocal)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.11/site-packages/cloudinit/cmd/main.py", line 781, in status_wrapper
ret = functor(name, args)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/site-packages/cloudinit/cmd/main.py", line 415, in main_init
init.apply_network_config(bring_up=bring_up_interfaces)
File "/usr/lib/python3.11/site-packages/cloudinit/stages.py", line 1032, in apply_network_config
self.distro.networking.wait_for_physdevs(netcfg)
File "/usr/lib/python3.11/site-packages/cloudinit/distros/networking.py", line 169, in wait_for_physdevs
raise RuntimeError(msg)
RuntimeError: Not all expected physical devices present: {'00:00:00:00'}
------------------------------------------------------------
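The RuntimeError above says that a MAC address listed in the network config was not found among the devices on the system. Note that the expected value '00:00:00:00' has only four octets, so it cannot match a six-octet Ethernet address. The following sketch shows one way to compare the two sets by hand. The helper names present_macs and missing_macs are hypothetical and this is not the code behind wait_for_physdevs; it assumes a Linux system that exposes /sys/class/net/<dev>/address.

```python
import os


def present_macs(sys_class_net="/sys/class/net"):
    """Collect the MAC addresses of all network devices listed in sysfs."""
    macs = set()
    if not os.path.isdir(sys_class_net):
        return macs
    for dev in os.listdir(sys_class_net):
        addr_file = os.path.join(sys_class_net, dev, "address")
        try:
            with open(addr_file) as fh:
                macs.add(fh.read().strip().lower())
        except OSError:
            # Device without a readable address file: skip it.
            continue
    return macs


def missing_macs(expected, present):
    """Return the expected MACs that no present device carries."""
    return {m.lower() for m in expected} - {m.lower() for m in present}


if __name__ == "__main__":
    print(missing_macs({"00:00:00:00"}, present_macs()))
```

If the set printed here is non-empty, the MAC address in the config handed to cloud-init does not correspond to any interface the kernel currently exposes.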
The following error may be a red herring, as it also occurs in my logs on a working MBR Gentoo cloud-init image:
stages.py[ERROR]: Unable to render networking. Network config is likely broken: No available network renderers found. Searched through list: ['eni', 'sysconfig', 'netplan', 'network-manager', 'freebsd', 'netbsd', 'openbsd', 'networkd']
See attached log. cloud-init-gentoo-mbr-output.log
This bug was originally filed in Launchpad as LP: #1981912
Launchpad details
Launchpad user Rob Tongue(robtongue) wrote on 2022-07-17T02:08:52.262346+00:00
It seems that cloud-init has evolved past the previous work on getting Gentoo functioning on first boot. It is missing the proper renderer to configure the network on the booted machine, which in this case would be "openrc".
I am not good enough with Python to help, but the original Gentoo script in cloudinit/distros/gentoo.py looks like it has most of what is needed to get the network configured, and it needs to be adapted into a network renderer.
Putting this bug here to hopefully get some traction on this.
stages.py[ERROR]: Unable to render networking. Network config is likely broken: No available network renderers found. Searched through list: ['eni', 'sysconfig', 'netplan', 'network-manager', 'freebsd', 'netbsd', 'openbsd', 'networkd']