canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

BUG - when installing any packages that depend on packagekit during bootcmd, process hangs on bootcmd, even if DEBIAN_FRONTEND is set to non-interactive #5608

Open areis422 opened 1 month ago

areis422 commented 1 month ago

Bug report

When any package that depends on packagekit is installed during bootcmd, the boot process hangs in bootcmd, even if DEBIAN_FRONTEND is set to noninteractive.

Steps to reproduce the problem

#cloud-config
bootcmd:
  - [ cloud-init-per, instance, removeResolveConf, rm, /etc/resolv.conf ]
  - [ cloud-init-per, instance, relinkResolveConf, ln, -s, /run/systemd/resolve/resolv.conf, /etc/resolv.conf ]
  - [ cloud-init-per, instance, aptUpdate, apt-get, update, -qyy ]
  - [ cloud-init-per, instance, aptPurgeNeedRestart, apt-get, autoremove, -qyy, --purge, needrestart ]
  # software-properties-common (last package in the list) is the package the reporter highlighted
  - [ cloud-init-per, instance, aptInstallAptUtils, bash, -c, "export DEBIAN_FRONTEND=noninteractive; apt-get install --no-install-recommends -qyy apt-utils dialog git ca-certificates cargo landscape-common libssl-dev pkg-config software-properties-common;" ]

Adding "echo \"FRONTEND=${DEBIAN_FRONTEND}\"" ends up printing noninteractive, so the environment variable is being exported correctly.

Example Hang

...
...
[   55.371121] cloud-init[482]: Setting up cargo (1.75.0+dfsg0ubuntu1-0ubuntu7) ...
[   55.407109] cloud-init[482]: Processing triggers for libc-bin (2.39-0ubuntu8.2) ...
[   55.445129] cloud-init[482]: Processing triggers for dbus (1.14.10-4ubuntu4) ...
[   55.479121] cloud-init[482]: Processing triggers for sgml-base (1.31) ...
[   55.511148] cloud-init[482]: Setting up polkitd (124-2ubuntu1) ...
[   55.543159] cloud-init[482]: Creating group 'polkitd' with GID 991.
[   55.573162] cloud-init[482]: Creating user 'polkitd' (User for polkitd) with UID 991 and GID 991.
[   55.614173] cloud-init[482]: invoke-rc.d: could not determine current runlevel
[   55.752612] cloud-init[482]: Setting up packagekit (1.2.8-2build3) ...
[   55.784170] cloud-init[482]: invoke-rc.d: could not determine current runlevel
### NOTHING BEYOND THIS POINT ###

Environment details

cloud-init logs

Unable to capture logs since boot never finishes, but the hang can be reproduced every time. The reason for including it in bootcmd is that otherwise the apt sources via PPAs don't get ingested correctly. I also use the Ubuntu minimal images from AWS. It DOES NOT hang when software-properties-common is defined in the package list, but at that point, it's too late.

Images: 24.04 minimal, 22.04 minimal

blackboxsw commented 1 month ago

Thank you @areis422 for filing this bug and making cloud-init better.

I was able to reproduce similar behavior using LXD with ubuntu-minimal images and the following script. It looks like it reproduces with either software-properties-common or just polkitd in the install list, as you mentioned.

cat > issue-5608.yaml <<EOF
#cloud-config
bootcmd:
  - [ cloud-init-per, instance, removeResolveConf, rm, /etc/resolv.conf ]
  - [ cloud-init-per, instance, relinkResolveConf, ln, -s, /run/systemd/resolve/resolv.conf, /etc/resolv.conf ]
  - [ cloud-init-per, instance, aptUpdate, apt-get, update, -qyy ]
  - [ cloud-init-per, instance, aptPurgeNeedRestart, apt-get, autoremove, -qyy, --purge, needrestart ]
  # polkitd (last package in the list) replaces software-properties-common here
  - [ cloud-init-per, instance, aptInstallAptUtils, bash, -c, "export DEBIAN_FRONTEND=noninteractive; apt-get install --no-install-recommends -qyy apt-utils dialog git ca-certificates cargo landscape-common libssl-dev pkg-config polkitd;" ]
EOF
# Launch a minimal noble container with the user-data written above
lxc launch ubuntu-minimal:noble issue-5608 -c cloud-init.user-data="$(cat issue-5608.yaml)"
# Wait for cloud-init to finish, then inspect boot timing and the bootcmd log entries
lxc exec issue-5608 -- cloud-init status --wait --format=yaml
lxc exec issue-5608 -- cloud-init analyze show
lxc exec issue-5608 -- grep bootcmd /var/log/cloud-init.log

> Unable to capture logs since boot never finishes, but the hang can be reproduced every time. The reason for including it in bootcmd is that otherwise the apt sources via PPAs don't get ingested correctly.

Can you expand on this failure symptom so I can better understand the use case here? What apt sources config is being provided in user-data that breaks for you? I also ask because we'd generally guide people to use the top-level packages: key for installing or removing packages on the system instead of running apt operations directly in bootcmd or runcmd. I also confirmed that invoking these commands directly in runcmd: succeeds without blocking.

I'm guessing this issue is directly related to polkitd.postinst calling invoke-rc.d dbus reload || true, which could be blocking because dbus is not ready that early in boot. dbus.service doesn't start until After=sysinit.target, which is itself After=cloud-init.service, the systemd unit that runs the bootcmd module you are trying to use. So basically you have a systemd service deadlock in this case: bootcmd waits on dbus, and dbus cannot start until the unit running bootcmd has finished.

I also see the following blocked process in the process table, confirming the suspicion that dbus isn't ready yet and your bootcmd is blocking on that:

/bin/sh /usr/sbin/invoke-rc.d dbus force-reload

So, let's figure out why the typical packages: directive doesn't meet your needs when apt sources have to be configured via user-data, and whether there is an alternative to bootcmd that will.
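
As a point of reference, a minimal sketch of the packages:-based approach might look like the following. It assumes a hypothetical PPA (ppa:example-owner/example-ppa) and a hypothetical source name (example-ppa.list), neither of which comes from this report; the package names are the ones from the bootcmd in the reproducer. Whether this covers the sources actually in use here is still an open question.

```yaml
#cloud-config
# Sketch only: the source name and PPA are placeholders, not taken from this issue.
apt:
  sources:
    example-ppa.list:                          # hypothetical source name
      source: "ppa:example-owner/example-ppa"  # hypothetical PPA
# Refresh the package index before the packages below are installed
package_update: true
# Installed by cloud-init's package module rather than from bootcmd
packages:
  - apt-utils
  - dialog
  - git
  - ca-certificates
  - cargo
  - landscape-common
  - libssl-dev
  - pkg-config
```

Per the report, installing software-properties-common through the package list does not hang, so the idea is to let the apt: and packages: keys handle both the source setup and the installs instead of running apt-get in bootcmd.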

blackboxsw commented 1 month ago

Marking incomplete, just to represent that we're looking for a bit more context/feedback here. This doesn't feel like an actionable cloud-init bug (yet): we generally shouldn't be performing package operations in bootcmd if they can be avoided, and the breaking issue is a package postinst interaction with dbus that cloud-init doesn't control. Any additional context on how user-data is broken relative to apt sources and the desired installed packages would help clarify whether there are alternatives that can meet your needs.

areis422 commented 3 weeks ago

This can probably be closed. It seems like if apt->sources is defined, it auto-installs software-properties-common and PackageKit without intervention. That being said, what about defining custom sources or PPAs and apt not updating before running package installs?