amazonlinux / amazon-linux-2023

Amazon Linux 2023
https://aws.amazon.com/linux/amazon-linux-2023/

[Bug] - updating aws-cfn-bootstrap #728

Open daniejstriata opened 2 weeks ago

daniejstriata commented 2 weeks ago

Describe the bug: When dnf updates aws-cfn-bootstrap, I see the following error:

  Running scriptlet: aws-cfn-bootstrap-2.0-29.amzn2023.noarch    18/22
  Cleanup          : aws-cfn-bootstrap-2.0-29.amzn2023.noarch    18/22
  Running scriptlet: aws-cfn-bootstrap-2.0-29.amzn2023.noarch    18/22
Failed to set unit properties on aws-cfn-bootstrap.service: Unit aws-cfn-bootstrap.service not found.

To reproduce:

  1. update to 2023.4.20240611
  2. observe aws-cfn-bootstrap update

elsaco commented 2 weeks ago

The aws-cfn-bootstrap.service unit is created by cfn-init, so if no CloudFormation provisioning was done, the unit file is missing. The RPM provides scriptlets for package upgrade and uninstall only. Run rpm -q --scripts aws-cfn-bootstrap for details.

On package upgrade it runs:

if [ $1 -ge 1 ] && [ -x "/usr/lib/systemd/systemd-update-helper" ]; then
    # Package upgrade, not uninstall
    /usr/lib/systemd/systemd-update-helper mark-restart-system-units aws-cfn-bootstrap.service || :
fi

hence the "Failed to set unit properties" warning.
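
The $1 check in the scriptlet above can be sketched in isolation. In RPM uninstall-time scriptlets, $1 is the number of instances of the package that will remain after the operation, so 0 means uninstall and 1 or more means upgrade. The function names postun_action and should_mark_restart below are hypothetical. The unit-file guard only illustrates how the warning could be avoided; it is not something the package does, nor a confirmed fix.

```shell
# Hypothetical helper: classify an uninstall-time scriptlet call by its
# first argument (the number of package instances left afterwards).
postun_action() {
    remaining="$1"
    if [ "$remaining" -ge 1 ]; then
        echo "upgrade"
    else
        echo "uninstall"
    fi
}

# Hypothetical guard (an assumption, not the package's behavior):
# succeed only on upgrade AND when the given unit file exists.
should_mark_restart() {
    remaining="$1"
    unit_file="$2"
    [ "$remaining" -ge 1 ] && [ -e "$unit_file" ]
}
```

With a guard of this kind, the mark-restart-system-units call would be skipped on hosts where the unit file was never installed.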

margussipria commented 1 week ago

I ran into the same problem and didn't even notice it right away.

On the next metadata update, cfn got stuck in a retry loop and never ran the next steps.

Would have been bad to notice this in production.

stewartsmith commented 4 days ago

I've reached out to the relevant internal team about the issue.

LordAlfredo commented 4 days ago

It sounds like there's two things being reported.

margussipria commented 3 days ago

We have had this code for basically 8 years, since Amazon Linux 1: cloudformation.template.txt

An example of it is still up in the example document: https://s3.amazonaws.com/cloudformation-templates-us-east-1/LAMP_Single_Instance.template (JSON)

It has worked for 8+ years, though the YumUpdate parameter is now passed to yum as --releasever. But this version broke everything.

  1. The loop happened because it still tried to run this part while cfn-hup was missing:
          services:
            sysvinit:
              cfn-hup:
                enabled: true
                ensureRunning: true
                files:
                  - /etc/cfn/cfn-hup.conf
                  - /etc/cfn/hooks.d/cfn-auto-reloader.conf
                  - /etc/cfn/hooks.d/update.conf

    After removing those lines, updates worked until the first boot.

  2. After boot, updating instance metadata has no effect.

margussipria commented 3 days ago

basically

$ systemctl enable cfn-hup
Failed to enable unit: Unit file cfn-hup.service does not exist.

$ yum downgrade aws-cfn-bootstrap-2.0-29.amzn2023.noarch
# download log
$ systemctl enable cfn-hup
$ echo $?
0

$ yum update -y
# download log
$ systemctl enable cfn-hup
Failed to enable unit: Unit file cfn-hup.service does not exist.
$ echo $?
1
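
The transcript above shows systemctl enable failing because no cfn-hup.service unit file is found. As a minimal sketch, a provisioning script could look for the unit file before trying to enable it. The helper name find_unit_file is hypothetical. The default directory list is an assumption that covers the common systemd system unit directories, not the full search path.

```shell
# Hypothetical helper: print the first matching unit file path and
# return 0, or return 1 if the unit file is not found.
# Usage: find_unit_file UNIT [DIR...]
find_unit_file() {
    unit="$1"
    shift
    # Assumed defaults: common system unit directories.
    if [ "$#" -eq 0 ]; then
        set -- /etc/systemd/system /run/systemd/system /usr/lib/systemd/system
    fi
    for dir in "$@"; do
        if [ -e "$dir/$unit" ]; then
            echo "$dir/$unit"
            return 0
        fi
    done
    return 1
}

# Example use in a script:
# if find_unit_file cfn-hup.service >/dev/null; then
#     systemctl enable cfn-hup
# else
#     echo "cfn-hup.service is not installed" >&2
# fi
```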

elsaco commented 3 days ago

The cfn-hup unit is present in the aws-cfn-bootstrap-2.0-30.amzn2023.noarch package:

[ec2-user@i-0c44ccb32089a355e ~]$ rpm -qf /opt/aws/apitools/cfn-init/init/systemd/cfn-hup.service
aws-cfn-bootstrap-2.0-30.amzn2023.noarch
[ec2-user@i-0c44ccb32089a355e ~]$ systemctl status --no-pager cfn-hup
○ cfn-hup.service - cfn-hup daemon
     Loaded: loaded (/usr/lib/systemd/system/cfn-hup.service; disabled; preset: disabled)
     Active: inactive (dead)

Jun 27 15:46:38 i-0c44ccb32089a355e.ec2.internal systemd[1]: /usr/lib/systemd/system/cfn-hup.service:6: PIDFile= references a path below legacy directory /var/run/, updating /var/run/cfn-hup.pid → /run/cfn-hup.pid; please update the unit file accordingly.
Jun 27 15:55:15 i-0c44ccb32089a355e.ec2.internal systemd[1]: /usr/lib/systemd/system/cfn-hup.service:6: PIDFile= references a path below legacy directory /var/run/, updating /var/run/cfn-hup.pid → /run/cfn-hup.pid; please update the unit file accordingly.
Jun 27 15:55:23 i-0c44ccb32089a355e.ec2.internal systemd[1]: /usr/lib/systemd/system/cfn-hup.service:6: PIDFile= references a path below legacy directory /var/run/, updating /var/run/cfn-hup.pid → /run/cfn-hup.pid; please update the unit file accordingly.

It is also present in the aws-cfn-bootstrap-2.0-29 package:

[ec2-user@i-0c44ccb32089a355e ~]$ rpm -qpl aws-cfn-bootstrap-2.0-29.amzn2023.noarch.rpm  | grep cfn-hup
/opt/aws/apitools/cfn-init-2.0-29/bin/cfn-hup
/opt/aws/apitools/cfn-init-2.0-29/init/redhat/cfn-hup
/opt/aws/apitools/cfn-init-2.0-29/init/systemd/cfn-hup.service
/opt/aws/apitools/cfn-init-2.0-29/init/ubuntu/cfn-hup
/opt/aws/bin/cfn-hup
/usr/bin/cfn-hup
/usr/lib/systemd/system/cfn-hup.service

LordAlfredo commented 3 days ago

Correct - I'm conferring with the package maintainer about whether cfn-hup.service was their intended systemd target in the package, since the pre/post scriptlets currently point to aws-cfn-bootstrap.service. The latter isn't defined anywhere, hence the warning message.

Margus's error appears to be a result of the version upgrade from 2.0-29 to 2.0-30 removing cfn-hup entirely. We will investigate further.
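
One way to check whether an upgrade dropped files is to compare the file lists of the two package versions. The sketch below assumes each list has been saved to a text file, one path per line. The helper name files_only_in_old and the file names old.txt and new.txt are hypothetical, and the second RPM file name in the comments is inferred from the package name, not taken from this thread.

```shell
# Hypothetical helper: print paths that appear in the first list
# but not in the second (one path per line in each file).
files_only_in_old() {
    old_list="$1"
    new_list="$2"
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    # comm requires sorted input.
    sort "$old_list" > "$old_sorted"
    sort "$new_list" > "$new_sorted"
    # -23 suppresses lines unique to the second file and common lines.
    comm -23 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Example use (file names are assumptions):
# rpm -qpl aws-cfn-bootstrap-2.0-29.amzn2023.noarch.rpm > old.txt
# rpm -qpl aws-cfn-bootstrap-2.0-30.amzn2023.noarch.rpm > new.txt
# files_only_in_old old.txt new.txt | grep cfn-hup
```

Paths under a versioned directory such as /opt/aws/apitools/cfn-init-2.0-29/ will always show up as differences, so the unversioned paths are the ones worth reading.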

margussipria commented 3 days ago

$ uname -a
Linux host-test1.localdomain 6.1.92-99.174.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Jun  4 15:43:46 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

The original AMI used for this machine was ami-04fe22dfadec6f0b6 (eu-west-1). Every AMI for Amazon Linux 2023 has worked, including with upgrades, except for these upgrades:

$ history | grep release
    4  dnf upgrade --releasever=2023.4.20240611
   36  dnf upgrade --releasever=2023.5.20240624

[root@(host) ~]$ rpm -qf /opt/aws/apitools/cfn-init/init/systemd/cfn-hup.service
aws-cfn-bootstrap-2.0-30.amzn2023.noarch
[root@(host) ~]$ systemctl status --no-pager cfn-hup
Unit cfn-hup.service could not be found.

and with downgrade:

[root@(host) ~]$ rpm -qf /opt/aws/apitools/cfn-init/init/systemd/cfn-hup.service
aws-cfn-bootstrap-2.0-29.amzn2023.noarch
[root@(host) ~]$ systemctl status --no-pager cfn-hup
○ cfn-hup.service - cfn-hup daemon
     Loaded: loaded (/etc/systemd/system/cfn-hup.service; enabled; preset: disabled)
     Active: inactive (dead)

Jun 27 09:24:24 host-test1.localdomain systemd[1]: cfn-hup.service: Failed to open /etc/systemd/system/cfn-hup.service: No such file or directory
Jun 27 10:19:24 host-test1.localdomain systemd[1]: cfn-hup.service: Failed to open /etc/systemd/system/cfn-hup.service: No such file or directory
Jun 27 10:19:25 host-test1.localdomain systemd[1]: cfn-hup.service: Failed to open /etc/systemd/system/cfn-hup.service: No such file or directory
Jun 27 19:41:30 host-test1.localdomain systemd[1]: /etc/systemd/system/cfn-hup.service:6: PIDFile= references a path below legacy directory /var/run/, updating /var/run/cfn-hup.pid → /run/cfn-hup.pid; please update the unit file accordingly.
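
The journal lines above report "No such file or directory" for a path that systemd nevertheless lists as the unit's location. One possible reading, not confirmed anywhere in this thread, is that /etc/systemd/system/cfn-hup.service was at times a symlink whose target had been removed. A minimal sketch for checking that state follows; the helper name unit_path_state is hypothetical.

```shell
# Hypothetical helper: report whether a path is a regular entry,
# a symlink whose target is gone, or absent altogether.
unit_path_state() {
    path="$1"
    if [ -L "$path" ] && [ ! -e "$path" ]; then
        # -L is true for the link itself; -e follows it and fails.
        echo "dangling-symlink"
    elif [ -e "$path" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Example use:
# unit_path_state /etc/systemd/system/cfn-hup.service
```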