coreos / bugs

Issue tracker for CoreOS Container Linux
https://coreos.com/os/eol/

Bug in Update Strategy Docs #2302

Closed hapnermw closed 6 years ago

hapnermw commented 6 years ago

Issue Report

On page https://coreos.com/os/docs/latest/update-strategies.html

The ignition example source property for the /etc/coreos/update.conf file is wrong.

There are two issues:

  1. This example deletes the GROUP attribute of the file
  2. It puts the REBOOT_STRATEGY value in double quotes - it should not be quoted

A valid example of source would be:

"data:,GROUP%3Dstable%0D%0AREBOOT_STRATEGY%3Dreboot"
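For readers unfamiliar with the `data:,` form, here is a minimal Python sketch that decodes the proposed source string to show the file contents it would produce. The helper name `decode_data_url` is made up for illustration, and the sketch only handles the plain percent-encoded form used here, not base64 data URLs.

```python
from urllib.parse import unquote


def decode_data_url(source: str) -> str:
    """Return the payload of a plain (non-base64) 'data:,' URL.

    Hypothetical helper for illustration; not part of Ignition.
    """
    prefix = "data:,"
    if not source.startswith(prefix):
        raise ValueError("expected a 'data:,' URL")
    # Percent-decoding turns %3D into '=' and %0D%0A into CR LF.
    return unquote(source[len(prefix):])


proposed = "data:,GROUP%3Dstable%0D%0AREBOOT_STRATEGY%3Dreboot"
decoded = decode_data_url(proposed)
print(repr(decoded))  # 'GROUP=stable\r\nREBOOT_STRATEGY=reboot'
```

Note that `%0D%0A` decodes to a carriage return followed by a line feed, so the resulting file uses CRLF line endings.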

bgilbert commented 6 years ago
  1. If you want to specify a custom update group, you can use the group key in the locksmith section of your Container Linux Config. Otherwise, the default group from /usr/share/coreos/update.conf will be used. Starting with Container Linux 1367.5.0, the update group in /usr/share/coreos/update.conf always corresponds to the correct channel for a given release (alpha, beta, or stable), so there's usually no need to specify an update group in /etc/coreos/update.conf.
  2. REBOOT_STRATEGY is only used by locksmithd. locksmithd's configuration is read via a systemd EnvironmentFile directive, and EnvironmentFile allows values to be quoted.
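To illustrate point 2, the sketch below shows why a quoted and an unquoted value end up the same once the file is read as KEY=VALUE pairs. This is a deliberately simplified stand-in written for this thread, not systemd's actual parser: the function name `parse_env_file` is hypothetical, and it ignores escapes, line continuations, and other details of the real format.

```python
def parse_env_file(text: str) -> dict:
    """Simplified KEY=VALUE parser (illustration only, NOT systemd's code)."""
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines starting with '#' or ';'.
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        result[key.strip()] = value
    return result


quoted = parse_env_file('REBOOT_STRATEGY="reboot"\n')
unquoted = parse_env_file("REBOOT_STRATEGY=reboot\n")
print(quoted == unquoted)  # True
```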

Are you experiencing incorrect behavior due to the generated update.conf?

hapnermw commented 6 years ago

Hi Benjamin,

Yes, I am experiencing what looks to be incorrect behavior.

This EC2 instance is saying its reboot strategy is off instead of reboot as specified in /etc/coreos/update.conf. Also “Linux” is misspelled.

(1520.8.0)inux by CoreOS stable
Update Strategy: No Reboots
core@ip-10-0-6-204 ~ $ cat /etc/coreos/update.conf
GROUP=stable
REBOOT_STRATEGY=reboot
core@ip-10-0-6-204 ~ $

Background

An instance was originally booted with the following Ignition descriptor in User Data:

{"ignition":{"version":"2.1.0"},"storage":{"files":[{"filesystem":"root","path":"/etc/coreos/update.conf","contents":{"source":"data:,GROUP%3Dstable%0D%0AREBOOT_STRATEGY%3Dreboot%0D%0A"},"mode":420,"user":{},"group":{}}]}}
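As a rough sketch of how a descriptor like this can be generated rather than hand-encoded, the Python below builds the same `source` value from plain key/value settings. The helper `update_conf_source` and its `newline` parameter are hypothetical names for this example. It leaves out the empty `user` and `group` objects from the descriptor above, and it passes CRLF only to reproduce the string shown there.

```python
import json
from urllib.parse import quote


def update_conf_source(settings: dict, newline: str = "\n") -> str:
    """Build a percent-encoded 'data:,' URL for update.conf (hypothetical helper)."""
    body = "".join(f"{key}={value}{newline}" for key, value in settings.items())
    # safe="" makes quote() encode every reserved character, including '='.
    return "data:," + quote(body, safe="")


source = update_conf_source(
    {"GROUP": "stable", "REBOOT_STRATEGY": "reboot"},
    newline="\r\n",  # CRLF, matching the descriptor in this report
)

config = {
    "ignition": {"version": "2.1.0"},
    "storage": {
        "files": [
            {
                "filesystem": "root",
                "path": "/etc/coreos/update.conf",
                "contents": {"source": source},
                "mode": 0o644,  # serialized as decimal 420 in JSON
            }
        ]
    },
}

print(json.dumps(config, separators=(",", ":")))
```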

The instance was then stopped and an AMI of it was created. This AMI was then used to create an instance with User Data set to feed data to one of its enabled services.

The result is the output above. The instance’s update.conf is set to reboot; however, CoreOS is reporting the off strategy.

bgilbert commented 6 years ago

Update Strategy: No Reboots can also mean that locksmithd is not running. What does systemctl status locksmithd.service say?

hapnermw commented 6 years ago

(1520.8.0)inux by CoreOS stable
Update Strategy: No Reboots
core@ip-10-0-6-25 ~ $ systemctl status locksmithd
● locksmithd.service - Cluster reboot manager
   Loaded: loaded (/usr/lib/systemd/system/locksmithd.service; disabled; vendor preset: disabled)
   Active: active (running) since Mon 2017-12-25 07:16:49 UTC; 24s ago
 Main PID: 748 (locksmithd)
    Tasks: 5 (limit: 32768)
   Memory: 1.5M (limit: 32.0M)
      CPU: 6ms
   CGroup: /system.slice/locksmithd.service
           └─748 /usr/lib/locksmith/locksmithd

Dec 25 07:16:49 ip-10-0-6-25 systemd[1]: locksmithd.service: Service hold-off time over, scheduling restart.
Dec 25 07:16:49 ip-10-0-6-25 systemd[1]: Stopped Cluster reboot manager.
Dec 25 07:16:49 ip-10-0-6-25 systemd[1]: Started Cluster reboot manager.
Dec 25 07:16:49 ip-10-0-6-25 locksmithd[748]: No configured reboot window
Dec 25 07:16:49 ip-10-0-6-25 locksmithd[748]: locksmithd starting currentOperation="UPDATE_STATUS_IDLE" strategy="reboot"
core@ip-10-0-6-25 ~ $

bgilbert commented 6 years ago

From your logs, this is likely due to https://github.com/coreos/bugs/issues/2179, which was present in 1520.8.0. locksmithd would sometimes fail to start the first time, and then when systemd restarted it 10 seconds later, it would start successfully. If you logged in before then, you would see the Update Strategy: No Reboots message.

This should be fixed in Container Linux 1576.0.0, so I'll close. If you're still seeing this problem with current releases of Container Linux, please open a new bug.