canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

cloud-init will not run user-data scripts when /var filesystem is mounted with the noexec flag #3429

Open ubuntu-server-builder opened 1 year ago

ubuntu-server-builder commented 1 year ago

This bug was originally filed in Launchpad as LP: #1839899

Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = None
date_created = 2019-08-13T00:57:41.794994+00:00
date_fix_committed = None
date_fix_released = None
id = 1839899
importance = medium
is_complete = False
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1839899
milestone = None
owner = nemusupport
owner_name = Nému Support
private = False
status = triaged
submitter = nemusupport
submitter_name = Nému Support
tags = ['aws', 'rhel', 'selinux']
duplicates = []

Launchpad user Nému Support(nemusupport) wrote on 2019-08-13T00:57:41.794994+00:00

Cloud Vendor: Amazon AWS
Platform: RHEL7.6
Cloud-Init: cloud-init-18.5-3.el7.x86_64
Kernel: 3.10.0-1062.el7.x86_64
SELinux: selinux-policy-targeted-3.13.1-252.el7.1.noarch

--

We have identified that having the "noexec" flag set on the /var filesystem causes cloud-init to fail running user-data scripts. This is a security requirement mandated by STIG policies that we're purposefully trying to meet for Federal systems.

The affected code is in:

/usr/lib/python2.7/site-packages/cloudinit/util.py

Under the function:

runparts()

The system checks for access to the executable using the following line:

        if os.path.isfile(exe_path) and os.access(exe_path, os.X_OK):

While the file is executable, the "noexec" flag on the filesystem causes os.access() to report False, which cancels the execution of the user-data script.
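To see why a given script is skipped, it can help to compare the file's mode bits with what os.access() reports. The following is a minimal sketch, not cloud-init's code; the helper name describe_exec_check is made up for illustration.

```python
import os
import stat


def describe_exec_check(exe_path):
    """Explain what an isfile() + os.access(X_OK) check would decide.

    Hypothetical helper for illustration; it is not part of cloud-init.
    """
    if not os.path.isfile(exe_path):
        return "not a regular file"
    mode = os.stat(exe_path).st_mode
    has_exec_bit = bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
    if not has_exec_bit:
        return "no execute bit in file mode"
    if os.access(exe_path, os.X_OK):
        return "executable"
    # The mode allows execution but access() refuses: the bits may not apply
    # to the current user, or the filesystem may be mounted noexec.
    return "execute bit set, but os.access() denies X_OK"
```

On a noexec /var, a script with mode 0755 under /var/lib/cloud/instances/*/scripts/ would be expected to fall into the last case.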

To reproduce the problem:

Note that the files in /var/lib/cloud/instances/*/scripts/ are executable (mode 0755 or 0700)

And that when trying to execute the file, you will get Error 13: Permission denied.

--

Possible fixes:

We have tested the second workaround and it seems to help:

cloud-init clean

rm -Rf /var/lib/cloud

mkdir -p /etc/cloud/runtime

ln -s /etc/cloud/runtime /var/lib/cloud

restorecon -rv /var/lib/cloud

After this, user-data scripts appear to execute.

ubuntu-server-builder commented 1 year ago

Launchpad user Nému Support(nemusupport) wrote on 2019-08-13T12:00:30.386094+00:00

Update:

Further testing leads us to believe that this problem may actually occur when having the "noexec" flag set on the /var filesystem. This is a security requirement that we're purposefully trying to meet for Federal systems.

Possible fixes:

We have tested the second item and it seems to work:

cloud-init clean

rm -Rf /var/lib/cloud

mkdir -p /etc/cloud-init-runtime

ln -s /etc/cloud-init-runtime /var/lib/cloud

restorecon -rv /var/lib/cloud

After this, user-data scripts appear to execute.

ubuntu-server-builder commented 1 year ago

Launchpad user Nému Support(nemusupport) wrote on 2019-08-13T12:26:04.839233+00:00

Confirmed that moving the directory to /etc works.

Not sure if there's a clean way to fix this in cloud-init's code. Should the software detect that the /var/lib/cloud directory is on a noexec filesystem and change the storage path for executable scripts to /etc/cloud/runtime-scripts in such cases?
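The detection half of that question could be sketched with os.statvfs(), which exposes the mount flags of the filesystem holding a path. This is only an illustration under assumptions: is_mounted_noexec and pick_scripts_dir are hypothetical names, nothing here exists in cloud-init, and the fallback policy is just one possible choice.

```python
import os

# os.ST_NOEXEC is not defined on every platform; 8 is its value on Linux.
ST_NOEXEC = getattr(os, "ST_NOEXEC", 8)


def is_mounted_noexec(path):
    """Return True if the filesystem holding `path` is mounted noexec."""
    return bool(os.statvfs(path).f_flag & ST_NOEXEC)


def pick_scripts_dir(default_dir, fallback_dir):
    """Hypothetical policy: use fallback_dir only when default_dir is noexec."""
    if os.path.isdir(default_dir) and is_mounted_noexec(default_dir):
        return fallback_dir
    return default_dir
```

For example, pick_scripts_dir("/var/lib/cloud", "/etc/cloud/runtime-scripts") would return the fallback only when /var/lib/cloud exists and sits on a noexec mount.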

ubuntu-server-builder commented 1 year ago

Launchpad user Dan Watkins(oddbloke) wrote on 2019-08-13T14:49:46.279918+00:00

Hi Nému!

Thanks for using cloud-init, and for filing this detailed bug. It's great!

Regarding your first possible fix, my feeling is that we can't assume that the files runparts executes are scripts with shebangs. For example, I just ran ln /bin/ls /var/lib/cloud/scripts/per-boot (in an Ubuntu lxd container) and cloud-init happily runs it, writing the contents of / to /var/log/cloud-init-output.log. I don't think we should break this binary-in-scripts-directory use case.

Given that I think you've discovered that this issue is slightly different to your initial report, could you update the description to reflect your latest understanding of it, and then move this report back to New? That will make it easier for me to take this bug to the rest of the development team for a conversation.

Thanks!

Dan

ubuntu-server-builder commented 1 year ago

Launchpad user Nému Support(nemusupport) wrote on 2019-08-13T15:22:21.843918+00:00

Thanks for the reply, Dan! To confirm, if you remount your /var filesystem as noexec under the lxc container, your binary no longer gets executed?

ubuntu-server-builder commented 1 year ago

Launchpad user Dan Watkins(oddbloke) wrote on 2019-08-14T13:38:14.808926+00:00

Thanks for the title update! I'd appreciate it if we could also update the longer-form text to match the bug as we now understand it, so that people don't have to read through comments to work out where we're at.

To confirm, if you remount your /var filesystem as noexec under the lxc container, your binary no longer gets executed?

/var is part of the root partition in the Ubuntu lxd images, so I don't really have an easy way to test that, unfortunately.

Thanks!

Dan

ubuntu-server-builder commented 1 year ago

Launchpad user Dan Watkins(oddbloke) wrote on 2019-08-20T13:53:12.182772+00:00

Hi Nému, thanks for the update, it looks good! It looks like we understand the problem pretty well, so I've moved it to Triaged. Am I right in thinking that this is something that you're looking to work on?

ubuntu-server-builder commented 1 year ago

Launchpad user C de-Avillez(hggdh2) wrote on 2022-02-22T16:59:31.464016+00:00

This is actually a problem whenever the system is installed with /var on its own filesystem, mounted 'noexec'. Although the default deployment on Ubuntu keeps /var on the root filesystem, this may not be the case on STIG-hardened installs, across distributions.

In our case, we see Azure being hit by this as well. All that is needed is:

Please note that this also affects running /var/tmp/dhclient on startup.

ubuntu-server-builder commented 1 year ago

Launchpad user Chad Smith(chad.smith) wrote on 2022-06-01T20:31:21.221183+00:00

Here is the corresponding dhclient bug for this noexec issue in /var/tmp: https://bugs.launchpad.net/cloud-init/+bug/1962343

ubuntu-server-builder commented 1 year ago

Launchpad user Chad Smith(chad.smith) wrote on 2022-06-01T23:05:09.327326+00:00

Per the suggestion/request in comment #2:

Should the software detect that the /var/lib/cloud directory is on a noexec filesystem and change the storage path for executable scripts to /etc/cloud/runtime-scripts in such cases?

I suggest that cloud-init shouldn't automatically place executables somewhere under /etc, as the Filesystem Hierarchy Standard tells us binaries don't belong there: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s07.html#requirements3

Instead, I think images that need a noexec /var filesystem out of the box should probably update the base cloud configuration in /etc/cloud/cloud.cfg by providing a config snippet in something like /etc/cloud/cloud.cfg.d/95-custom_cloud_dir.cfg:

system_info:
  paths:
    cloud_dir: /some/dir/on/a/filesystem/without/noexec
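For image builds that script this step, the drop-in could be generated along these lines. This is a sketch only: write_cloud_dir_dropin is a hypothetical helper, the file name is the one suggested in this comment, and the cloud_dir value passed in is whatever directory the image author has chosen.

```python
import os


def write_cloud_dir_dropin(cloud_dir, cfg_d="/etc/cloud/cloud.cfg.d",
                           name="95-custom_cloud_dir.cfg"):
    """Write a drop-in that sets system_info.paths.cloud_dir; return its path.

    Hypothetical helper for illustration; not part of cloud-init.
    """
    content = (
        "system_info:\n"
        "  paths:\n"
        "    cloud_dir: {}\n".format(cloud_dir)
    )
    target = os.path.join(cfg_d, name)
    with open(target, "w") as handle:
        handle.write(content)
    return target
```

The cfg_d parameter makes it possible to write into a staging directory during an image build rather than the live /etc.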

Candidates that come to mind are /usr/lib/cloud-init/cloud or /usr/libexec/cloud-init/cloud, depending on your Linux distribution.

We may take this /usr/lib*/ approach for the issues affecting /var/tmp/cloud-init/dhclient runs in LP: #1962343.

ubuntu-server-builder commented 1 year ago

Launchpad user Chad Smith(chad.smith) wrote on 2022-06-01T23:27:12.858180+00:00

Discovered today (LP: #1976564) that cloud_dir doesn't seem to be honored everywhere, so there are a couple of corner cases where setting cloud_dir won't work at the moment, but we can resolve that bug shortly.

ubuntu-server-builder commented 1 year ago

Launchpad user Alberto Contreras(aciba) wrote on 2022-06-23T07:07:33.789599+00:00

Fix committed, resolving LP: #1976564.