coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

File ownership /etc & /var randomly get changed to 33:root after a few minutes of first boot. #1563

Closed: haleksandre closed this issue 1 year ago

haleksandre commented 1 year ago

Describe the bug

I am able to install Fedora CoreOS with the following Butane configuration file:

Butane:

variant: fcos
version: 1.5.0
passwd:
  users:
    - name: core
      groups:
        - sudo
        - docker
      password_hash: # ....hash
      ssh_authorized_keys:
        # ...keys
      home_dir: /home/core
storage:
  disks:
    - device: /dev/disk/by-id/coreos-boot-disk
      wipe_table: false
      partitions:
        - label: root
          number: 4
          # Allocate at least 8 GiB to the rootfs. See NOTE above about this.
          size_mib: 10240
          resize: true
        - label: swap
          # Allocate 16 GiB to swap
          start_mib: 0
          size_mib: 16384
          resize: true
        - label: var
          start_mib: 0
          size_mib: 0
  filesystems:
    # ....filesystems
  directories:
    - path: /etc/nginx/conf.d
      mode: 0755
      user:
        name: core
      group:
        name: core
      overwrite: true
    # ...directories
  files:
    - path: /etc/nginx/conf.d/default.conf
      mode: 0664
      user:
        name: core
      group:
        name: core
      contents: 
        local: # ...local content
    - path: /etc/hostname
      mode: 0644
      contents:
        inline: media
    - path: /etc/selinux/config
      mode: 0644
      overwrite: true
      contents:
        inline: |
          # This file controls the state of SELinux on the system.
          # SELINUX= can take one of these three values:
          #     enforcing - SELinux security policy is enforced.
          #     permissive - SELinux prints warnings instead of enforcing.
          #     disabled - No SELinux policy is loaded.
          # See also:
          # https://docs.fedoraproject.org/en-US/quick-docs/getting-started-with-selinux/#getting-started-with-selinux-selinux-states-and-modes
          #
          # NOTE: In earlier Fedora kernel builds, SELINUX=disabled would also
          # fully disable SELinux during boot. If you need a system with SELinux
          # fully disabled instead of SELinux running with no policy loaded, you
          # need to pass selinux=0 to the kernel command line. You can use grubby
          # to persistently set the bootloader to boot with selinux=0:
          #
          #    grubby --update-kernel ALL --args selinux=0
          #
          # To revert back to SELinux enabled:
          #
          #    grubby --update-kernel ALL --remove-args selinux
          #
          SELINUX=permissive
          # SELINUXTYPE= can take one of these three values:
          #     targeted - Targeted processes are protected,
          #     minimum - Modification of targeted policy. Only selected processes are protected.
          #     mls - Multi Level Security protection.
          SELINUXTYPE=targeted
    # ...files
    - path: /etc/containers/systemd/server.network
      contents:
        inline: |
          [Network]
          Label=lan.media.Network=server.network
    - path: /etc/containers/systemd/nginx.volume
      contents:
        inline: |
          [Volume]
          Label=lan.media.volume=nginx
    - path: /etc/containers/systemd/nginx.container
      contents:
        inline: |
          [Unit]
          Description=Nginx Quadlet
          Requires=podman.socket
          After=podman.socket

          [Container]
          Image=docker.io/nginxproxy/nginx-proxy:alpine
          ContainerName=nginx
          AutoUpdate=registry
          Environment=DOCKER_HOST=unix://${XDG_RUNTIME_DIR}/podman/podman.sock
          Network=server.network
          PublishPort=80:80/tcp
          PublishPort=443:443/tcp
          Volume=${XDG_RUNTIME_DIR}/podman/podman.sock:/tmp/docker.sock:ro
          Volume=letsencrypt.volume:/etc/letsencrypt
          Volume=nginx.volume:/usr/share/nginx/html
          Volume=/etc/nginx/conf.d/:/etc/nginx/conf.d/
          Volume=/etc/nginx/global/:/etc/nginx/global/
          Volume=/etc/nginx/ssl/:/etc/nginx/ssl/

          [Service]
          Restart=always
          TimeoutStartSec=900

          [Install]
          WantedBy=multi-user.target
      # ...other quadlet
systemd:
  units:
     # ...units

On first boot, for a little while at least, everything seems to run correctly: files, directories, and quadlets get generated and installed. I can log in and out, and I can SSH in. But after a few minutes, something happens and the ownership of /etc and /var, including their subdirectories, changes to 33:root. This causes the sudo errors below, which seem to be unfixable because I lose sudo privileges.

/etc/sudo.conf is owned by uid 33, should be 0
sudo: /etc/sudo.conf is owned by uid 33, should be 0
sudo: /etc/sudoers is owned by uid 33, should be 0
sudo: no valid sudoers sources found, quitting
sudo: error initializing audit plugin sudoers_audit

After this I can also no longer SSH into the machine: I hit 'permission denied' because /home/core (/var/home/core) is now owned by 33:root.
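To see how far the ownership change has spread, it can help to list everything under a directory that is owned by the unexpected UID. The following is a minimal sketch, not part of the original report; the helper name find_owned_by_uid is hypothetical, and it relies only on the standard find options -xdev and -uid.

```shell
# List every entry under a directory that is owned by a given numeric UID.
# find_owned_by_uid is a hypothetical helper name used for illustration.
find_owned_by_uid() {
    dir="$1"
    uid="$2"
    # -xdev keeps find on the filesystem that holds "$dir";
    # -uid matches on the numeric owner, so no passwd entry is required.
    find "$dir" -xdev -uid "$uid" -print
}

# Example for the case described here (needs root to read all of /etc and /var):
#   find_owned_by_uid /etc 33 | head
#   find_owned_by_uid /var 33 | head
```

Because sudo stops working once /etc/sudoers changes owner, a check like this may have to be run from a root shell that is already open, or from the live environment with the installed filesystems mounted.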

Reproduction steps

  1. Boot the CoreOS live USB
  2. Install with the command coreos-installer install /dev/sda -I url-to-ignition-config
  3. Boot the machine
  4. Log in as the core user
  5. Wait a few minutes

Expected behavior

root should retain ownership of /etc and /var, except for the directories, subdirectories, and files that the Butane config assigns to another user.

Actual behavior

Ownership of /etc and /var changes, in this case to 33:root, a few minutes after first boot.

System details

Butane or Ignition config

variant: fcos
version: 1.5.0
passwd:
  users:
    - name: core
      groups:
        - sudo
        - docker
      # password_hash: # ....hash
      # ssh_authorized_keys:
        # ...keys
      home_dir: /home/core
storage:
  disks:
    - device: /dev/disk/by-id/coreos-boot-disk
      wipe_table: false
      partitions:
        - label: root
          number: 4
          # Allocate at least 8 GiB to the rootfs. See NOTE above about this.
          size_mib: 10240
          resize: true
        - label: swap
          # Allocate 16 GiB to swap
          start_mib: 0
          size_mib: 16384
          resize: true
        - label: var
          start_mib: 0
          size_mib: 0
  filesystems:
    - device: /dev/disk/by-partlabel/swap
      format: swap
      wipe_filesystem: true
      with_mount_unit: true
    - device: /dev/disk/by-partlabel/var
      path: /var
      format: xfs
      with_mount_unit: true
    # ....filesystems
  directories:
    - path: /etc/nginx/conf.d
      mode: 0755
      user:
        name: core
      group:
        name: core
      overwrite: true
    # ...directories
  files:
    - path: /etc/nginx/conf.d/default.conf
      mode: 0664
      user:
        name: core
      group:
        name: core
      contents: 
        local: # ...local content
    - path: /etc/hostname
      mode: 0644
      contents:
        inline: media
    - path: /etc/selinux/config
      mode: 0644
      overwrite: true
      contents:
        inline: |
          # This file controls the state of SELinux on the system.
          # SELINUX= can take one of these three values:
          #     enforcing - SELinux security policy is enforced.
          #     permissive - SELinux prints warnings instead of enforcing.
          #     disabled - No SELinux policy is loaded.
          # See also:
          # https://docs.fedoraproject.org/en-US/quick-docs/getting-started-with-selinux/#getting-started-with-selinux-selinux-states-and-modes
          #
          # NOTE: In earlier Fedora kernel builds, SELINUX=disabled would also
          # fully disable SELinux during boot. If you need a system with SELinux
          # fully disabled instead of SELinux running with no policy loaded, you
          # need to pass selinux=0 to the kernel command line. You can use grubby
          # to persistently set the bootloader to boot with selinux=0:
          #
          #    grubby --update-kernel ALL --args selinux=0
          #
          # To revert back to SELinux enabled:
          #
          #    grubby --update-kernel ALL --remove-args selinux
          #
          SELINUX=permissive
          # SELINUXTYPE= can take one of these three values:
          #     targeted - Targeted processes are protected,
          #     minimum - Modification of targeted policy. Only selected processes are protected.
          #     mls - Multi Level Security protection.
          SELINUXTYPE=targeted
    # ...files
    - path: /etc/containers/systemd/server.network
      contents:
        inline: |
          [Network]
          Label=lan.media.Network=server.network
    - path: /etc/containers/systemd/nginx.volume
      contents:
        inline: |
          [Volume]
          Label=lan.media.volume=nginx
    - path: /etc/containers/systemd/nginx.container
      contents:
        inline: |
          [Unit]
          Description=Nginx Quadlet
          Requires=podman.socket
          After=podman.socket

          [Container]
          Image=docker.io/nginxproxy/nginx-proxy:alpine
          ContainerName=nginx
          AutoUpdate=registry
          Environment=DOCKER_HOST=unix://${XDG_RUNTIME_DIR}/podman/podman.sock
          Network=server.network
          PublishPort=80:80/tcp
          PublishPort=443:443/tcp
          Volume=${XDG_RUNTIME_DIR}/podman/podman.sock:/tmp/docker.sock:ro
          Volume=letsencrypt.volume:/etc/letsencrypt
          Volume=nginx.volume:/usr/share/nginx/html
          Volume=/etc/nginx/conf.d/:/etc/nginx/conf.d/
          Volume=/etc/nginx/global/:/etc/nginx/global/
          Volume=/etc/nginx/ssl/:/etc/nginx/ssl/

          [Service]
          Restart=always
          TimeoutStartSec=900

          [Install]
          WantedBy=multi-user.target
      # ...other quadlet
# systemd:
#   units:
     # ...units

Additional information

No response

travier commented 1 year ago

From your config, I would recommend that you keep SELinux in enforcing mode and run the container that needs access to the podman socket without SELinux confinement, using --security-opt label=disable.

travier commented 1 year ago

We don't have a UID set to 33 by default in FCOS as far as I can see, so this likely comes from a container messing with your files.
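A quick way to confirm that a numeric UID has no account on the host is to query the passwd database. This is a minimal sketch, assuming getent is available on the host; the helper name uid_to_name is hypothetical.

```shell
# Print the account name for a numeric UID, or "unassigned" if the host's
# passwd database has no entry for it. uid_to_name is a hypothetical helper.
uid_to_name() {
    # getent exits non-zero when the key is not found.
    entry=$(getent passwd "$1") || { echo "unassigned"; return 0; }
    # A passwd entry is colon-separated; the first field is the account name.
    echo "${entry%%:*}"
}

# Example: check the UID seen in this issue.
#   uid_to_name 33
```

If the host reports no account for the UID, the owner most likely comes from a user that exists only inside a container image. UID 33 is, for instance, the www-data user in Debian-based images.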

haleksandre commented 1 year ago

This is what I suspected. I had forgotten that I quickly made an OwnCloud quadlet that mounted /etc and /var, and I did not take the time to configure the container properly. This container is most likely the culprit. Thanks!
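For anyone hitting the same symptom, a rough way to spot such a quadlet is to search the unit files for Volume= lines that bind-mount all of /etc or /var. This is a sketch under assumptions: the helper name find_broad_mounts is hypothetical, and the pattern only catches mounts written as a literal /etc or /var host path.

```shell
# Print Volume= lines in quadlet files that bind-mount the whole of /etc or /var.
# find_broad_mounts is a hypothetical helper name used for illustration.
find_broad_mounts() {
    quadlet_dir="$1"
    # Matches "Volume=/etc:..." and "Volume=/var/:...", but not mounts of
    # subdirectories such as "Volume=/etc/nginx/conf.d/:...".
    grep -rnE '^[[:space:]]*Volume=/(etc|var)/?:' "$quadlet_dir"
}

# Example, using the quadlet directory from the config in this issue:
#   find_broad_mounts /etc/containers/systemd
```

Mounting only the specific subdirectories a container needs, as the nginx quadlet in the config does, avoids giving a container the chance to change ownership of the whole tree.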