cgwalters commented 8 years ago

ostree and toplevel directories

ostree sets the immutable bit on / (chattr +i /); the rationale is to keep all state in /etc and /var. However, some use cases need toplevel mount directories for compatibility.

Today, a workaround is to run chattr -i / from a systemd unit early in boot, for example:

[Unit]
DefaultDependencies=no
After=local-fs-pre.target
[Service]
ExecStart=chattr -i /
Type=oneshot
RemainAfterExit=yes

(Then be sure to order any .mount units after this unit)
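
A minimal sketch of that ordering step, assuming the unit above is installed as ostree-unlock-root.service (a hypothetical name; the comment does not name it) and the mount is at /vagrant, for which systemd expects a unit called vagrant.mount. The helper writes a drop-in instead of editing the mount unit itself; the directory argument exists only so the function can be tried outside /etc/systemd/system.

```sh
#!/bin/sh
# Hypothetical helper: order a .mount unit after the service that clears
# the immutable bit, using a systemd drop-in (<unit>.d/*.conf).
# $1 = unit directory (e.g. /etc/systemd/system)
# $2 = mount unit name (e.g. vagrant.mount for a mount at /vagrant)
# $3 = service name (assumed here: ostree-unlock-root.service)
order_mount_after() {
    unit_dir=$1 mount_unit=$2 service=$3
    mkdir -p "$unit_dir/$mount_unit.d" || return 1
    cat > "$unit_dir/$mount_unit.d/10-after-unlock.conf" <<EOF
[Unit]
Requires=$service
After=$service
EOF
}
```

It would be called as order_mount_after /etc/systemd/system vagrant.mount ostree-unlock-root.service, followed by systemctl daemon-reload.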

This is a subset of https://github.com/projectatomic/rpm-ostree/issues/233

For this, all we need to do is take new empty toplevel directories from the RPM content so that one can use them as mount points.

dustymabe commented 7 years ago

@cgwalters would this help the vagrant use case where I want to create an empty toplevel directory /vagrant/ and mount a network filesystem to that location?

cgwalters commented 7 years ago

Yes.

cgwalters commented 7 years ago

Another option is to support specifying them in the treefile, but that gets ugly.

dustymabe commented 7 years ago

Not entirely the same thing as your original description and probably harder to pull off, but....

It would be nice to be able to dynamically create a toplevel mount point that can be used on an Atomic Host. One example use case is vagrant-sshfs, where the user can specify where they want the mount point to be inside the Vagrant box. I tend to use /sharedfolder/ a lot.

It would be acceptable if I first had to run a command to enable the empty toplevel mount point before attempting to mount it.

cgwalters commented 7 years ago

Today one can chattr -i /. Though a tricky thing I suspect is that since vagrant-sshfs runs before provisioning, that logic would have to be part of vagrant-sshfs itself.

That said...it's worth backing up and looking at the rationale for the ostree design here; basically, as a system administrator, you can know that all of the system state is underneath /etc and /var. You have just two places to back up. Backup systems don't have to e.g. traverse / but exclude /proc and /sys, etc.

On the flip side, the rationale is weaker if you have network mounts involved - you probably don't want to back up remote mounts.

One thing to note with the chattr approach is that the directory won't persist across upgrades. But on the other hand if you just mkdir it on boot and remount it, that's not a problem. Basically, toplevel network mounts are OK.

dustymabe commented 7 years ago

Are there any unknown risks involved with using chattr -i / on Atomic Host? Would there be any problem with me adding that as a feature in vagrant-sshfs? Something like:

if want to mount a directory under `/`; then
    chattr -i /
fi

cgwalters commented 7 years ago

You're asking about known unknowns I guess? :smile: Not aware of any offhand. The only thing I can think of is that it will allow other processes to create stuff there too, but...eh. As far as implementation goes, I would do something like:

if want mount in /; then
  if test -f /run/ostree-booted; then
    chattr -i / || true
  fi
fi
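
A runnable version of that idea, assuming the requested mount path is passed as a shell argument. The function names are hypothetical and not part of vagrant-sshfs; only the /run/ostree-booted check and chattr -i / come from the comment above.

```sh
#!/bin/sh
# Succeeds only when the path is a direct child of /, e.g. /vagrant.
is_toplevel_path() {
    p=${1%/}                 # tolerate one trailing slash, e.g. /vagrant/
    case "$p" in
        /*/*) return 1 ;;    # nested path such as /var/mnt/data
        /?*)  return 0 ;;    # direct child of /
        *)    return 1 ;;    # "/", empty, or a relative path
    esac
}

# Clear the immutable bit on / only when it is needed and only on
# ostree-booted systems; a chattr failure is deliberately ignored.
unlock_root_for() {
    if is_toplevel_path "$1" && test -f /run/ostree-booted; then
        chattr -i / || true
    fi
}
```

Calling unlock_root_for /sharedfolder/ before the mkdir and mount steps leaves non-ostree hosts and nested paths such as /var/mnt/data untouched.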

dustymabe commented 7 years ago

> One thing to note with the chattr approach is that the directory won't persist across upgrades.

Just realized this. Would it be worth having a config file where a user can configure empty toplevel mounts, plus a systemd unit that creates these mounts (defined in the config file) on boot?

redbaron commented 5 years ago

Another use case is the Nix package manager, which requires a /nix mount point.

HeikoOnnebrink commented 4 years ago

I also ran into the same issue and reported it at https://lists.fedoraproject.org/archives/list/coreos@lists.fedoraproject.org/thread/F2D2TMYJFRU3H24RJJNHCN3QGJPACXXU/

Once we introduce Fedora Container Linux among all the existing CoreOS systems, which have their data mounted at a directory in the root, it would cause us a lot of extra work to distinguish systems that mount in the root directory from systems that mount at /var/lib/somefolder. (Background: we have lots of APIs that talk to all these systems via the Docker remote API and pass volume mounts to the container start calls, assuming the data folder is /dockerdata.)

All we need is an empty directory in / on which to mount disks. Would it be possible to allow directories that carry an immutable attribute in the Ignition config?

Like

storage:
  directories:
    - path: /dockerdata
      immutable: true

and then allow these directories to be created with the +i attribute.

That would do the job for our use case.

cgwalters commented 4 years ago

> just realized this.. would it be worth it to have a config file where a user can configure empty top level mounts and a systemd unit that creates these mounts (defined in the config file) on boot?

We could do this with some fcct sugar perhaps; the main thing, I think, is requiring that these directories be mount points, because we don't want people to lose data.

paolope commented 4 years ago

The workaround I've implemented is creating a mount-prepare.service unit that does what @dustymabe suggests:

[Unit]
Description=Prepare mount points
Before=remote-fs-pre.target
Wants=remote-fs-pre.target

[Service]
Type=oneshot
ExecStartPre=chattr -i /
ExecStart=/bin/sh -c "[ -d '{{ mount_point }}' ] || mkdir -p '{{ mount_point }}'"
ExecStopPost=chattr +i /

[Install]
WantedBy=remote-fs.target

I guess this and the appropriate symlink (/etc/systemd/system/remote-fs.target.wants/mount-prepare.service -> /etc/systemd/system/mount-prepare.service) could be provisioned by Ignition (I'm using Ansible).

They should survive reboots and Zincati upgrades.

paolope commented 4 years ago

However, I have a question (@lucab maybe can help?); suppose the mounts are in use, would Zincati/OSTree try to wipe their contents to bring the filesystem to the required state during an upgrade?

cgwalters commented 4 years ago

> Zincati/OSTree try to wipe their contents to bring the filesystem to the required state during an upgrade?

EDIT: Currently...no, ostree will only delete content from non-booted deployments, which won't have it mounted. But to be extra safe, you should instead do something like this:

ExecStart=/bin/sh -c "[ -L '/{{ mount_point }}' ] || ln -sr '/var/mnt/{{ mount_point }}' '/{{ mount_point }}'"

i.e. the only thing that exists in / is a symlink to /var/mnt (or something underneath /var anyways).

That way clients can also access the mount point via an OSTree-compatible path.

Or to rephrase, keep your state in /var and we're just doing this "symlink in /" for backcompat with older clients.
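
A minimal sketch of that layout as a standalone script, under the assumption that the real mount point lives at /var/mnt/<name>. The function name and the CHATTR override are hypothetical and exist only so the logic can be tried in a scratch directory without root.

```sh
#!/bin/sh
# CHATTR may be overridden (e.g. CHATTR=true) to try this without root.
: "${CHATTR:=chattr}"

# Hypothetical helper: keep the real mount point under /var/mnt and leave
# only a relative symlink in the root directory.
# $1 = root directory (normally /), $2 = name, e.g. dockerdata
link_toplevel_to_var() {
    root=${1%/} name=$2
    mkdir -p "$root/var/mnt/$name" || return 1
    [ -L "$root/$name" ] && return 0      # symlink already in place
    "$CHATTR" -i "$root/" || return 1
    ln -s "var/mnt/$name" "$root/$name"
    rc=$?
    "$CHATTR" +i "$root/"                 # restore the bit even if ln failed
    return "$rc"
}
```

With link_toplevel_to_var / dockerdata, older clients keep using /dockerdata while the mount itself targets /var/mnt/dockerdata.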

jorhett commented 2 years ago

is there a reason this hasn't gotten merged?

lucab commented 2 years ago

@jorhett this is a bug report and not a PR, so it can't really be "merged". Are you maybe looking at some specific PR? Which top-level missing directory concerns you? What's your usecase?

jorhett commented 2 years ago

Sorry for my choice of language. I was just trying to understand whether this need has been tackled yet, and if not, why not.

BreiteSeite commented 2 years ago

@paolope nice unit file. :) I modified it a bit to be more portable:

It can be installed either manually as /etc/systemd/system/mount-prepare@.service or via systemctl edit --force --full 'mount-prepare@.service'.

[Unit]
Description=Prepare mount points
Before=remote-fs-pre.target
Wants=remote-fs-pre.target

[Service]
Type=oneshot
ExecStartPre=chattr -i /
ExecStart=/bin/sh -c "[ -d '%f' ] || mkdir -p '%f'"
ExecStopPost=chattr +i /

[Install]
WantedBy=remote-fs.target

Then you can enable one service instance for every directory that needs to be created. For example, systemctl enable mount-prepare@foo will create a /foo directory, and systemctl enable mount-prepare@foo-bar will create /foo/bar.
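
Because %f turns every unescaped "-" in the instance name back into "/", paths are easier to convert with systemd-escape than by hand. A small sketch; the wrapper function name is hypothetical, while the template name is the one used above.

```sh
#!/bin/sh
# Print the instance of mount-prepare@.service that corresponds to a path.
# systemd-escape --path turns "/" separators into "-" and escapes literal
# dashes, which is the inverse of what %f does inside the unit.
unit_for_path() {
    systemd-escape --template=mount-prepare@.service --path "$1"
}
```

unit_for_path /foo/bar prints mount-prepare@foo-bar.service, whereas a directory literally named /foo-bar needs the escaped instance mount-prepare@foo\x2dbar.service.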

Ninja-Edit: actually, why is remote-fs used instead of local-fs? ostree-remount.service runs before local-fs.target, so it should be fine to use local-fs.target and just declare After=ostree-remount?

Ninja-Edit 2:

So I needed that for snap. My solution now looks like this:

[pi@rpi ~]$ systemctl cat mkdir-rootfs@
# /etc/systemd/system/mkdir-rootfs@.service
[Unit]
Description=Enable mount points in / for ostree
DefaultDependencies=no
ConditionPathExists=!%f

[Service]
Type=oneshot
ExecStartPre=chattr -i /
ExecStart=mkdir -p '%f'
ExecStopPost=chattr +i /

[pi@rpi ~]$ systemctl cat snap.mount
# /etc/systemd/system/snap.mount
[Unit]
After=mkdir-rootfs@snap.service
Wants=mkdir-rootfs@snap.service
Before=snapd.socket

[Mount]
What=/var/lib/snapd/snap
Where=/snap
Options=bind
Type=none

[Install]
WantedBy=snapd.socket

Works like a charm. :-)

jlebon commented 1 year ago

Strawman: Add new sysroot.toplevel-dirs knob of type string list. Teach libostree to read this knob and create the toplevel dirs whenever a new deployment is created.

With a read-only sysroot (which is the default in FCOS, and soon will be in FSB), this can only be really useful as a mountpoint, ensuring that users aren't able to store data directly into the deployment root (which would get lost). Edit: But actually, the deployment root is currently still mounted writable but I think we could make it read-only. But probably simpler for this to just also add the immutable bit on those dirs too.

Edit: first boot would have to be special-cased since the deployment would already exist; or maybe simplest we handle this in ostree-prepare-root like we do sysroot.read-only?

jlebon commented 1 year ago

https://github.com/ostreedev/ostree/pull/2681 adds support for top-level symlinks. https://github.com/coreos/fedora-coreos-config/pull/1879 adds support for top-level symlinks on first boot configured via Ignition.

yajo commented 1 year ago

Please bear in mind that symlinks won't fix the problem.

One specific use case here is to install nix. It tries to achieve build purity everywhere. If /nix is a symlink, then it can yield impure results.

See the docs about it: https://nixos.org/manual/nix/stable/command-ref/env-common.html#env-NIX_IGNORE_SYMLINK_STORE

jlebon commented 1 year ago

Yup, the approach supports directories and it wouldn't be hard to add them. The reason I didn't for now is that it's harder to support in the downstreams I work on (see commit message). I think it's surmountable though. Regardless of what we do there, it'd make sense to support in libostree.

cgwalters commented 8 months ago

I think I mentioned this elsewhere too but: Another thing that might make sense is to switch to a tmpfs for / by default; we'd mount in /usr, /etc and /var, and create copies of the "base stuff" from the rootfs (e.g. /proc, the /lib -> /usr/lib symlink etc.). We'd probably still have an immutable bit on / by default but anyone who wanted to create empty toplevel directories and such could just chattr -i / and use systemd mount units to create directories there on boot.

We could also add an option to just turn off the immutable bit by default for admins who Know What They're Doing.

travier commented 8 months ago

Just for information, the solution in https://github.com/coreos/rpm-ostree/issues/337#issuecomment-1000923022 is racy (see https://github.com/containers/podman/pull/20612).

Until there is proper support in rpm-ostree/ostree, you have two options:

1. Set up all directories in a single unit.
2. Create 3 units that are strictly ordered one after the other:
   - the first unit does the chattr -i
   - then the second unit is a template unit that does the mkdir for each mount point
   - then the last does the chattr +i
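
The single-unit option can be sketched as one script run from a single oneshot unit, so that one chattr -i and one chattr +i bracket every mkdir and no parallel instance can restore the bit in between. The function name and the CHATTR override are assumptions for illustration, not part of any existing tool.

```sh
#!/bin/sh
# CHATTR may be overridden (e.g. CHATTR=true) to try this without root.
: "${CHATTR:=chattr}"

# $1 = root directory (normally /); remaining arguments = directory names.
make_toplevel_dirs() {
    root=${1%/}
    shift
    "$CHATTR" -i "$root/" || return 1
    rc=0
    for name in "$@"; do
        mkdir -p "$root/$name" || rc=1
    done
    "$CHATTR" +i "$root/" || rc=1     # always restore the immutable bit
    return "$rc"
}
```

A unit's ExecStart= script would then call, for example, make_toplevel_dirs / snap nix.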

cgwalters commented 6 months ago

Note that https://github.com/ostreedev/ostree/pull/3114 effectively adds support for this (among other things)

cgwalters commented 6 months ago

For custom edge image builds, I think it's just a matter of creating the directories for each target mount point specified as part of the ostree (image) build (instead of trying to create them at runtime).

coreos / rpm-ostree

Support empty toplevel mount points #337
