containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

`systemctl reboot` inside podman machine clears volume mounts #15976

Open iamkirkbater opened 2 years ago

iamkirkbater commented 2 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I'm running Podman on a new M1 Mac, and I need to enable the qemu-user-static package in order to support multiple architectures. I've noticed that when I systemctl reboot from inside the podman machine, my defined mounts are no longer present once the machine comes back online.

Steps to reproduce the issue:

  1. podman machine init -v $HOME:$HOME
  2. podman machine ssh
  3. ls / - note that you see a Users directory
  4. sudo systemctl reboot
  5. wait for the machine to reboot and podman machine ssh
  6. ls / - note there is no longer a Users directory

Describe the results you received: The mounts defined as part of the init process or in containers.conf are not present when rebooting from within the machine.

Describe the results you expected: I'd expect the mounts to still be present after a reboot.

Additional information you deem important (e.g. issue happens only occasionally): The workaround is simple: just run podman machine stop && podman machine start. But it's still an extra step when you'd expect the machine to come back online with the correct mounts.

Output of podman version:

Client:       Podman Engine
Version:      4.2.1
API Version:  4.2.1
Go Version:   go1.18.6
Built:        Tue Sep  6 15:16:02 2022
OS/Arch:      darwin/arm64

Server:       Podman Engine
Version:      4.2.0
API Version:  4.2.0
Go Version:   go1.18.4
Built:        Thu Aug 11 10:43:11 2022
OS/Arch:      linux/arm64

Output of podman info:

host:
  arch: arm64
  buildahVersion: 1.27.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.4-2.fc36.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.4, commit: '
  cpuUtilization:
    idlePercent: 96.19
    systemPercent: 3.27
    userPercent: 0.53
  cpus: 1
  distribution:
    distribution: fedora
    variant: coreos
    version: "36"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 5.19.9-200.fc36.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 1717428224
  memTotal: 2051633152
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.6-2.fc36.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.6
      commit: 18cf2efbb8feb2b2f20e316520e0fd0b6c41ef4d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.aarch64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 6m 47.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106825756672
  graphRootUsed: 2307162112
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/501/containers
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.2.0
  Built: 1660228991
  BuiltTime: Thu Aug 11 10:43:11 2022
  GitCommit: ""
  GoVersion: go1.18.4
  Os: linux
  OsArch: linux/arm64
  Version: 4.2.0

Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

> brew info podman
==> podman: stable 4.2.1 (bottled), HEAD
Tool for managing OCI containers and pods
https://podman.io/
/opt/homebrew/Cellar/podman/4.2.1 (178 files, 48MB) *
  Poured from bottle on 2022-09-27 at 12:02:00
From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/podman.rb
License: Apache-2.0
==> Dependencies
Build: go-md2man, go@1.18
Required: qemu
==> Options
--HEAD
    Install HEAD version
==> Caveats
zsh completions have been installed to:
  /opt/homebrew/share/zsh/site-functions
==> Analytics
install: 25,576 (30 days), 63,893 (90 days), 211,139 (365 days)
install-on-request: 24,798 (30 days), 62,485 (90 days), 209,434 (365 days)
build-error: 1 (30 days)

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.): This is running on an M1 Pro MacBook.

I'm happy to provide any additional details as needed. Thanks so much for all of your work on this; I'm excited that this is actually a viable alternative to Docker now!

mheon commented 2 years ago

@baude @ashley-cui PTAL

Luap99 commented 2 years ago

I think this is the expected behaviour given that we currently manually make the mount calls via ssh after start, see https://github.com/containers/podman/blob/dca5ead2d7ad8ac3b14fed6736c102b571d8baf1/pkg/machine/qemu/machine.go#L661-L690

If you want to fix this, you have to set up the mounts inside the VM at boot. I think you could create the proper systemd units via Ignition or append lines to /etc/fstab. Contributions welcome.
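
For illustration, a unit generated that way might look roughly like the sketch below. This is only a sketch: the 9p tag (vol0), the mount options, and the unit name are assumptions, not values taken from the Podman source.

# /etc/systemd/system/Users.mount -- the file name must match the escaped mount path
[Unit]
Description=Mount the host home share at /Users

[Mount]
# "What" is the virtio-9p mount tag the VM exposes for the shared directory (assumed name)
What=vol0
Where=/Users
Type=9p
Options=trans=virtio,version=9p2000.L

[Install]
WantedBy=multi-user.target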

iamkirkbater commented 2 years ago

Thanks for pointing me in the right direction. I'll see what I can do with the time I have to play with this.

afbjorklund commented 2 years ago

The implementation assumes "stop" and "start"; someone converted the ssh commands to a systemd service, which would make them run at boot. Possibly making it harder to mount and umount while running?

iamkirkbater commented 2 years ago

I'm sorry, I'm not quite sure I'm following that last comment. So the implementation is designed around an explicit podman machine stop && podman machine start; I think I understand that part. I'm not quite understanding the second part, though: do you mean that someone deviated from the original intention, or that someone could deviate from the original intention by using the systemd service?

afbjorklund commented 2 years ago

It was mostly an implementation detail. I think I abstracted the mount and umount into functions, so the same would be needed with the systemd service. If it is generated by the Ignition config, then it would be less obvious how to manipulate it at runtime.

EDIT: no, I must have been dreaming up that part. Basically, the original machine never supported reboot properly, and the quick-and-dirty implementation just followed. Setting up a service to re-mount at boot would be the proper way to go*.

* it would also avoid a whole lot of quirks when waiting for ssh to first come online, especially in the mac version

the old systemd implementation I was thinking of was from https://github.com/containers/podman/issues/8016#issuecomment-1034091359 (the actual code was https://github.com/containers/podman/commit/6412ed9fc13835e197ef8165bec1a3825fb3d239)

iamkirkbater commented 2 years ago

Fair enough. I just noticed this happened overnight: when I went to create a container this morning, it failed with a statfs /Users/me/path/to/file: no such file or directory error, and when I looked at the tab where I was logged into the podman machine, I saw the following:

[core@localhost ~]$
Broadcast message from Zincati at Tue 2022-10-04 15:12:24 UTC:
New update 36.20221001.2.0 is available and has been deployed.
If permitted by the update strategy, Zincati will reboot into this update when
all interactive users have logged out, or in 10 minutes, whichever comes
earlier. Please log out of all active sessions in order to let the auto-update
process continue.
[core@localhost ~]$
Broadcast message from Zincati at Tue 2022-10-04 15:21:24 UTC:
New update 36.20221001.2.0 is available and has been deployed.
If permitted by the update strategy, Zincati will reboot into this update when
all interactive users have logged out, or in 1 minute, whichever comes
earlier. Please log out of all active sessions in order to let the auto-update
process continue.
Connection to localhost closed by remote host.

So I'm hoping I can get to this sooner rather than later, especially since the backing podman machine will reboot on its own, which then causes container runs that previously worked to fail.

I'm lucky I saw the error message, because I didn't run the reboot myself this time and I would have been super confused about why the mounts I had set up suddenly disappeared.

iamkirkbater commented 2 years ago

I had a few moments to poke at this last Friday and played around with the idea of dynamically writing /etc/fstab as part of the podman machine start command. To test this, I manually wrote the /etc/fstab file on the machine and then ran sudo reboot. This worked: the new volumes came back up on reboot! However, when I then attempted podman machine stop && podman machine start, the subsequent podman machine start command failed because the volumes already existed.
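
For illustration, the kind of /etc/fstab entry involved looks roughly like the line below (the 9p tag and options here are assumptions for the sketch, not the exact line used):

vol0  /Users  9p  trans=virtio,version=9p2000.L  0  0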

I think what I'll need to do in order to get this working is modify the existing podman machine init command to add any mounts to the ignition config file and have the ignition config both create the directory structure the user is looking for as well as the /etc/fstab file in order to automatically mount them on startup. Then I'll have to remove the ssh-mount commands from the start command.

The only thing that worries me about this is that for existing podman-machines, if the user updates the podman remote and then starts their podman machine back up, none of their already-configured mounts will come up. I'm wondering if there's some kind of annotation that could be added to the machine itself somehow as part of the new init and then we could mount via ssh if that annotation doesn't exist to preserve backwards compatibility.

My next step is to learn more about Ignition configs and get that part working first; then I'll submit a draft PR and we can have further discussion once we get to that point.
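
For orientation, the same kind of unit could be delivered through the Ignition config. In Butane form (the human-readable format that compiles to Ignition JSON; Podman itself writes the JSON directly) a rough sketch would be something like this, with the unit name and contents mirroring the illustrative mount unit above:

variant: fcos
version: 1.4.0
systemd:
  units:
    - name: Users.mount
      enabled: true
      contents: |
        [Unit]
        Description=Mount the host home share at /Users
        [Mount]
        What=vol0
        Where=/Users
        Type=9p
        Options=trans=virtio,version=9p2000.L
        [Install]
        WantedBy=multi-user.target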

afbjorklund commented 2 years ago

I think systemd handles /etc/fstab, like it handles everything else (soon)

https://www.freedesktop.org/software/systemd/man/systemd.mount.html
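
One detail that matters when writing .mount units by hand: the unit file name has to match the mount path after systemd escaping, which systemd-escape can generate for you, e.g.:

$ systemd-escape -p --suffix=mount /Users/kbater
Users-kbater.mount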

iamkirkbater commented 2 years ago

I'm hoping to get some time to work on it this afternoon; I'm just wondering what the "(soon)" meant. Is this already being worked on elsewhere, or is it still worthwhile for me to work on it? I'd hate to spend a bunch of time trying to implement something if someone else is already working on it.

iamkirkbater commented 2 years ago

This comment is mostly just notes to myself as to where I am and what I've tried for when I get a chance to pick this back up.

I got some time to play around with systemd this morning and spent some time trying to dynamically create the systemd target.mount files: https://github.com/containers/podman/compare/main...iamkirkbater:podman:add-systemd-init

The main takeaways I gleaned from this:

1. On the machine itself, when I could get in (see point 2 below), I could not get the systemd mount to work. I think this has to do with the aforementioned immutable filesystem. As part of my debugging I originally didn't name the unit Users-kbater.mount; I had named it vol0.target, so it didn't work. I then tried getting this working on the machine itself via podman machine ssh, got the naming right, and received the following error when viewing the logs:
    
    $ sudo systemctl reboot

    $ podman machine ssh

    $ sudo systemctl status Users-kbater.mount
    Oct 12 11:29:11 localhost mount[716]: mount: /Users/kbater: operation permitted for root only.
    Oct 12 11:29:11 localhost systemd[1]: Users-kbater.mount: Failed to check directory /Users/kbater: No such file or directory
    Oct 12 11:29:11 localhost systemd[1]: Mounting Users-kbater.mount - UserVolume Mount...
    Oct 12 11:29:11 localhost systemd[1]: Users-kbater.mount: Mount process exited, code=exited, status=1/FAILURE


2. After playing around with that a bit and not really getting anywhere, I started trying to just get the mount file automated, planning to circle back to the permissions issue. That's the code you see above, but now it won't even let me ssh into the VM; I get this error:

$ podman machine ssh kirk
Connecting to vm kirk. To close connection, use ~. or exit
Received disconnect from 127.0.0.1 port 49458:2: Too many authentication failures

~I'm thinking the authentication errors are due to the mount file failing, but I'm not sure how to really debug any deeper and figure that out. I could probably try to build my ssh keys directly into the instance's ignition config and then use that set of keys to try again, but I'm not sure if that would do anything other than just point out that the first error is still happening.~

I no longer believe the above paragraph, as it's still happening without any of my changes. I wonder if it's a transient networking issue with my workstation. Edit again: it was a transient issue with my workstation; rebooting fixed it.


Where I'm going from here: the systemd mount files don't seem to be doing what we want, but when I played with /etc/fstab the other day I noticed that it worked just fine on reboot (and would actually make subsequent start commands fail), so I'm thinking I might go back to that route, but via the Ignition file. The FCOS VM doesn't really like it when systemd creates the mount files, but maybe if it's in /etc/fstab it would work again?

What's really confusing me at this point is why the command to disable the filesystem integrity stuff works via ssh, but a reboot then deletes the mountpoint. Maybe that's what the ssh command that flips the integrity stuff back on is doing, though.

It doesn't look like there's a way to create a folder in the root directory via ignition, as the FCOS Ignition docs seem to hint that you can't: https://docs.fedoraproject.org/en-US/fedora-coreos/storage/#_immutable_read_only_usr.

So if the /etc/fstab approach also doesn't work, I'm wondering if there needs to be a special mapping layer in podman-machine that checks / first and then falls back to /mnt for paths that aren't present in /. For example, if I'm trying to mount a volume with -v /Users/kirk/Projects/podman:/podman in the container, it would first check /Users/kirk on the VM and then /mnt/Users/kirk, etc. I'm not specifically saying this is the way to go; I'm mostly brain dumping this now so I can think about other things today and not have it bouncing around, haha.

Thanks for tolerating me :)

iamkirkbater commented 2 years ago

And of course I couldn't help myself but power through. I ended up figuring out how to get past the immutable filesystem stuff by using the same commands that the ssh bits use, just run as a prerequisite to the mount. I had to add Restart=on-failure to the mount itself, because even with the Before directive on the mkdir service it would sometimes still complain about the mountpoint not existing.

Because these live in the /etc/systemd/system directory, I believe they should persist through ostree updates.
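
For anyone following along, the shape of that prerequisite unit is roughly the following sketch (the unit names, paths, and exact chattr/mkdir commands are illustrative, not the literal contents of the draft PR):

# /etc/systemd/system/mkdir-Users-kbater.service (hypothetical name)
[Unit]
Description=Create the /Users/kbater mountpoint on the immutable FCOS root
Before=Users-kbater.mount

[Service]
Type=oneshot
RemainAfterExit=yes
# temporarily lift the immutable flag on /, create the directory, then restore the flag
ExecStart=/usr/bin/chattr -i /
ExecStart=/usr/bin/mkdir -p /Users/kbater
ExecStart=/usr/bin/chattr +i /

[Install]
WantedBy=multi-user.target

# and in Users-kbater.mount, the [Unit] section pulls it in and orders after it:
# Requires=mkdir-Users-kbater.service
# After=mkdir-Users-kbater.service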

I have a few todos left on that draft PR, like adding more unit tests and some comments. I think the biggest problem with this will still be deciding how y'all want to "migrate" existing VMs; I'm not sure what the appropriate path forward is, and I'd be happy to help develop whatever that would be.

If there's any immediate feedback, I'm usually around on Slack. Thanks!

github-actions[bot] commented 1 year ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 1 year ago

@ashley-cui Is this still broken?

ashley-cui commented 1 year ago

Still an issue, probably needs some design discussion on dynamic mounts

rhatdan commented 1 year ago

I thought someone added a PR to put the mounts in as systemd unit files, which would fire on reboot?

ashley-cui commented 1 year ago

Draft PR: https://github.com/containers/podman/pull/16143

iamkirkbater commented 1 year ago

Yeah, that was me who started that PR. The problem I'm having with it right now is inconsistency: sometimes the systemd files all fire in order, sometimes it takes multiple reboots. Once they're all applied they run fine, seemingly forever, but having to fight the podman machine on your first startup isn't a better UX at all.

I just haven't had a chance to poke at it again since the last update on it.

This is my first venture into systemd, so I could just be missing something simple.

xiaobitipao commented 1 year ago

Since the mounts are applied after start, restarting the machine after the reboot should solve the problem, although it's a bit wordy:

podman machine ssh sudo systemctl reboot
podman machine stop
podman machine start

github-actions[bot] commented 1 year ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 1 year ago

@iamkirkbater are you still working on this?

iamkirkbater commented 1 year ago

👋🏼 I haven't had a chance to work on this for a while due to conflicting priorities. I've just been restarting my podman machine with podman machine stop && sleep 3 && podman machine start but haven't had a chance to circle back to this yet.

iamkirkbater commented 1 year ago

I also think that I'm at the limits of my systemd knowledge with #16143 and I'm not sure if I'd be able to make any more progress with what I have there.

github-actions[bot] commented 1 year ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 1 year ago

@Luap99 @ashley-cui @baude Is this still an issue?

Luap99 commented 1 year ago

yes

wnm3 commented 1 year ago

Related issue on macOS: using /etc/synthetic.conf I have a symlink from ~/store -> /store, and attempting to mount /store/pipeline with the -v CLI option on the run command fails with the same Error: statfs /store/pipeline: no such file or directory error, which keeps the container from running.
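
For context, the /etc/synthetic.conf entry that creates such a root-level link looks roughly like this (the username is made up for the sketch; the two fields must be separated by a tab, and the target path is given relative to the root):

store	Users/someuser/store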

BlaineEXE commented 1 year ago

Does anyone know whether this issue may be related to podman hanging often when my MacBook wakes from sleep? I have also been experiencing this mount issue, but I only notice it in the few instances where podman isn't hanging after I step away for a while.

Whenever I open my laptop, any "real" podman command hangs (e.g., podman images or podman ps), and I have to podman machine stop && sleep 1 && podman machine start to get podman to run again.

I have theorized that the podman hanging issue may be related to the mount issue, presuming that the socket mount is failing when podman hangs, but I'm not sure if that's the case. Could anyone suggest what the best ways for me to debug the hanging issue would be?

F30 commented 11 months ago

This is a really annoying issue given that it also affects automatic updates of the Podman machine. (I assume it's the same bug).

Zincati will apply updates to the machine at random times by default (or at defined times if you configure a custom update strategy). As part of that, the machine will get rebooted and all mounts will be lost, e.g. leading to VS Code dev containers crashing.

wnm3 commented 11 months ago

I'd received a solution in this issue for enabling the mounting of special symbolic links, which requires additional commands when creating the podman machine. It works fine until I reboot or the machine is restarted. It turns out the settings I needed can be added to a containers.conf file.

Until I learned this, I used the script below to reinitialize podman's machine:

#! /bin/bash
echo "Answer yes below to remove the existing machine"
podman machine rm
podman machine init -v /Users:/Users -v /private:/private -v /var/folders:/var/folders -v /store:/store
podman machine start

Allowing some additional mount commands to be run when initializing the machine might allow this issue to be solved as well.
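
For reference, the containers.conf settings mentioned above go in the [machine] section of the client-side containers.conf (e.g. ~/.config/containers/containers.conf on the Mac) and are picked up when the machine is created with podman machine init. A minimal sketch, with the host paths from the script above:

[machine]
volumes = ["/Users:/Users", "/private:/private", "/var/folders:/var/folders", "/store:/store"]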