Closed rkkoszewski closed 5 years ago
Removing the patch 0003-docker-fix-problem-stopping-container.patch seems to solve the problem for me. Can you confirm?
The patch has been included upstream and is needed for running in Docker. Hopefully it can be improved so that it works in both LXC/LXD and Docker.
Hi @mikma, thanks for looking into this. I had a look at the upstream patch: https://git.openwrt.org/?p=project/procd.git;a=commitdiff;h=832369078d818d19ab64051fdc8da9e06c90ad88
I think it must be because of the missing reboot event when running from a container. One idea would be to add reboot(reboot_event); before the exit(0); call (it should not trigger a kernel panic). I will test that out tomorrow.
EDIT:
Just tested the change and reboot is working now. Will submit a PR in a moment. Shutdown also still works as expected. This should also work fine for Docker, but I have not tested it.
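For illustration, here is a minimal sketch of the idea of calling reboot() before exit(0). It is not the actual procd patch: the function names is_namespace_reboot_cmd and finish_shutdown are hypothetical, and the list of accepted commands is taken from the "Behavior inside PID namespaces" section of reboot(2), assuming Linux 3.4 or later.

```c
#include <linux/reboot.h>   /* LINUX_REBOOT_CMD_* constants */
#include <stdlib.h>         /* exit() */
#include <sys/reboot.h>     /* reboot() */

/*
 * Returns 1 if cmd is one of the values that reboot(2) accepts inside a
 * non-initial PID namespace, 0 otherwise. (Hypothetical helper.)
 */
int is_namespace_reboot_cmd(unsigned int cmd)
{
    switch (cmd) {
    case LINUX_REBOOT_CMD_RESTART:
    case LINUX_REBOOT_CMD_RESTART2:
    case LINUX_REBOOT_CMD_POWER_OFF:
    case LINUX_REBOOT_CMD_HALT:
        return 1;
    default:
        return 0;
    }
}

/*
 * Hypothetical last step of a container init process.
 * WARNING: when run as root outside a container, this really reboots or
 * halts the machine.
 */
void finish_shutdown(unsigned int reboot_event)
{
    /* In a container, a successful call terminates init and does not return. */
    if (is_namespace_reboot_cmd(reboot_event))
        reboot((int)reboot_event);

    /* Fallback if reboot() failed or the event was not applicable. */
    exit(0);
}
```

With this shape, the exit(0) is only reached when reboot() fails, so a plain exit stays available as the fallback path.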
This issue is still present for me. I have built a new image, made a test container, but reboot still shuts down the container permanently.
What I did:
./build.sh -p "luci-theme-material luci-app-adblock luci-app-ddns luci-app-wol iptables-mod-checksum"
lxc image import bin/openwrt-18.06.4-x86-64-lxd.tar.gz --alias openwrt-18.06.4
lxc launch openwrt-18.06.4 router-test -c security.privileged="true"
lxc exec router-test passwd root
Then I tried to reboot from inside the container (reboot now command and LuCI interface).
I'm using the latest version of your script:
git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
Container config:
lxc config show router-test
architecture: x86_64
config:
  image.architecture: x86_64
  image.description: OpenWrt 18.06.4 r7808-ef686b7292
  image.os: OpenWrt
  image.release: 18.06.4
  security.privileged: "true"
  volatile.base_image: b548b330fc144bc4d4f07e3fe4469edf8839aaa71cb17f81207ed55be24a788f
  volatile.eth0.hwaddr: 00:16:3e:a3:3f:a9
  volatile.idmap.base: "0"
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: br0
    type: nic
ephemeral: false
profiles:
- default
stateful: false
description: ""
lxc --version
3.0.3
EDIT:
I have tested the Alpine image; the reboot now command works fine in it.
EDIT 2:
I have removed the security.privileged: "true" setting and updated to lxc 3.16, but still no success.
I have built a new image, made a test container, but reboot still shuts down the container permanently.
It should work. Have you tried deleting bin/, build_dir/ and dl/, or starting from a fresh git clone? Changes to the patches won't automatically cause the procd package to be rebuilt, which means you may be using a procd package built from an older version of the patches.
Thank you for the quick reply!
Yes, it works now. I figured out myself that the problem was the cached procd package.
I wanted to add another edit to my comment, but I didn't have IPv4 connectivity because of the messed up if statement (db471ef). :)
Hi, I'm trying to reboot the OpenWrt container from inside the container, but it shuts down permanently rather than restarting. Rebooting is especially useful with the watchdog plugin, which can restart the router when an event happens or simply perform a periodic reboot of the router.
I have tested with an Alpine Linux container and a Debian 9 container and in both cases when I run "reboot now" inside the container, the container reboots and starts again without any issues.
I guess that the init or procd process needs to somehow signal the parent LXC monitor process that it is trying to reboot rather than shut down (via ACPI?). Watching htop while rebooting an Alpine or Debian container, I could see that the container's whole init process also shuts down, so it does not seem to be a "hacky soft reboot"; it is the LXC process that properly restarts the container there.
I'm running the container in LXC 3.1.0.
EDIT:
Some potential information: http://man7.org/linux/man-pages/man2/reboot.2.html (Behavior inside PID namespaces)
EDIT 2:
When I kill "procd" with signal SIGBUS the container reboots successfully. Maybe the issue is with the reboot command?