Closed victorgp closed 8 years ago
The error "SELinux: mount invalid. Same superblock, different security settings for (dev mqueue, type mqueue)"
is not related to the panic afaict, and happens during 'normal' operation.
can you try to reproduce this with alpha 1068.0.0 with the 4.6.0 kernel?
@mischief Yes, this doesn't happen with 1068.0
With one server on the stable version and one on the alpha version, both running the same containers (moved between them with Kubernetes), the kernel panic is easily reproducible on the stable version: it crashes a few minutes after some containers start running. The alpha version seems robust and doesn't crash.
I'm surprised you proposed the alpha version so quickly. Is this an issue you were already aware of? Is it related to the new kernel version?
It looks like the stable CoreOS version is not so stable. Luckily we weren't running this in production, because it took our whole cluster down.
@victorgp no, not an issue i've been aware of. it's just that sometimes bugs are fixed in newer kernels, so it's always worth a try to get another data point.
I can confirm seeing this very same error after upgrading to stable 1122.2.0. It doesn't lead to a kernel panic, but the server hangs when I initiate a reboot and has to be forcefully rebooted.
NAME=CoreOS
ID=coreos
VERSION=1122.2.0
VERSION_ID=1122.2.0
BUILD_ID=2016-09-06-1449
PRETTY_NAME="CoreOS 1122.2.0 (MoreOS)"
@marineam @mischief is this a confirmed bug in 1122.2.0? I'd like to know before upgrading my nodes. Thanks
The message is harmless and unrelated to any other issues that are being seen. Please open a separate issue for other specific problems (such as the failure to reboot) so we can ensure that they're handled appropriately, thanks!
@victorgp The crash you were seeing should certainly be fixed in 1122.
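To check how often the SELinux mount message quoted in this thread shows up, and whether it also appears on a host that never panics, something like the following could be run over a saved kernel log. This is only a rough sketch: the function name `count_selinux_mount_messages` is made up for illustration, and the pattern assumes the message text matches the wording quoted above.

```python
import re
from collections import Counter

# Pattern built from the message quoted in this thread. The exact wording and
# spacing in a real log are an assumption; adjust if your kernel prints it differently.
PATTERN = re.compile(
    r"SELinux: mount invalid\.\s+Same superblock, different security settings "
    r"for \(dev (?P<dev>[^,]+), type (?P<type>[^)]+)\)"
)


def count_selinux_mount_messages(lines):
    """Count matching messages per (dev, type) pair.

    `lines` is any iterable of log lines, e.g. an open file object.
    """
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("dev"), match.group("type"))] += 1
    return counts
```

One way to use it would be to dump the kernel messages to a file (for example with `journalctl -k > kernel.log`) on both the stable and the alpha host, then pass `open("kernel.log")` to the function and compare the counts.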
Issue Report
Bug
CoreOS Version
NAME=CoreOS
ID=coreos
VERSION=1010.5.0
VERSION_ID=1010.5.0
BUILD_ID=2016-05-26-2225
PRETTY_NAME="CoreOS 1010.5.0 (MoreOS)"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"
Environment
Baremetal servers
Expected Behavior
The OS doesn't reboot because of a kernel panic
Actual Behavior
After some minutes the server reboots due to a kernel panic
Reproduction Steps
We've been using the stable 1010.5.0 version since it was released and we didn't have any issues. We added (using Kubernetes) more and more containers until it seems we reached a limit where a kernel panic was provoked. The moment we start Docker we start seeing dmesg errors in journald like:
And after a bunch of those errors, the kernel panic happens and the server reboots. This is the stack trace: