Closed by nkskjames 8 years ago
This is userspace dying, not a kernel bug:
/sbin/init: error while loading shared libraries: libkmod.so.2: cannot open shared object file: Input/output error
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
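For reference, the exitcode in the panic line can be read as a wait status. A minimal sketch, assuming the value uses the usual Linux layout (exit status in bits 8-15, terminating signal in the low 7 bits); `decode_wait_status` is a hypothetical helper name, not something from the kernel or this repo:

```python
def decode_wait_status(status: int) -> dict:
    """Split a Linux-style wait status into its fields (assumed layout)."""
    signal = status & 0x7F             # low 7 bits: terminating signal, 0 if none
    core_dumped = bool(status & 0x80)  # bit 7: core dump flag
    exit_code = (status >> 8) & 0xFF   # bits 8-15: exit status
    return {
        "exited": signal == 0,
        "exit_code": exit_code,
        "signal": signal,
        "core_dumped": core_dumped,
    }


info = decode_wait_status(0x00007F00)
print(info["exited"], info["exit_code"], info["signal"])  # True 127 0
```

Under that assumption init was not killed by a signal; it exited with status 127, which is the status glibc's dynamic loader normally exits with when it cannot load a shared library. That matches the libkmod.so.2 error above.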
Userspace is dying because of a corrupted shared library that was in rofs. How could that shared library get into rwfs? How could we debug that?
Your rofs is corrupt by the looks?
You could try booting with init=/bin/bash? But I suspect you will come across similar issues.
I'm sure the rwfs is corrupt. I just don't understand how the .so got affected. The rwfs should be empty or mostly empty.
I guess it has the same root cause as issue 53.
After a new report I searched for the error and found several reports, but also a proposed patch. That patch highlighted the issue and led to a patch being merged in 4.5.
The library isn't in rwfs, but the directory is because it was modified. The bug is that reading the directories when mounting the file system gets confused by stale copies still in the flash and marks child directories as having unallowable hard links.
See https://lists.ozlabs.org/pipermail/openbmc/2016-March/002340.html for this and a few other patches, including a deadlock fix marked for stable to be picked up.
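To illustrate the failure mode described above, here is a toy model of how stale directory entries can make a directory look hard-linked. It is only a sketch: it is not the kernel's code or the merged fix, and the `Dirent` record, its fields, and both counting functions are made up for illustration.

```python
from collections import namedtuple

# Hypothetical, simplified directory-entry record; not an on-flash format.
Dirent = namedtuple("Dirent", "parent name version target")


def count_links_naive(dirents, target):
    """Count every record pointing at `target`, including stale ones."""
    return sum(1 for d in dirents if d.target == target)


def count_links_latest(dirents, target):
    """Count only the newest record for each (parent, name) slot."""
    latest = {}
    for d in dirents:
        key = (d.parent, d.name)
        if key not in latest or d.version > latest[key].version:
            latest[key] = d
    return sum(1 for d in latest.values() if d.target == target)


# Directory inode 2 is referenced by one name, but the entry was rewritten
# once, so an older copy of it is still present in the log.
log = [
    Dirent(parent=1, name="lib", version=1, target=2),  # stale copy
    Dirent(parent=1, name="lib", version=2, target=2),  # current entry
]

print(count_links_naive(log, 2))   # 2 -> looks like a directory hard link
print(count_links_latest(log, 2))  # 1 -> a single, valid link
```

In this model the naive count sees two links to the same directory and would reject it, while counting only the newest entry per name sees one.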
I think we have this under control as of https://github.com/openbmc/linux/commit/aeb4718beca0d07ff232341cdd544008e17f1fdc. Closing for now, please re-open if you see this again.
While power cycling to debug another issue, the rwfs got corrupted and we had to netboot to recover. No files had been created in the overlay since the initial flash of this system.