NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

xfs regressions on stable kernels #6231

Closed vcunat closed 9 years ago

vcunat commented 9 years ago

This is more of a heads up, as it's all out of my scope.

Stable kernel updates seem to break xfs. In the NixOS tests, one can see it e.g. on unstable http://hydra.nixos.org/job/nixos/trunk-combined/nixos.tests.installer.lvm.x86_64-linux, but the problem has also been backported to the 14.04 and 14.12 branches.

I tried the job locally, getting:

It seems the problem isn't just NixOS-specific https://forum.manjaro.org/index.php?topic=20211.0, but I've found practically no other mention.

vcunat commented 9 years ago

CC @wkennington (updating kernels a lot).

wkennington commented 9 years ago

We should try this with the testing kernel and see if it persists in mainline.

wkennington commented 9 years ago

It's worth mentioning that it seems to affect both xfs and btrfs.

There is also a newly introduced bug in 3.18+ with the xhci stack not being initialized on Haswell chips. I don't even know where to go with all of these kernel bugs we keep hitting that no one else seems to have found.

vcunat commented 9 years ago

I presume the upstream bugzilla is the right place? https://bugzilla.kernel.org/ The lack of (findable) mentions anywhere also confounds me. Surely many others must've hit it.

manjaro commented 9 years ago

This is weird. I see no xfs changes in 3.14.31, nor in 3.12.37.

vcunat commented 9 years ago

It might be connected to LVM, e.g. both changesets port this commit http://kernel.opensuse.org/cgit/kernel/commit/?id=9b1cc9f251affdd27f29fe46d0989ba76c33faf6. To solve this, someone will probably have to find the offending commit and test that it really introduces the error.
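
Finding the offending commit is usually a binary search over the history between a known-good and a known-bad kernel, which is the idea behind `git bisect`. A minimal sketch of that search follows; the commit names, the `BROKEN_FROM` index and the `xfs_test_fails` check are all made up for illustration, and the real check would be building the kernel and running the installer test.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical history: 16 commits, with the regression introduced at index 11.
history = [f"commit-{i:02d}" for i in range(16)]
BROKEN_FROM = 11  # made-up value, not taken from the thread


def xfs_test_fails(commit: str) -> bool:
    # Stand-in for "build this kernel and run the xfs installer test".
    return history.index(commit) >= BROKEN_FROM


culprit = first_bad_commit(history, xfs_test_fails)
print(culprit)
```

With 16 candidate commits this needs only four test runs, which matters when each run means a kernel build.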

manjaro commented 9 years ago

Are you sure? Doing this makes it work:

# modprobe crc32c
# modprobe xfs

I think it is more related to crypto ?!?

crypto-add-missing-crypto-module-aliases.patch
crypto-include-crypto-module-prefix-in-template.patch
crypto-prefix-module-autoloading-with-crypto.patch

manjaro commented 9 years ago

Upstream is now on it:

On 09.02.2015 at 20:34, Kees Cook wrote:
> Hi!
>
> Yeah, this does sound like a regression due to the crypto alias rework. 
> I'm looking through the code now trying to see how crc32c is loaded. 
> Other filesystems do things like this:
> crypto_alloc_shash("crc32c", ...
> but I can't find this in xfs. I'll keep digging...
>
> -Kees

vcunat commented 9 years ago

I see, I didn't check that manjaro thread today until now. Nice you pushed the issue upstream.

manjaro commented 9 years ago

Seems we found it:

On 9 February 2015 at 23:05, Greg Kroah-Hartman wrote:
> On Mon, Feb 09, 2015 at 01:26:17PM -0800, Kees Cook wrote:
>> But it DOES happen for me on 3.14.31. Hmmm.
>
> Did I mess up the backport somehow?
>
v3.14.31:crypto/crc32c.c is missing the MODULE_ALIAS_CRYPTO("crc32c").
That's probably because crypto/crc32c.c was renamed to
crypto/crc32c_generic.c in commit
06e5a1f29819759392239669beb2cad27059c8ec and therefore fell through
the cracks when backporting commit
5d26a105b5a73e5635eae0629b42fa0a90e07b7b.
So the affected kernels (all that backported the "crypto-" prefix
patches) need this additional patch:
diff --git a/crypto/crc32c.c b/crypto/crc32c.c
index 06f7018c9d95..aae5829eb681 100644
--- a/crypto/crc32c.c
+++ b/crypto/crc32c.c
@@ -167,6 +167,7 @@ static void __exit crc32c_mod_fini(void)
 module_init(crc32c_mod_init);
 module_exit(crc32c_mod_fini);
+MODULE_ALIAS_CRYPTO("crc32c");
 MODULE_AUTHOR("Clay Haapala ");
 MODULE_DESCRIPTION("CRC32c (Castagnoli) calculations wrapper for lib/crc32c");
 MODULE_LICENSE("GPL");
Mathias

vcunat commented 9 years ago

One line of the patch got malformed and should be MODULE_AUTHOR("Clay Haapala <chaapala@cisco.com>");, presumably because addresses get filtered in the e-mail to reduce spam.

I checked that the patch fixes the xfs test problem, and pushed it to master, 14.12, and 14.04. Thanks!

philmmanjaro commented 9 years ago

Yeah, it got malformed. It has now been adopted upstream as well: v3.10.69, v3.14.32
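
For anyone reading this later, the failure mode discussed above can be sketched with a toy model. This is a simplification of how I understand the thread, not kernel code: after the "crypto-" prefix patches, an algorithm is requested on demand under the name `crypto-<algorithm>`, so a module that does not declare that alias can no longer be found automatically, while loading it by hand first still works. The function name `can_resolve` and the alias tables are hypothetical.

```python
CRYPTO_PREFIX = "crypto-"


def can_resolve(algorithm, loaded_algorithms, alias_table):
    """Toy model of an on-demand crypto algorithm lookup.

    loaded_algorithms: set of algorithm names already registered, e.g.
        because the module was loaded by hand beforehand.
    alias_table: dict mapping module name -> set of aliases it declares.
    Returns True if the algorithm is available or can be autoloaded.
    """
    if algorithm in loaded_algorithms:
        return True  # nothing to autoload, the lookup succeeds directly
    wanted = CRYPTO_PREFIX + algorithm  # name used for the autoload request
    return any(wanted in aliases for aliases in alias_table.values())


# Assumed state of the affected stable kernels: the backport skipped
# crypto/crc32c.c, so the module declares no "crypto-crc32c" alias.
affected = {"crc32c": set()}

# Assumed state after adding MODULE_ALIAS_CRYPTO("crc32c").
patched = {"crc32c": {"crypto-crc32c"}}

print(can_resolve("crc32c", set(), affected))       # autoload fails
print(can_resolve("crc32c", {"crc32c"}, affected))  # manual modprobe first
print(can_resolve("crc32c", set(), patched))        # one-line patch applied
```

The second call corresponds to the `modprobe crc32c` before `modprobe xfs` workaround, and the third to the one-line patch.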