Open FransUrbo opened 11 years ago
This naturally worked before, but there have been some changes to ZoL a couple of days ago. I just did the pull without thinking... But something there must be breaking the crypto stuff..
Strangely, it doesn't seem to happen on every system...
May 19 11:59:02 kernel: [ 33.593085] SPL: Loaded module v0.6.1-311_g4baa286
May 19 11:59:02 kernel: [ 33.593514] zunicode: module license 'CDDL' taints kernel.
May 19 11:59:02 kernel: [ 33.593520] Disabling lock debugging due to kernel taint
May 19 11:59:02 kernel: [ 33.615204] ZFS: Loaded module v0.6.1-1.crypto, ZFS pool version 5000, ZFS filesystem version 5
May 19 11:59:10 kernel: [ 41.797092] loop: module loaded
May 19 11:59:11 kernel: [ 42.013232] SPL: Failed user helper '/bin/sh -c exec 0</dev/null 1>/proc/sys/kernel/spl/hostid 2>/dev/null; hostid', rc = 32512
May 19 11:59:12 kernel: [ 43.338524] EFI Variables Facility v0.08 2004-May-17
May 19 12:00:44 kernel: [ 135.021134] spl-crypto: No such AEAD cipher 'CKM_AES_GCM'.
May 19 12:00:44 kernel: [ 135.021134] Please ensure the correct kernel modules has been loaded,
May 19 12:00:44 kernel: [ 135.021134] Linux name 'sun-gcm(aes)'
May 19 12:00:52 kernel: [ 143.097074] VERIFY3(0 == dsl_crypto_key_create(dd, dsphys, dsobj, dcc, tx)) failed (0 == 4)
May 19 12:00:52 kernel: [ 143.097206] SPLError: 5538:0:(dsl_dataset.c:840:dsl_dataset_create_sync_dd()) SPL PANIC
May 19 12:00:52 kernel: [ 143.097262] SPL: Showing stack for process 5538
May 19 12:00:52 kernel: [ 143.097266] Pid: 5538, comm: txg_sync Tainted: P O 3.9.0-rc6+tf.1 #3
May 19 12:00:52 kernel: [ 143.097268] Call Trace:
May 19 12:00:52 kernel: [ 143.097281] [<ffffffffa051aad9>] ? spl_debug_dumpstack+0x26/0x2c [spl]
May 19 12:00:52 kernel: [ 143.097285] [<ffffffffa051bb38>] ? spl_debug_bug+0x7f/0xc8 [spl]
May 19 12:00:52 kernel: [ 143.097309] [<ffffffffa05ef225>] ? dsl_dataset_create_sync_dd+0x3a8/0x5aa [zfs]
May 19 12:00:52 kernel: [ 143.097323] [<ffffffffa05ef4be>] ? dsl_dataset_create_sync+0x97/0x1bc [zfs]
May 19 12:00:52 kernel: [ 143.097336] [<ffffffffa05db876>] ? dmu_objset_create_sync+0x3b/0x125 [zfs]
May 19 12:00:52 kernel: [ 143.097352] [<ffffffffa0628ff5>] ? zcrypt_mech_available+0x26/0x31 [zfs]
May 19 12:00:52 kernel: [ 143.097368] [<ffffffffa05fe5dc>] ? dsl_sync_task_group_sync+0x11d/0x1fe [zfs]
May 19 12:00:52 kernel: [ 143.097373] [<ffffffff8101773f>] ? read_tsc+0x5/0x16
May 19 12:00:52 kernel: [ 143.097388] [<ffffffffa05f78d3>] ? dsl_pool_sync+0x370/0x4c0 [zfs]
May 19 12:00:52 kernel: [ 143.097405] [<ffffffffa0607fb4>] ? spa_sync+0x4f8/0x8ff [zfs]
May 19 12:00:52 kernel: [ 143.097410] [<ffffffffa0525bdc>] ? __gethrtime+0xc/0x1e [spl]
May 19 12:00:52 kernel: [ 143.097412] [<ffffffff8101773f>] ? read_tsc+0x5/0x16
May 19 12:00:52 kernel: [ 143.097416] [<ffffffff8107e624>] ? ktime_get_ts+0x49/0xbb
May 19 12:00:52 kernel: [ 143.097432] [<ffffffffa06173ce>] ? txg_sync_thread+0x2bf/0x4a3 [zfs]
May 19 12:00:52 kernel: [ 143.097436] [<ffffffff8106984b>] ? set_user_nice+0x119/0x13d
May 19 12:00:52 kernel: [ 143.097452] [<ffffffffa061710f>] ? txg_thread_exit+0x2b/0x2b [zfs]
May 19 12:00:52 kernel: [ 143.097456] [<ffffffffa052146b>] ? __thread_create+0x2df/0x2df [spl]
May 19 12:00:52 kernel: [ 143.097460] [<ffffffffa05214d5>] ? thread_generic_wrapper+0x6a/0x73 [spl]
May 19 12:00:52 kernel: [ 143.097464] [<ffffffff8105d8a7>] ? __init_kthread_worker+0x2d/0x2d
May 19 12:00:52 kernel: [ 143.097466] [<ffffffff8105d955>] ? kthread+0xae/0xb6
May 19 12:00:52 kernel: [ 143.097469] [<ffffffff8105d8a7>] ? __init_kthread_worker+0x2d/0x2d
May 19 12:00:52 kernel: [ 143.097473] [<ffffffff8135263c>] ? ret_from_fork+0x7c/0xb0
May 19 12:00:52 kernel: [ 143.097475] [<ffffffff8105d8a7>] ? __init_kthread_worker+0x2d/0x2d
This was without loading sun-gcm first. But even after loading all the sun-* modules, I still get:
spl-crypto: No such AEAD cipher 'CKM_AES_GCM'.
And just to double check: creating a hostid, loading the module and THEN trying to create the fs doesn't help either...
On a system where it seems to be working (I don't get a panic etc):
May 18 15:01:19 debianzfs kernel: [ 9216.838006] spl-crypto: Cipher test 'CKM_AES_GCM' -> 'sun-gcm(aes)' successful.
May 18 15:01:19 debianzfs kernel: [ 9217.054502] spl-crypto: Cipher test 'CKM_AES_CCM' -> 'sun-ccm(aes)' successful.
Ok, managed to find the problem: I was missing the gf128mul and ghash-generic kernel modules... I'll have to rebuild my kernel package image.
Should sun-gcm depend on ghash-generic perhaps? Feel free to close if not.
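One way to check for this situation before creating a gcm filesystem is to look for the ghash driver in /proc/crypto, which lists the algorithms currently registered with the kernel crypto API. This is a minimal sketch, assuming the usual `driver : <name>` line layout of that file; the function name `has_crypto_driver` is made up for illustration.

```shell
# Succeed if a crypto driver with the given name is currently registered.
#   $1: driver name, e.g. ghash-generic
#   $2: file to scan (defaults to /proc/crypto; a saved copy works too)
has_crypto_driver() {
    awk -v want="$1" '
        $1 == "driver" && $3 == want { found = 1 }
        END { exit !found }
    ' "${2:-/proc/crypto}"
}

# Example: has_crypto_driver ghash-generic || echo "ghash-generic is not registered"
```

If the check fails and the module exists for the running kernel, `modprobe ghash-generic` before loading sun-gcm is worth trying; it should pull in gf128mul as well.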
Yes, it would be nice if those were automatic dependencies. I do not know how that works, but I can always learn. I don't have time at the moment..
There's of course the libzfs_load_module(), but I'm not sure... I wonder why it's not automatic already...
Actually, I think basically we want to populate 'modules.dep' with the correct information, for example splat has
extra/splat/splat.ko: extra/spl/spl.ko kernel/lib/zlib_deflate/zlib_deflate.ko
So to load splat, it is told to load zlib_deflate and spl.
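As a rough illustration of the modules.dep format described above, the dependencies recorded for one module can be pulled out with awk. This is only a sketch: `deps_of` is a hypothetical helper name, and it assumes the plain-text `module: dep dep ...` layout shown in the splat line.

```shell
# Print the dependencies recorded for one module in a modules.dep file.
#   $1: module path exactly as it appears before the colon
#   $2: modules.dep file (defaults to the one for the running kernel)
deps_of() {
    awk -F': *' -v mod="$1" '$1 == mod { print $2 }' \
        "${2:-/lib/modules/$(uname -r)/modules.dep}"
}

# Example: deps_of extra/splat/splat.ko
```

An empty result for the sun-gcm entry would mean depmod recorded no dependency on ghash-generic for it.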
I am not entirely sure, but it might be related to this line;
splat.mod.c
static const char __module_depends[]
__used
__attribute__((section(".modinfo"))) =
"depends=spl";
which makes me think we should try adding ghash-generic to sun-gcm's dependencies?
Unfortunately, that file is auto-generated by the MODPOST target in the kernel source directory. I've been trying to find out how to 'trick' or 'force' a dependency, but have not yet found a way.
Looking at ghash-generic.c, I see this:
* GHASH: digest algorithm for GCM (Galois/Counter Mode).
*
* Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi>
* Copyright (c) 2009 Intel Corp.
* Author: Huang Ying <ying.huang@intel.com>
*
* The algorithm implementation is copied from gcm.c.
and in gcm.c:
* GCM: Galois/Counter Mode.
*
* Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi>
So it almost looks like ghash is a newer version of sun-gcm!
I'm getting very, very confused! IF (!) sun-gcm is supposed to depend on ghash-generic, why wasn't this detected already? From what I can see in <kernelsrc>/scripts/Makefile.modpost, modpost should have caught that.
Maybe sun-gcm is simply missing the vital part to make this work correctly?
I wouldn't worry about the header, the sun-gcm module was made by copying in something else and working on it.
So I still think you should look at the bottom of sun-gcm.mod.c, then add;
static const char __module_depends[]
__used
__attribute__((section(".modinfo"))) =
"depends=ghash-generic";
and make sure that says depends=ghash-generic, make install and confirm it populates modules.dep.
As I said, there is no point in adding it to sun-gcm.mod.c, because that file is auto-generated by the 'modpost' target (stage two):
root@sid64:/usr/local/src/spl/module# ll sun-gcm/
total 1104
-rwxr-xr-x 1 root root 23203 May 25 01:36 gcm.c*
-rw-r--r-- 1 root root 320992 Jun 7 17:57 gcm.o
-rw-r--r-- 1 root root 302 Jun 7 15:58 Makefile
-rwxr-xr-x 1 root root 247 May 25 01:36 Makefile.in*
-rw-r--r-- 1 root root 52 Jun 7 17:57 modules.order
-rw-r--r-- 1 root root 370802 Jun 7 17:57 sun-gcm.ko
-rw-r--r-- 1 root root 1663 Jun 7 17:57 sun-gcm.mod.c
-rw-r--r-- 1 root root 51368 Jun 7 17:57 sun-gcm.mod.o
-rw-r--r-- 1 root root 321000 Jun 7 17:57 sun-gcm.o
root@sid64:/usr/local/src/spl/module# rm sun-gcm/*.o sun-gcm/sun-gcm.* sun-gcm/modules.order sun-gcm/.*.cmd
rm: cannot remove 'sun-gcm/sun-gcm.mod.o': No such file or directory
rm: cannot remove 'sun-gcm/sun-gcm.o': No such file or directory
root@sid64:/usr/local/src/spl/module# ls -la sun-gcm/
total 40
drwxr-xr-x 2 root root 4096 Jun 8 13:52 ./
drwxr-xr-x 8 root root 4096 Jun 7 17:57 ../
-rwxr-xr-x 1 root root 23203 May 25 01:36 gcm.c*
-rw-r--r-- 1 root root 302 Jun 7 15:58 Makefile
-rwxr-xr-x 1 root root 247 May 25 01:36 Makefile.in*
root@sid64:/usr/local/src/spl/module# make modules
make -C /usr/src/linux-headers-3.8-2-amd64 SUBDIRS=`pwd` O=/usr/src/linux-headers-3.8-2-amd64 CONFIG_SPL=m modules
make[1]: Entering directory `/usr/src/linux-headers-3.8-2-amd64'
CC [M] /usr/local/src/spl/module/sun-gcm/../../module/sun-gcm/gcm.o
LD [M] /usr/local/src/spl/module/sun-gcm/sun-gcm.o
Building modules, stage 2.
MODPOST 5 modules
CC /usr/local/src/spl/module/sun-gcm/sun-gcm.mod.o
LD [M] /usr/local/src/spl/module/sun-gcm/sun-gcm.ko
make[1]: Leaving directory `/usr/src/linux-headers-3.8-2-amd64'
root@sid64:/usr/local/src/spl/module# vi sun-gcm/sun-gcm.mod.c
[add 'ghash-generic' to 'depends=' line]
root@sid64:/usr/local/src/spl/module# tail -n2 sun-gcm/sun-gcm.mod.c
"depends=ghash-generic";
root@sid64:/usr/local/src/spl/module# make modules
make -C /usr/src/linux-headers-3.8-2-amd64 SUBDIRS=`pwd` O=/usr/src/linux-headers-3.8-2-amd64 CONFIG_SPL=m modules
make[1]: Entering directory `/usr/src/linux-headers-3.8-2-amd64'
Building modules, stage 2.
MODPOST 5 modules
CC /usr/local/src/spl/module/sun-gcm/sun-gcm.mod.o
LD [M] /usr/local/src/spl/module/sun-gcm/sun-gcm.ko
make[1]: Leaving directory `/usr/src/linux-headers-3.8-2-amd64'
root@sid64:/usr/local/src/spl/module# tail -n2 sun-gcm/sun-gcm.mod.c
"depends=";
<behlendorf> lundman: Kernel module X doesn't know per se that it has a
dependency on module Y. What module X knows is that it requires
specific symbols, modprobe is smart enough to work out what
modules provide those symbols.
<behlendorf> lundman: See the depmod utility which handles this.
<behlendorf> If the dependencies can't be worked out automatically by what's
been made available through EXPORT_SYMBOL, you can have module X
request module Y be loaded with request_module().
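To illustrate the symbol-based resolution described in the quote, the sketch below matches the undefined symbols of a module (as listed by `nm -u`) against a modules.symbols file, where depmod records which module exports which symbol. `providers_for` is a hypothetical helper, and the `alias symbol:<name> <module>` line format is an assumption based on what depmod normally writes. A module that is only looked up by name at run time through the crypto API would not show up in this kind of check, which may be why no dependency was recorded for sun-gcm.

```shell
# List the modules that export symbols a given module leaves undefined.
#   $1: file holding `nm -u module.ko` output (lines like "   U symbol")
#   $2: modules.symbols file (lines like "alias symbol:NAME MODULE")
providers_for() {
    awk '
        FILENAME == ARGV[1] { if ($1 == "U") need[$2] = 1; next }
        $1 == "alias" && index($2, "symbol:") == 1 {
            sym = substr($2, 8)          # strip the "symbol:" prefix
            if (sym in need) print $3
        }
    ' "$1" "$2" | sort -u
}

# Example:
#   nm -u sun-gcm.ko > /tmp/undef.txt
#   providers_for /tmp/undef.txt "/lib/modules/$(uname -r)/modules.symbols"
```

Symbols provided by the core kernel image are not expected to appear in modules.symbols, so they are silently skipped here.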
UPDATE: This seems to be a missing module not being loaded automatically. See comment https://github.com/zfsrogue/zfs-crypto/issues/28#issuecomment-18117234. Previous issue title: dsl_crypto_key_create() => SPL PANIC
I get a SPL PANIC when trying to create a filesystem.
I first thought it was the combination of options, but after a reboot and trying again:
gave me the same error. Just for completeness:
naturally worked :(
BUT, and this might be a hint:
and then spl crashes again. If I reboot the system (hard reset), the exact same thing happens: it asks to upgrade the pool and zfs create crashes...
Every time I try to use aes-256-gcm, it wants me to upgrade the pool and the zfs create crashes when I try again...
Using aes-128-ccm, aes-192-ccm and aes-256-ccm all works. But it seems that none of the gcm types work..
Looking at the modules loaded, the pool upgrade is requested when the gcm module isn't loaded. But trying again after loading the module first doesn't make any difference.
So:
I'm not sure what else to test, but if there's something special, just let me know and I'll do it.