zfsrogue / zfs-crypto

ZFS On Linux with crypto patches

Missing module dependency #28

Open FransUrbo opened 11 years ago

FransUrbo commented 11 years ago

UPDATE: This seems to be a missing module not being loaded automatically. See comment https://github.com/zfsrogue/zfs-crypto/issues/28#issuecomment-18117234. Previous issue title: dsl_crypto_key_create() => SPL PANIC

I get a SPL PANIC when trying to create a filesystem.

# zfs create -o compression=lz4 -o copies=2 -o dedup=on -o encryption=aes-256-gcm -o keysource=raw,file:///boot/zfs.key system/ROOT/debian
[  377.241515] VERIFY3(0 == dsl_crypto_key_create(dd, dsphys, dsobj, dcc, tx)) failed (0 == 4)
[  377.241636] SPLError: 6259:0:(dsl_dataset.c:840:dsl_dataset_create_sync_dd()) SPL PANIC

I first thought it was the combination of options, but after a reboot and trying again:

# zfs create -o encryption=aes-256-gcm -o keysource=raw,file:///boot/zfs.key system/ROOT/debian

gave me the same error. Just for completeness:

# zfs create -o encryption=on system/ROOT/debian
[....]

naturally worked :(

BUT, and this might be a hint:

# zfs create -o encryption=aes-256-gcm system/ROOT/debian
[....]
cannot create 'system/ROOT/debian': pool must be upgraded to set this property or value
# zpool upgrade -a
This system supports ZFS pool feature flags.
All pools are already formated using feature flags.
Every feature flags pool already has all supported features enabled.
# zfs create -o encryption=aes-256-gcm system/ROOT/debian
[....]

and then spl crashes again. If I reboot the system (hard reset), the exact same thing happens: it asks me to upgrade the pool, and zfs create crashes.

Every time I try to use aes-256-gcm, it wants me to upgrade the pool, and then zfs create crashes when I try again.

Using aes-128-ccm, aes-192-ccm or aes-256-ccm works, but none of the gcm types do.

Looking at the loaded modules, the pool upgrade is requested when the gcm module isn't loaded. But loading the module first and then trying again doesn't make any difference.

So:

# modprobe sun-gcm
# zfs create -o encryption=aes-256-gcm system/ROOT/debian
[....]
cannot create 'system/ROOT/debian': pool must be upgraded to set this property or value
# zfs create -o encryption=aes-256-gcm system/ROOT/debian
[SPL PANIC]

I'm not sure what else to test, but if there's something special, just let me know and I'll do it.

FransUrbo commented 11 years ago

This naturally worked before, but there were some changes to ZoL a couple of days ago, and I just did the pull without thinking. Something there must be breaking the crypto stuff.

FransUrbo commented 11 years ago

Strangely, it doesn't seem to happen on every system...

FransUrbo commented 11 years ago

May 19 11:59:02 kernel: [   33.593085] SPL: Loaded module v0.6.1-311_g4baa286
May 19 11:59:02 kernel: [   33.593514] zunicode: module license 'CDDL' taints kernel.
May 19 11:59:02 kernel: [   33.593520] Disabling lock debugging due to kernel taint
May 19 11:59:02 kernel: [   33.615204] ZFS: Loaded module v0.6.1-1.crypto, ZFS pool version 5000, ZFS filesystem version 5
May 19 11:59:10 kernel: [   41.797092] loop: module loaded
May 19 11:59:11 kernel: [   42.013232] SPL: Failed user helper '/bin/sh -c exec 0</dev/null      1>/proc/sys/kernel/spl/hostid      2>/dev/null; hostid', rc = 32512
May 19 11:59:12 kernel: [   43.338524] EFI Variables Facility v0.08 2004-May-17
May 19 12:00:44 kernel: [  135.021134] spl-crypto: No such AEAD cipher 'CKM_AES_GCM'.
May 19 12:00:44 kernel: [  135.021134] Please ensure the correct kernel modules has been loaded,
May 19 12:00:44 kernel: [  135.021134] Linux name 'sun-gcm(aes)'
May 19 12:00:52 kernel: [  143.097074] VERIFY3(0 == dsl_crypto_key_create(dd, dsphys, dsobj, dcc, tx)) failed (0 == 4)
May 19 12:00:52 kernel: [  143.097206] SPLError: 5538:0:(dsl_dataset.c:840:dsl_dataset_create_sync_dd()) SPL PANIC
May 19 12:00:52 kernel: [  143.097262] SPL: Showing stack for process 5538
May 19 12:00:52 kernel: [  143.097266] Pid: 5538, comm: txg_sync Tainted: P           O 3.9.0-rc6+tf.1 #3
May 19 12:00:52 kernel: [  143.097268] Call Trace:
May 19 12:00:52 kernel: [  143.097281]  [<ffffffffa051aad9>] ? spl_debug_dumpstack+0x26/0x2c [spl]
May 19 12:00:52 kernel: [  143.097285]  [<ffffffffa051bb38>] ? spl_debug_bug+0x7f/0xc8 [spl]
May 19 12:00:52 kernel: [  143.097309]  [<ffffffffa05ef225>] ? dsl_dataset_create_sync_dd+0x3a8/0x5aa [zfs]
May 19 12:00:52 kernel: [  143.097323]  [<ffffffffa05ef4be>] ? dsl_dataset_create_sync+0x97/0x1bc [zfs]
May 19 12:00:52 kernel: [  143.097336]  [<ffffffffa05db876>] ? dmu_objset_create_sync+0x3b/0x125 [zfs]
May 19 12:00:52 kernel: [  143.097352]  [<ffffffffa0628ff5>] ? zcrypt_mech_available+0x26/0x31 [zfs]
May 19 12:00:52 kernel: [  143.097368]  [<ffffffffa05fe5dc>] ? dsl_sync_task_group_sync+0x11d/0x1fe [zfs]
May 19 12:00:52 kernel: [  143.097373]  [<ffffffff8101773f>] ? read_tsc+0x5/0x16
May 19 12:00:52 kernel: [  143.097388]  [<ffffffffa05f78d3>] ? dsl_pool_sync+0x370/0x4c0 [zfs]
May 19 12:00:52 kernel: [  143.097405]  [<ffffffffa0607fb4>] ? spa_sync+0x4f8/0x8ff [zfs]
May 19 12:00:52 kernel: [  143.097410]  [<ffffffffa0525bdc>] ? __gethrtime+0xc/0x1e [spl]
May 19 12:00:52 kernel: [  143.097412]  [<ffffffff8101773f>] ? read_tsc+0x5/0x16
May 19 12:00:52 kernel: [  143.097416]  [<ffffffff8107e624>] ? ktime_get_ts+0x49/0xbb
May 19 12:00:52 kernel: [  143.097432]  [<ffffffffa06173ce>] ? txg_sync_thread+0x2bf/0x4a3 [zfs]
May 19 12:00:52 kernel: [  143.097436]  [<ffffffff8106984b>] ? set_user_nice+0x119/0x13d
May 19 12:00:52 kernel: [  143.097452]  [<ffffffffa061710f>] ? txg_thread_exit+0x2b/0x2b [zfs]
May 19 12:00:52 kernel: [  143.097456]  [<ffffffffa052146b>] ? __thread_create+0x2df/0x2df [spl]
May 19 12:00:52 kernel: [  143.097460]  [<ffffffffa05214d5>] ? thread_generic_wrapper+0x6a/0x73 [spl]
May 19 12:00:52 kernel: [  143.097464]  [<ffffffff8105d8a7>] ? __init_kthread_worker+0x2d/0x2d
May 19 12:00:52 kernel: [  143.097466]  [<ffffffff8105d955>] ? kthread+0xae/0xb6
May 19 12:00:52 kernel: [  143.097469]  [<ffffffff8105d8a7>] ? __init_kthread_worker+0x2d/0x2d
May 19 12:00:52 kernel: [  143.097473]  [<ffffffff8135263c>] ? ret_from_fork+0x7c/0xb0
May 19 12:00:52 kernel: [  143.097475]  [<ffffffff8105d8a7>] ? __init_kthread_worker+0x2d/0x2d

This is without first loading sun-gcm. But even after loading all the sun-* modules, I still get:

spl-crypto: No such AEAD cipher 'CKM_AES_GCM'.

And just to double check: creating a hostid, loading the module and THEN trying to create the fs doesn't help either...

On a system where it seems to be working (I don't get a panic etc):

May 18 15:01:19 debianzfs kernel: [ 9216.838006] spl-crypto: Cipher test 'CKM_AES_GCM' -> 'sun-gcm(aes)' successful.
May 18 15:01:19 debianzfs kernel: [ 9217.054502] spl-crypto: Cipher test 'CKM_AES_CCM' -> 'sun-ccm(aes)' successful.

FransUrbo commented 11 years ago

Ok, managed to find the problem: I was missing the gf128mul and ghash-generic kernel modules... I'll have to rebuild my kernel package image.
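
A rough sketch of how a check like this could be scripted, assuming the relevant pieces are built as loadable modules (anything compiled into the kernel never shows up in /proc/modules). The helper name missing_modules is made up for illustration. It reads a /proc/modules-style listing on stdin and prints every requested module that is not in it. Hyphens are turned into underscores because that is how the kernel reports module names.

```shell
#!/bin/sh
# Hypothetical helper (not part of spl or zfs-crypto): print each module
# named on the command line that does not appear in the /proc/modules-style
# listing supplied on stdin.
missing_modules() {
    listing=$(cat)
    for mod in "$@"; do
        # /proc/modules spells ghash-generic as ghash_generic.
        name=$(printf '%s\n' "$mod" | sed 's/-/_/g')
        if ! printf '%s\n' "$listing" |
            awk -v m="$name" '$1 == m { found = 1 } END { exit !found }'; then
            printf '%s\n' "$mod"
        fi
    done
}

# Usage on a live system:
#   missing_modules gf128mul ghash-generic sun-gcm < /proc/modules
```

Empty output means everything asked for is loaded. A name that is printed is either not loaded or built into the kernel, so this only narrows things down.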

FransUrbo commented 11 years ago

Should sun-gcm depend on ghash-generic perhaps? Feel free to close if not.

lundman commented 11 years ago

Yes, it would be nice if those were automatic dependencies. I do not know how that works, but I can always learn. I don't have time at the moment..

FransUrbo commented 11 years ago

There's of course the libzfs_load_module(), but I'm not sure... I wonder why it's not automatic already...

lundman commented 11 years ago

Actually, I think we basically want to populate 'modules.dep' with the correct information. For example, splat has:

extra/splat/splat.ko: extra/spl/spl.ko kernel/lib/zlib_deflate/zlib_deflate.ko

So to load splat, it is told to load zlib_deflate and spl.
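
To see what modules.dep records for a given module, something along these lines might do. It is only a sketch: deps_of is a made-up name, and it assumes the usual "target: dep dep ..." layout shown above, read from stdin.

```shell
#!/bin/sh
# Hypothetical helper: print the dependencies that a modules.dep-style
# listing (on stdin) records for one module, one per line.
deps_of() {
    awk -v want="$1" '
        {
            target = $1
            sub(/:$/, "", target)             # drop the trailing colon
            n = split(target, parts, "/")     # keep only the file name
            base = parts[n]
            sub(/\.ko.*$/, "", base)          # strip .ko, .ko.gz and so on
            if (base == want)
                for (i = 2; i <= NF; i++)
                    print $i
        }'
}

# Usage on a live system:
#   deps_of splat < "/lib/modules/$(uname -r)/modules.dep"
```

For the splat line above this would list extra/spl/spl.ko and kernel/lib/zlib_deflate/zlib_deflate.ko. A module with nothing after the colon prints nothing.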

I am not entirely sure, but it might be related to these lines:

splat.mod.c 
static const char __module_depends[]
           __used
           __attribute__((section(".modinfo"))) =
           "depends=spl";

which makes me think we should try adding ghash-generic to sun-gcm's dependencies?

FransUrbo commented 11 years ago

Unfortunately, that file is auto-generated by the MODPOST target in the kernel source directory. I've been trying to find out how to 'trick' or 'force' a dependency, but have not found a way yet.

FransUrbo commented 11 years ago

Looking at ghash-generic.c, I see this:

* GHASH: digest algorithm for GCM (Galois/Counter Mode).
*
* Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi>
* Copyright (c) 2009 Intel Corp.
*   Author: Huang Ying <ying.huang@intel.com>
*
* The algorithm implementation is copied from gcm.c.

and in gcm.c:

* GCM: Galois/Counter Mode.
*
* Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi>

So it almost looks like ghash is a newer version of sun-gcm!

I'm starting to be very, very confused! IF (!) sun-gcm is supposed to depend on ghash-generic, why isn't this detected already? From what I can see in <kernelsrc>/scripts/Makefile.modpost, modpost should have caught that.

Maybe sun-gcm is simply missing the vital part to make this work correctly?

lundman commented 11 years ago

I wouldn't worry about the header, the sun-gcm module was made by copying in something else and working on it.

So I still think you should look at the bottom of sun-gcm.mod.c, then add;

static const char __module_depends[]
__used
__attribute__((section(".modinfo"))) =
"depends=ghash-generic";

and make sure that says depends=ghash-generic, make install and confirm it populates modules.dep.

FransUrbo commented 11 years ago

As I said, there is no point in adding it to sun-gcm.mod.c, because that file is auto-generated by the 'modpost' target (stage two):

root@sid64:/usr/local/src/spl/module# ll sun-gcm/
total 1104
-rwxr-xr-x 1 root root  23203 May 25 01:36 gcm.c*
-rw-r--r-- 1 root root 320992 Jun  7 17:57 gcm.o
-rw-r--r-- 1 root root    302 Jun  7 15:58 Makefile
-rwxr-xr-x 1 root root    247 May 25 01:36 Makefile.in*
-rw-r--r-- 1 root root     52 Jun  7 17:57 modules.order
-rw-r--r-- 1 root root 370802 Jun  7 17:57 sun-gcm.ko
-rw-r--r-- 1 root root   1663 Jun  7 17:57 sun-gcm.mod.c
-rw-r--r-- 1 root root  51368 Jun  7 17:57 sun-gcm.mod.o
-rw-r--r-- 1 root root 321000 Jun  7 17:57 sun-gcm.o
root@sid64:/usr/local/src/spl/module# rm sun-gcm/*.o sun-gcm/sun-gcm.* sun-gcm/modules.order sun-gcm/.*.cmd
rm: cannot remove 'sun-gcm/sun-gcm.mod.o': No such file or directory
rm: cannot remove 'sun-gcm/sun-gcm.o': No such file or directory
root@sid64:/usr/local/src/spl/module# ls -la sun-gcm/
total 40
drwxr-xr-x 2 root root  4096 Jun  8 13:52 ./
drwxr-xr-x 8 root root  4096 Jun  7 17:57 ../
-rwxr-xr-x 1 root root 23203 May 25 01:36 gcm.c*
-rw-r--r-- 1 root root   302 Jun  7 15:58 Makefile
-rwxr-xr-x 1 root root   247 May 25 01:36 Makefile.in*
root@sid64:/usr/local/src/spl/module# make modules
make -C /usr/src/linux-headers-3.8-2-amd64 SUBDIRS=`pwd`  O=/usr/src/linux-headers-3.8-2-amd64 CONFIG_SPL=m modules
make[1]: Entering directory `/usr/src/linux-headers-3.8-2-amd64'
  CC [M]  /usr/local/src/spl/module/sun-gcm/../../module/sun-gcm/gcm.o
  LD [M]  /usr/local/src/spl/module/sun-gcm/sun-gcm.o
  Building modules, stage 2.
  MODPOST 5 modules
  CC      /usr/local/src/spl/module/sun-gcm/sun-gcm.mod.o
  LD [M]  /usr/local/src/spl/module/sun-gcm/sun-gcm.ko
make[1]: Leaving directory `/usr/src/linux-headers-3.8-2-amd64'
root@sid64:/usr/local/src/spl/module# vi sun-gcm/sun-gcm.mod.c
[add 'ghash-generic' to 'depends=' line]
root@sid64:/usr/local/src/spl/module# tail -n2 sun-gcm/sun-gcm.mod.c
"depends=ghash-generic";

root@sid64:/usr/local/src/spl/module# make modules
make -C /usr/src/linux-headers-3.8-2-amd64 SUBDIRS=`pwd`  O=/usr/src/linux-headers-3.8-2-amd64 CONFIG_SPL=m modules
make[1]: Entering directory `/usr/src/linux-headers-3.8-2-amd64'
  Building modules, stage 2.
  MODPOST 5 modules
  CC      /usr/local/src/spl/module/sun-gcm/sun-gcm.mod.o
  LD [M]  /usr/local/src/spl/module/sun-gcm/sun-gcm.ko
make[1]: Leaving directory `/usr/src/linux-headers-3.8-2-amd64'
root@sid64:/usr/local/src/spl/module# tail -n2 sun-gcm/sun-gcm.mod.c
"depends=";

lundman commented 11 years ago

<behlendorf> lundman: Kernel module X doesn't know per se that it has a
             dependency on module Y.  What module X knows is that it requires
             specific symbols, modprobe is smart enough to work out what
             modules provide those symbols.
<behlendorf> lundman: See the depmod utility which handles this.
<behlendorf> If the dependencies can't be worked out automatically by what's
             been made available through EXPORT_SYMBOL, you can have module X
             request module Y be loaded with request_module().