sfjro / aufs-standalone


BUG: looking up invalid subclass: 9 while exec cp operation in kernel6.1+aufs #23

Closed Siri-gl closed 1 year ago

Siri-gl commented 1 year ago

I applied the AUFS patches to kernel 6.1:

patch -p1 < ../kernel/aufs-standalone-aufs6.1/aufs6-kbuild.patch
patch -p1 < ../kernel/aufs-standalone-aufs6.1/aufs6-base.patch
patch -p1 < ../kernel/aufs-standalone-aufs6.1/aufs6-mmap.patch
cp -a ../kernel/aufs-standalone-aufs6.1/{Documentation,fs} ./
cp ../kernel/aufs-standalone-aufs6.1/include/uapi/linux/aufs_type.h include/uapi/linux/aufs_type.h

Then I ran the following in the initrd:

mount /mnt/system/rootfs.squashfs /mnt/squashfs
mount -t tmpfs tmpfs /mnt/aufs
mount -t aufs -o dirs=/mnt/aufs=rw:/mnt/squashfs=ro aufs /mnt/aufs
cp -a /mnt/system/rootfs.squashfs /mnt/aufs/

[ 82.806082] aufs test_add:291:mount[209]: uid/gid/perm /mnt/squashfs 0/0/0755, 0/0/01777
[ 225.629963] BUG: looking up invalid subclass: 9
[ 225.629983] turning off the locking correctness validator.
[ 225.629996] CPU: 7 PID: 214 Comm: cp Not tainted 6.1.0 #2
[ 225.630014] Hardware name: LENOVO 20VD/LNVNB161216, BIOS F8CN38WW(V2.03) 04/09/2021
[ 225.630030] Call Trace:
[ 225.630039]
[ 225.630047] dump_stack_lvl+0x4a/0x65
[ 225.630062] dump_stack+0x10/0x16
[ 225.630071] look_up_lock_class+0xcf/0x120
[ 225.630086] register_lock_class+0x4a/0x4a0
[ 225.630904] ? reacquire_held_locks+0xc7/0x1d0
[ 225.631706] ? au_pin_hdir_lock+0x29/0x70
[ 225.632525] __lock_acquire.constprop.0+0x4d/0x550
[ 225.633326] lock_acquire+0xba/0x1a0
[ 225.634129] ? au_pin_and_icpup+0x2af/0x420
[ 225.634950] down_write_nested+0x33/0xc0
[ 225.635799] ? au_pin_and_icpup+0x2af/0x420
[ 225.636700] au_pin_and_icpup+0x2af/0x420
[ 225.637620] ? au_digen_test+0x5f/0x80
[ 225.638618] aufs_setattr+0x2d9/0x480
[ 225.639679] notify_change+0x28e/0x5c0
[ 225.640750] ? __this_cpu_preempt_check+0x13/0x20
[ 225.641792] vfs_utimes+0x129/0x250
[ 225.642809] ? vfs_utimes+0x129/0x250
[ 225.643584] do_utimes+0xde/0x140
[ 225.644291] __x64_sys_utimensat+0x7a/0xc0
[ 225.644940] do_syscall_64+0x37/0x90
[ 225.645582] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 225.646227] RIP: 0033:0x4ab92a
[ 225.646855] Code: 73 01 c3 48 c7 c1 e0 ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 41 89 ca b8 18 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 c7 c2 e0 ff ff ff f7
[ 225.647514] RSP: 002b:00007ffde4ea5818 EFLAGS: 00000246 ORIG_RAX: 0000000000000118
[ 225.648119] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004ab92a
[ 225.648722] RDX: 00007ffde4ea5820 RSI: 00000000025566e0 RDI: 00000000ffffff9c
[ 225.649327] RBP: 0000000000002405 R08: 0000000001000000 R09: 000000000064b8f8
[ 225.649927] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
[ 225.650532] R13: 0000000000000004 R14: 0000000000008124 R15: 0000000000000004
[ 225.651128]

appendix: appendix.tar.gz
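
Before running the mount steps above, it can help to confirm that the patched kernel actually registers every filesystem type the steps rely on. The following is a minimal sketch and not part of the original report: the helper name fs_registered is hypothetical, and it assumes the usual /proc/filesystems layout (one type per line, optionally prefixed by "nodev").

```shell
#!/bin/sh
# fs_registered NAME [LIST_FILE]
# Succeeds if NAME is listed as a filesystem type in LIST_FILE.
# LIST_FILE defaults to /proc/filesystems; the second argument exists
# only so the helper can be exercised against a sample file.
fs_registered() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}

# The three types used by the mount commands in the report.
for fs in squashfs tmpfs aufs; do
    if fs_registered "$fs"; then
        echo "$fs: registered"
    else
        echo "$fs: not registered (module not loaded or not built in)"
    fi
done
```

If aufs is reported as not registered, the mount -t aufs step would fail before any of the locking code in the trace is reached, so this only rules out a build or module-loading problem.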

sfjro commented 1 year ago

Hello Siri-gl,

Siri-gl:

I made AUFS patch on kernel6.1:
patch -p1 < ../kernel/aufs-standalone-aufs6.1/aufs6-kbuild.patch
patch -p1 < ../kernel/aufs-standalone-aufs6.1/aufs6-base.patch
patch -p1 < ../kernel/aufs-standalone-aufs6.1/aufs6-mmap.patch
cp -a ../kernel/aufs-standalone-aufs6.1/{Documentation,fs} ./
cp ../kernel/aufs-standalone-aufs6.1/include/uapi/linux/aufs_type.h include/uapi/linux/aufs_type.h
Then operate in initrd:
mount /mnt/system/rootfs.squashfs /mnt/squashfs
mount -t tmpfs tmpfs /mnt/aufs
mount -t aufs -o dirs=/mnt/aufs=rw:/mnt/squashfs=ro aufs /mnt/aufs
cp -a /mnt/system/rootfs.squashfs /mnt/aufs/

[ 82.806082] aufs test_add:291:mount[209]: uid/gid/perm /mnt/squashfs 0/0/0755, 0/0/01777
[ 225.629963] BUG: looking up invalid subclass: 9
:::

Thanx for the report. So you are using aufs with CONFIG_LOCKDEP enabled. Up to aufs5.12, aufs5-standalone.git had a small patch called 'lockdep-debug.patch'. It was removed in aufs5.13 because I mistakenly thought it was no longer necessary. Now I understand that I was wrong, and that at least MAX_LOCKDEP_SUBCLASSES has to be increased. Here is a patch for you. If it works well, I will put it into aufs[56]-standalone.git as a new lockdep-debug.patch.

J. R. Okajima

diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
index d22430840b53..83a70b8f826a 100644
--- a/include/linux/lockdep_types.h
+++ b/include/linux/lockdep_types.h
@@ -12,7 +12,7 @@
 
 #include <linux/types.h>
 
-#define MAX_LOCKDEP_SUBCLASSES		8UL
+#define MAX_LOCKDEP_SUBCLASSES		(8UL + 4)
 
 enum lockdep_wait_type {
 	LD_WAIT_INV = 0,	/* not checked, catch all */
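
Since the patch above only matters when the kernel is built with CONFIG_LOCKDEP, a quick check of the kernel configuration shows whether a given build is affected. This is a minimal sketch rather than anything from the aufs tree: the helper name lockdep_enabled and the KCONFIG variable are hypothetical, and the default path /boot/config-$(uname -r) is an assumption that holds on many distributions but not all.

```shell
#!/bin/sh
# lockdep_enabled CONFIG_FILE
# Succeeds if the given kernel config file contains CONFIG_LOCKDEP=y.
lockdep_enabled() {
    grep -q '^CONFIG_LOCKDEP=y$' "$1"
}

# Assumed default location; override with KCONFIG=/path/to/.config.
config="${KCONFIG:-/boot/config-$(uname -r)}"

if [ -r "$config" ]; then
    if lockdep_enabled "$config"; then
        echo "CONFIG_LOCKDEP is enabled in $config"
    else
        echo "CONFIG_LOCKDEP is not enabled in $config"
    fi
else
    echo "no readable kernel config at $config"
fi
```

For a kernel built from source, as in this report, pointing KCONFIG at the .config in the build tree is the more direct check.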

Siri-gl commented 1 year ago

Hello,

The previous error disappeared, but another error occurs when I start the KDE Plasma desktop via switch_root:

Jan 3 14:59:10 localhost kernel: [ 286.112865] BUG: kernel NULL pointer dereference, address: 0000000000000078
Jan 3 14:59:10 localhost kernel: [ 286.112870] #PF: supervisor read access in kernel mode
Jan 3 14:59:10 localhost kernel: [ 286.112871] #PF: error_code(0x0000) - not-present page
Jan 3 14:59:10 localhost kernel: [ 286.112873] PGD 364d4d067 P4D 364d4d067 PUD 364d4a067 PMD 0
Jan 3 14:59:10 localhost kernel: [ 286.112876] Oops: 0000 [#1] PREEMPT SMP NOPTI
Jan 3 14:59:10 localhost kernel: [ 286.112878] CPU: 1 PID: 932 Comm: startplasma-x11 Tainted: G E 6.1.0 #3
Jan 3 14:59:10 localhost kernel: [ 286.112880] Hardware name: LENOVO 20VD/LNVNB161216, BIOS F8CN42WW(V2.05) 06/28/2021
Jan 3 14:59:10 localhost kernel: [ 286.112881] RIP: 0010:apparmor_file_open+0x99/0x310
Jan 3 14:59:10 localhost kernel: [ 286.112885] Code: 00 00 00 0f 85 87 02 00 00 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 48 8b 97 70 01 00 00 48 63 05 67 cd 2f 01 <48> 8b 52 78 48 8b 1c 02 48 85 db 0f 84 e6 00 00 00 f6 43 41 08 75
Jan 3 14:59:10 localhost kernel: [ 286.112887] RSP: 0018:ffff99f381c23af0 EFLAGS: 00010246
Jan 3 14:59:10 localhost kernel: [ 286.112889] RAX: 0000000000000000 RBX: ffffffff856fca30 RCX: 0000000000000000
Jan 3 14:59:10 localhost kernel: [ 286.112890] RDX: 0000000000000000 RSI: ffffffff855af64a RDI: ffff8be5c10f4800
Jan 3 14:59:10 localhost kernel: [ 286.112891] RBP: ffff99f381c23b30 R08: 0000000000000000 R09: 0000000000000001
Jan 3 14:59:10 localhost kernel: [ 286.112893] R10: ffff8be828e40e58 R11: ffff8be84af59b45 R12: ffff8be5c10f4800
Jan 3 14:59:10 localhost kernel: [ 286.112894] R13: ffff8be81d7d4500 R14: 0000000000000000 R15: ffff8be5c10f4810
Jan 3 14:59:10 localhost kernel: [ 286.112895] FS: 00007f4c135a1900(0000) GS:ffff8be95fa40000(0000) knlGS:0000000000000000
Jan 3 14:59:10 localhost kernel: [ 286.112896] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 3 14:59:10 localhost kernel: [ 286.112897] CR2: 0000000000000078 CR3: 0000000364fd0004 CR4: 0000000000770ee0
Jan 3 14:59:10 localhost kernel: [ 286.112899] PKRU: 55555554
Jan 3 14:59:10 localhost kernel: [ 286.112900] Call Trace:
Jan 3 14:59:10 localhost kernel: [ 286.112901]
Jan 3 14:59:10 localhost kernel: [ 286.112904] security_file_open+0x2e/0x60
Jan 3 14:59:10 localhost kernel: [ 286.112908] do_dentry_open+0x104/0x430
Jan 3 14:59:10 localhost kernel: [ 286.112911] finish_open+0x1c/0x30
Jan 3 14:59:10 localhost kernel: [ 286.112913] shmem_tmpfile+0x9a/0xb0
Jan 3 14:59:10 localhost kernel: [ 286.112916] vfs_tmpfile+0xc0/0x180
Jan 3 14:59:10 localhost kernel: [ 286.112918] ? alloc_file+0xb9/0x110
Jan 3 14:59:10 localhost kernel: [ 286.112920] vfs_tmpfile_open+0x40/0x80
Jan 3 14:59:10 localhost kernel: [ 286.112922] aufs_tmpfile+0x1d1/0x4e0
Jan 3 14:59:10 localhost kernel: [ 286.112925] vfs_tmpfile+0xc0/0x180
Jan 3 14:59:10 localhost kernel: [ 286.112927] path_openat+0x85b/0xcc0
Jan 3 14:59:10 localhost kernel: [ 286.112929] ? mod_objcg_state+0x166/0x470
Jan 3 14:59:10 localhost kernel: [ 286.112932] do_filp_open+0xb4/0x160
Jan 3 14:59:10 localhost kernel: [ 286.112934] ? _raw_spin_unlock+0x2c/0x50
Jan 3 14:59:10 localhost kernel: [ 286.112937] ? alloc_fd+0xc0/0x160
Jan 3 14:59:10 localhost kernel: [ 286.112940] do_sys_openat2+0x9a/0x160
Jan 3 14:59:10 localhost kernel: [ 286.112942] __x64_sys_openat+0x6c/0xa0
Jan 3 14:59:10 localhost kernel: [ 286.112945] do_syscall_64+0x37/0x90
Jan 3 14:59:10 localhost kernel: [ 286.112947] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Jan 3 14:59:10 localhost kernel: [ 286.112949] RIP: 0033:0x7f4c17c9f4e4
Jan 3 14:59:10 localhost kernel: [ 286.112951] Code: f0 25 00 00 41 00 3d 00 00 41 00 74 49 64 8b 04 25 18 00 00 00 85 c0 75 6d 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 90 00 00 00 48 8b 54 24 28 64 48 2b 14 25
Jan 3 14:59:10 localhost kernel: [ 286.112952] RSP: 002b:00007ffea7c9fad0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
Jan 3 14:59:10 localhost kernel: [ 286.112954] RAX: ffffffffffffffda RBX: 0000000000490002 RCX: 00007f4c17c9f4e4
Jan 3 14:59:10 localhost kernel: [ 286.112955] RDX: 0000000000490002 RSI: 000000000126f8f8 RDI: 00000000ffffff9c
Jan 3 14:59:10 localhost kernel: [ 286.112956] RBP: 000000000126f8f8 R08: 0000000000000007 R09: 000000000126fbc0
Jan 3 14:59:10 localhost kernel: [ 286.112957] R10: 00000000000001b6 R11: 0000000000000246 R12: 000000000126f820
Jan 3 14:59:10 localhost kernel: [ 286.112959] R13: 0000000000000017 R14: 000000000126fb20 R15: 0000000000000001
Jan 3 14:59:10 localhost kernel: [ 286.112961]

syslog: syslog.tar.gz

sfjro commented 1 year ago

Siri-gl:

Jan 3 14:59:10 localhost kernel: [ 286.112865] BUG: kernel NULL pointer dereference, address: 0000000000000078
:::
Jan 3 14:59:10 localhost kernel: [ 286.112881] RIP: 0010:apparmor_file_open+0x99/0x310

Thanx for the report. I think I could reproduce the problem on my test machine. Give me some time, because I am concentrating on another issue.

J. R. Okajima

sfjro commented 1 year ago

Siri-gl:

Jan 3 14:59:10 localhost kernel: [ 286.112865] BUG: kernel NULL pointer dereference, address: 0000000000000078
:::
Jan 3 14:59:10 localhost kernel: [ 286.112881] RIP: 0010:apparmor_file_open+0x99/0x310

Here is a patch for you, but it is not tested. Reviewing the code, I think the bug was introduced in the commit
	ba444833ee27 2022-10-27 aufs: for v6.1-rc1, O_TMPFILE
and aufs should not pass NULL as the parameter 'cred'.

If it works well on your side and passes my local tests (maybe next week), then the patch will be merged into the aufs release.

J. R. Okajima

diff --git a/fs/aufs/i_op_add.c b/fs/aufs/i_op_add.c
index 37c3fb490877..7b49567215aa 100644
--- a/fs/aufs/i_op_add.c
+++ b/fs/aufs/i_op_add.c
@@ -473,7 +473,7 @@ int aufs_tmpfile(struct user_namespace *userns, struct inode *dir,
 	h_ppath.mnt = h_mnt;
 	h_ppath.dentry = h_parent;
 	h_file = vfs_tmpfile_open(h_userns, &h_ppath, mode, /*open_flag*/0,
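
The oops quoted above is raised inside apparmor_file_open, so when trying to reproduce it on another machine it may be worth recording whether AppArmor is among the active security modules. The sketch below is an illustration only and not part of the patch: the helper name lsm_active is hypothetical, and it assumes securityfs is mounted so that /sys/kernel/security/lsm holds the comma-separated list of active LSMs.

```shell
#!/bin/sh
# lsm_active NAME [LIST_FILE]
# Succeeds if NAME appears in the comma-separated LSM list in LIST_FILE.
# LIST_FILE defaults to /sys/kernel/security/lsm; the second argument
# exists only so the helper can be exercised against a sample file.
lsm_active() {
    list="${2:-/sys/kernel/security/lsm}"
    [ -r "$list" ] || return 1
    tr ',' '\n' < "$list" | grep -qx "$1"
}

if lsm_active apparmor; then
    echo "AppArmor is active"
else
    echo "AppArmor is not active (or securityfs is not mounted)"
fi
```

Whether the crash can occur without AppArmor is not stated in the thread, so the result is only a data point for a bug report, not a test for the bug itself.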

Siri-gl commented 1 year ago

sfjro: Thanks, it works well now. I will do more testing!

sfjro commented 1 year ago

------- Blind-Carbon-Copy

From: "J. R. Okajima" @.>
To: @.
Subject: aufs5 and aufs6 GIT release (v6.2-r1)
Date: Mon, 09 Jan 2023 01:46:19 +0900

o bugfix

o misc.

J. R. Okajima


------- End of Blind-Carbon-Copy

Siri-gl commented 1 year ago

sfjro: It has worked well for a week. I think it has been solved, thanks.