Open-CAS / open-cas-linux

Open CAS Linux
https://open-cas.com
BSD 3-Clause "New" or "Revised" License

Kernel OOPS on kernel 6.0.0-6 (Debian Bookworm) #1433

Closed by nfournil 1 year ago

nfournil commented 1 year ago

Hello,

Compilation and .deb package creation are OK, but once the module is loaded and a cache is started, the kernel logs a BUG trace and performance is terrible (1k IOPS max on NVMe):

[  539.021069] cas_cache: loading out-of-tree module taints kernel.
[  539.021428] cas_cache: module verification failed: signature and/or required key missing - tainting kernel
[  539.481967] 'OCF_Core' volume operations registered
[  539.501989] 'OCF_Cache' volume operations registered
[  539.521217] 'OCF Composite' volume operations registered
[  539.873511] 'Block_Device' volume operations registered
[  539.873744] Open Cache Acceleration Software Linux Version 22.06.0.0838.master (6.0.0-6-amd64)::Module loaded successfully
[  713.871748] Inserting cache cache1
[  713.872759] cache1: Metadata initialized
[  713.873015] cache1: Successfully added
[  713.873017] cache1: Cache mode : wt
[  713.873574] Thread cas_io_1_0 started
[  713.874157] Thread cas_io_1_1 started
...
[  713.907585] Thread cas_io_1_100 started
[  713.907944] Thread cas_io_1_101 started
[  713.908357] Thread cas_io_1_102 started
[  713.908719] Thread cas_io_1_103 started
[  713.909025] Thread cas_mngt_1 started
[  713.909037] BUG: using smp_processor_id() in preemptible [00000000] code: casadm/68510
[  713.909090] caller is cas_rpool_try_get+0x1b/0xb0 [cas_cache]
[  713.909163] CPU: 24 PID: 68510 Comm: casadm Tainted: G           OE      6.0.0-6-amd64 #1  Debian 6.0.12-1
[  713.909173] Hardware name: Lenovo ThinkSystem SR650 V2/7Z73CTO1WW, BIOS AFE120G-1.40 09/20/2022
[  713.909176] Call Trace:
[  713.909183]  <TASK>
[  713.909189]  dump_stack_lvl+0x44/0x5c
[  713.909208]  check_preemption_disabled+0xe1/0xf0
[  713.909221]  cas_rpool_try_get+0x1b/0xb0 [cas_cache]
[  713.909261]  env_allocator_new+0x39/0xd0 [cas_cache]
[  713.909299]  env_mpool_new+0x5a/0x90 [cas_cache]
[  713.909336]  ocf_req_new+0x7a/0x210 [cas_cache]
[  713.909385]  ocf_pipeline_create+0x48/0xb0 [cas_cache]
[  713.909427]  ? _cache_mngt_core_flush_complete+0x80/0x80 [cas_cache]
[  713.909470]  ocf_mngt_cache_attach+0xc4/0x1e0 [cas_cache]
[  713.909521]  cache_mngt_init_instance+0x692/0x710 [cas_cache]
[  713.909563]  cas_service_ioctl_ctrl+0x2297/0x27ce [cas_cache]
[  713.909603]  __x64_sys_ioctl+0x8d/0xd0
[  713.909616]  do_syscall_64+0x37/0xc0
[  713.909623]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  713.909634] RIP: 0033:0x7fadf6261bab
[  713.909640] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[  713.909646] RSP: 002b:00007ffd81abb530 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  713.909655] RAX: ffffffffffffffda RBX: 00007ffd81abb630 RCX: 00007fadf6261bab
[  713.909659] RDX: 00007ffd81abb720 RSI: ffffffffd030ba29 RDI: 0000000000000004
[  713.909663] RBP: 0000000000000003 R08: 0000000000000000 R09: 00007ffd81abb4f7
[  713.909666] R10: 0000000000000008 R11: 0000000000000246 R12: ffffffffd030ba29
[  713.909670] R13: 00007ffd81abb720 R14: 0000000000000004 R15: 00007ffd81abb590
[  713.909677]  </TASK>
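
For readers triaging a long log like this one, the relevant pair of lines is the `BUG: using smp_processor_id() in preemptible` warning and the `caller is ...` line right after it. Below is a minimal sketch, assuming the log is available as plain text in the format shown above; `summarize_preempt_bugs` is a hypothetical helper written for illustration, not part of Open CAS or any kernel tooling.

```python
import re

# Matches the warning line, e.g.
# "BUG: using smp_processor_id() in preemptible [00000000] code: casadm/68510"
BUG_RE = re.compile(
    r"BUG: using (?P<func>\w+)\(\) in preemptible \[\w+\] "
    r"code: (?P<comm>\S+)/(?P<pid>\d+)"
)

# Matches the follow-up line, e.g.
# "caller is cas_rpool_try_get+0x1b/0xb0 [cas_cache]"
CALLER_RE = re.compile(
    r"caller is (?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def summarize_preempt_bugs(log_text):
    """Return one dict per 'in preemptible' BUG report found in log_text."""
    reports = []
    current = None
    for line in log_text.splitlines():
        bug = BUG_RE.search(line)
        if bug:
            current = {
                "func": bug["func"],
                "comm": bug["comm"],
                "pid": int(bug["pid"]),
                "caller": None,
                "module": None,
            }
            reports.append(current)
            continue
        caller = CALLER_RE.search(line)
        # Attach only the first "caller is" line seen after a BUG line.
        if caller and current is not None and current["caller"] is None:
            current["caller"] = caller["symbol"]
            current["module"] = caller["module"]
    return reports
```

Counting the returned reports per `caller` gives a quick view of whether every trace in a full log comes from the same function.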


nfournil commented 1 year ago

Attaching the FULL log (there are many cores and threads, and each one triggers the trace!), covering cache init and module removal (which works OK): opencas_kernelcrash_full_log.txt

nfournil commented 1 year ago

This is related to Open CAS not being compatible with a preemptible kernel. I had to build a "custom" kernel.

Duplicate of: https://github.com/Open-CAS/open-cas-linux/issues/1414
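
Since the problem is tied to kernel preemption, it can help to check which preemption model a given kernel was built with before installing the module. The following is a minimal sketch, assuming the kernel build configuration is available as text (on Debian it is typically shipped as `/boot/config-<release>`, which is an assumption about the system layout); `preemption_model` is a hypothetical helper, not part of Open CAS.

```python
def preemption_model(config_text):
    """Return (model, dynamic) parsed from kernel .config text.

    model is one of "rt", "full", "voluntary", "none" or "unknown";
    dynamic is True when CONFIG_PREEMPT_DYNAMIC=y is present.
    """
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_PREEMPT is not set" and blanks.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if value == "y":
            enabled.add(key)

    # Check the most preemptible options first.
    if "CONFIG_PREEMPT_RT" in enabled:
        model = "rt"
    elif "CONFIG_PREEMPT" in enabled:
        model = "full"
    elif "CONFIG_PREEMPT_VOLUNTARY" in enabled:
        model = "voluntary"
    elif "CONFIG_PREEMPT_NONE" in enabled:
        model = "none"
    else:
        model = "unknown"

    return model, "CONFIG_PREEMPT_DYNAMIC" in enabled
```

A possible use is `preemption_model(open("/boot/config-" + platform.release()).read())` after `import platform`. When the second value is true, the configured model is only the build-time default, so the result should be read as a hint rather than a definitive answer about the running kernel.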