open-iscsi / targetcli-fb

Command shell for managing Linux LIO kernel target
Apache License 2.0
103 stars 69 forks source link

iSCSI targets are blocked and do not recover until system reboot #195

Closed kjavheri closed 1 year ago

kjavheri commented 1 year ago

we are running targetlci v2.1.53 on CentOS Stream release 8 with kernel v4.18.0-394. we are getting below messages in /var/log/messages every 3-4 seconds.


kernel: iSCSI Login timeout on Network Portal 10.74.67.11:3260
kernel: iSCSI Login negotiation failed.

After this message, if we try to restart the target service, it goes into hang state and doesn't restart for very long time. we can't even kill the process. Final workaround is to restart the target VM to make the service normal again.

we also found some call traces in the /var/log/messages as below.


Dec 28 05:49:07 kernel: iSCSI Login timeout on Network Portal 10.74.67.11:3260
Dec 28 05:49:07 kernel: iSCSI Login negotiation failed.
Dec 28 05:49:23 kernel: iSCSI Login timeout on Network Portal 10.74.67.11:3260
Dec 28 05:49:23 kernel: tx_data returned -104, expecting 48.
Dec 28 05:49:23 kernel: iSCSI Login negotiation failed.
Dec 28 05:49:26 kernel: INFO: task kworker/0:2:110640 blocked for more than 120 seconds.
Dec 28 05:49:26 kernel:      Not tainted 4.18.0-394.el8.x86_64 #1
Dec 28 05:49:26 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 28 05:49:26 kernel: task:kworker/0:2     state:D stack:    0 pid:110640 ppid:     2 flags:0x80004080
Dec 28 05:49:26 kernel: Workqueue: events target_tmr_work [target_core_mod]
Dec 28 05:49:26 kernel: Call Trace:
Dec 28 05:49:26 kernel: __schedule+0x2d1/0x830
Dec 28 05:49:26 kernel: ? __switch_to_asm+0x35/0x70
Dec 28 05:49:26 kernel: ? __switch_to_asm+0x41/0x70
Dec 28 05:49:26 kernel: schedule+0x35/0xa0
Dec 28 05:49:26 kernel: schedule_timeout+0x274/0x300
Dec 28 05:49:26 kernel: ? __schedule+0x2d9/0x830
Dec 28 05:49:26 kernel: ? __irq_work_queue_local+0x4b/0x50
Dec 28 05:49:26 kernel: ? irq_work_queue+0x16/0x20
Dec 28 05:49:26 kernel: wait_for_completion+0x96/0x100
Dec 28 05:49:26 kernel: target_put_cmd_and_wait+0x6e/0x90 [target_core_mod]
Dec 28 05:49:26 kernel: core_tmr_abort_task.cold.7+0x197/0x212 [target_core_mod]
Dec 28 05:49:26 kernel: target_tmr_work+0x96/0x110 [target_core_mod]
Dec 28 05:49:26 kernel: process_one_work+0x1a7/0x360
Dec 28 05:49:26 kernel: worker_thread+0x30/0x390
Dec 28 05:49:26 kernel: ? create_worker+0x1a0/0x1a0
Dec 28 05:49:26 kernel: kthread+0x10a/0x120
Dec 28 05:49:26 kernel: ? set_kthread_struct+0x40/0x40
Dec 28 05:49:26 kernel: ret_from_fork+0x35/0x40

This issue is reproducible after simple test of turning off network between initiator and target VM.

maurizio-lombardi commented 1 year ago

Hello, this is a kernel issue and a new bugzilla should be opened against Red Hat's Centos Stream. targetcli has no control over kernel target's internal processing.

kjavheri commented 1 year ago

Thanks for your prompt reply, @maurizio-lombardi, is this a new issue in itself that we need to report to Red Hat or there is an existing defect around it? Any supporting document available?

o Is this a known kernel issue? o If so, what kernel version are affected (and which ones are not)?

kjavheri commented 1 year ago

Apparently targetcli on debian is also facing issue!

maurizio-lombardi commented 1 year ago

@kjavheri Sorry I missed your previous message, I will look at this next Monday

maurizio-lombardi commented 1 year ago

This looks like a known issue in the target driver (not in targetcli). Is it possible for you to test this patch? http://bsdbackstore.eu/misc/0001-target-iscsi-handle-abort-for-WRITE_PENDING-cmds.patch

hbhutkar commented 1 year ago

Hi @maurizio-lombardi,

I'm working with @kjavheri, I was trying this patch but getting some errors while compiling kernel.

Stack dump:
0.      Program arguments: clang -g -D__TARGET_ARCH_x86 -mlittle-endian -I/var/rpmbuild/BUILD/kernel-4.18.0-394.el8/linux-4.18.0-394.el8.local.x86_64/tools/testing/selftests/bpf/tools/include -I/var/rpmbuild/BUILD/kernel-4.18.0-394.el8/linux-4.18.0-394.el8.local.x86_64/tools/testing/selftests/bpf -I/var/rpmbuild/BUILD/kernel-4.18.0-394.el8/linux-4.18.0-394.el8.local.x86_64/tools/include/uapi -I/var/rpmbuild/BUILD/kernel-4.18.0-394.el8/linux-4.18.0-394.el8.local.x86_64/tools/testing/selftests/usr/include -idirafter /usr/lib64/clang/15.0.7/include -idirafter /usr/local/include -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 -target bpf -c progs/pyperf600.c -o /var/rpmbuild/BUILD/kernel-4.18.0-394.el8/linux-4.18.0-394.el8.local.x86_64/tools/testing/selftests/bpf/pyperf600.o -mcpu=v3
1.      <eof> parser at end of file
2.      Code generation
 #0 0x00007fd68d70d490 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/lib64/libLLVM-15.so+0xc55490)
 #1 0x00007fd68d70b024 llvm::sys::RunSignalHandlers() (/lib64/libLLVM-15.so+0xc53024)
 #2 0x00007fd68d70b9d9 llvm::sys::CleanupOnSignal(unsigned long) (/lib64/libLLVM-15.so+0xc539d9)
 #3 0x00007fd68d61812b llvm::CrashRecoveryContext::HandleExit(int) (/lib64/libLLVM-15.so+0xb6012b)
 #4 0x00007fd68d704232 llvm::sys::Process::Exit(int, bool) (/lib64/libLLVM-15.so+0xc4c232)
 #5 0x000055b9fb54fe4f (/usr/bin/clang-15+0x14e4f)
 #6 0x00007fd68d62bf83 llvm::report_fatal_error(llvm::Twine const&, bool) (/lib64/libLLVM-15.so+0xb73f83)
 #7 0x00007fd68d62c11e (/lib64/libLLVM-15.so+0xb7411e)
 #8 0x00007fd690653340 (/lib64/libLLVM-15.so+0x3b9b340)
 #9 0x00007fd68efc152a llvm::MCAssembler::layout(llvm::MCAsmLayout&) (/lib64/libLLVM-15.so+0x250952a)
#10 0x00007fd68efc161c llvm::MCAssembler::Finish() (/lib64/libLLVM-15.so+0x250961c)
#11 0x00007fd68efe4eeb llvm::MCELFStreamer::finishImpl() (/lib64/libLLVM-15.so+0x252ceeb)
#12 0x00007fd68f009cad llvm::MCStreamer::finish(llvm::SMLoc) (/lib64/libLLVM-15.so+0x2551cad)
#13 0x00007fd68e07af9b llvm::AsmPrinter::doFinalization(llvm::Module&) (/lib64/libLLVM-15.so+0x15c2f9b)
#14 0x00007fd68d84be34 llvm::FPPassManager::doFinalization(llvm::Module&) (/lib64/libLLVM-15.so+0xd93e34)
#15 0x00007fd68d856650 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/lib64/libLLVM-15.so+0xd9e650)
#16 0x00007fd69496997d clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/lib64/libclang-cpp.so.15+0x199c97d)
#17 0x00007fd694cdb7ec (/lib64/libclang-cpp.so.15+0x1d0e7ec)
#18 0x00007fd693a943d1 clang::ParseAST(clang::Sema&, bool, bool) (/lib64/libclang-cpp.so.15+0xac73d1)
#19 0x00007fd694cdae1d clang::CodeGenAction::ExecuteAction() (/lib64/libclang-cpp.so.15+0x1d0de1d)
#20 0x00007fd6955c6a51 clang::FrontendAction::Execute() (/lib64/libclang-cpp.so.15+0x25f9a51)
#21 0x00007fd69555a83b clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/lib64/libclang-cpp.so.15+0x258d83b)
#22 0x00007fd69564698c clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/lib64/libclang-cpp.so.15+0x267998c)
#23 0x000055b9fb55138c cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/bin/clang-15+0x1638c)
#24 0x000055b9fb54a99b (/usr/bin/clang-15+0xf99b)
#25 0x00007fd695247479 (/lib64/libclang-cpp.so.15+0x227a479)
#26 0x00007fd68d617fd7 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/lib64/libLLVM-15.so+0xb5ffd7)
#27 0x00007fd695247c65 (/lib64/libclang-cpp.so.15+0x227ac65)
#28 0x00007fd695217ced clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/lib64/libclang-cpp.so.15+0x224aced)
#29 0x00007fd6952186ee clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/lib64/libclang-cpp.so.15+0x224b6ee)
#30 0x00007fd6952268d6 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/lib64/libclang-cpp.so.15+0x22598d6)
#31 0x000055b9fb54f00e clang_main(int, char**) (/usr/bin/clang-15+0x1400e)
#32 0x00007fd68bdfed85 __libc_start_main (/lib64/libc.so.6+0x3ad85)
#33 0x000055b9fb54a4de _start (/usr/bin/clang-15+0xf4de)
clang-15: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 15.0.7 (Red Hat 15.0.7-1.module_el8.8.0+1258+af79b238)
Target: bpf
Thread model: posix
InstalledDir: /usr/bin
clang-15: note: diagnostic msg:
********************

Trying this patch on Kernel version is 4.18.0-394 on CentOS Stream release 8. I'm following these steps to apply patch and compile kernel. Can you please help us here? It'll be very helpful if you tell us correct way in case we are missing any steps.

maurizio-lombardi commented 1 year ago

@hbhutkar Let me prepare a patched 4.18.0-394 for you, I will send a link with the RPMs

maurizio-lombardi commented 1 year ago

@hbhutkar here you can download the 4.18.0-394 kernel RPMs that include the patch I mentioned: http://bsdbackstore.eu/misc/i195/

hbhutkar commented 1 year ago

Thanks @maurizio-lombardi very much. I will try this patch and let you know.

hbhutkar commented 1 year ago

Hi @maurizio-lombardi, We did some initial testing with these kernel RPMs, and it looks working and resolving our problem. The "target" service doesn't go into hang state anymore after network termination and restoring between initiator and target vm. We will be doing some more testing around this and let you know if any issue occurs. We are also facing similar issue for debian based targetcli and looking for solution to that as well.Would compiling this patch for debian based kernel also work?

For reference pasting the trace from debian 11 (bullseye) system

hdrsadmin@hdrs-iscsitarget:~$ uname -r
5.10.0-21-amd64
hdrsadmin@hdrs-iscsitarget:~$

Trace logs:

Feb  2 01:01:32 deb-iscsi-target kernel: [752213.494764] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 117457408
Feb  2 01:01:32 deb-iscsi-target kernel: [752213.749502] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 67109376
Feb  2 01:01:54 deb-iscsi-target kernel: [752235.222931] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 134262016
Feb  2 01:01:54 deb-iscsi-target kernel: [752235.222934] ABORT_TASK: Found referenced iSCSI task_tag: 67143680
Feb  2 01:01:57 deb-iscsi-target kernel: [752238.236759] ABORT_TASK: Found referenced iSCSI task_tag: 100689152
Feb  2 01:02:05 deb-iscsi-target kernel: [752246.233825] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 83903232
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920197] task:iscsi_trx       state:D stack:    0 pid:14781 ppid:     2 flags:0x00004000
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920200] Call Trace:
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920213]  __schedule+0x282/0x870
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920234]  ? usleep_range+0x90/0x90
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920236]  schedule+0x46/0xb0
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920237]  schedule_timeout+0x8b/0x150
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920283]  ? target_show_cmd+0x29d/0x420 [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920286]  ? __next_timer_interrupt+0x110/0x110
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920287]  __wait_for_common+0xb7/0x170
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920294]  __transport_wait_for_tasks+0xb7/0x120 [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920301]  transport_generic_free_cmd+0xf2/0x1f0 [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920329]  iscsit_free_cmd+0x55/0xc0 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920343]  iscsit_close_connection+0x5d2/0x990 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920348]  iscsit_take_action_for_connection_exit+0x7e/0x100 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920353]  iscsi_target_rx_thread+0xb1/0x100 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920357]  ? iscsi_target_tx_thread+0x200/0x200 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920360]  kthread+0x118/0x140
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920364]  ? __kthread_bind_mask+0x60/0x60
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920366]  ret_from_fork+0x1f/0x30
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920381] task:kworker/0:1     state:D stack:    0 pid:14896 ppid:     2 flags:0x00004000
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920392] Workqueue: events target_tmr_work [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920393] Call Trace:
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920395]  __schedule+0x282/0x870
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920396]  ? usleep_range+0x90/0x90
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920398]  schedule+0x46/0xb0
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920399]  schedule_timeout+0x107/0x150
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920402]  ? __prepare_to_swait+0x4f/0x70
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920403]  __wait_for_common+0xb7/0x170
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920409]  target_put_cmd_and_wait+0x6d/0xb0 [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920417]  core_tmr_abort_task.cold+0x1a8/0x26a [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920424]  target_tmr_work+0xa3/0xf0 [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920426]  process_one_work+0x1b3/0x350
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920429]  worker_thread+0x53/0x3e0
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920430]  ? process_one_work+0x350/0x350
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920431]  kthread+0x118/0x140
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920432]  ? __kthread_bind_mask+0x60/0x60
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920433]  ret_from_fork+0x1f/0x30
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920437] task:kworker/0:0     state:D stack:    0 pid:14901 ppid:     2 flags:0x00004000
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920444] Workqueue: events target_tmr_work [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920445] Call Trace:
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920447]  __schedule+0x282/0x870
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920448]  ? usleep_range+0x90/0x90
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920449]  schedule+0x46/0xb0
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920450]  schedule_timeout+0x107/0x150
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920451]  ? __prepare_to_swait+0x4f/0x70
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920452]  __wait_for_common+0xb7/0x170
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920457]  target_put_cmd_and_wait+0x6d/0xb0 [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920465]  core_tmr_abort_task.cold+0x1a8/0x26a [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920471]  target_tmr_work+0xa3/0xf0 [target_core_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920472]  process_one_work+0x1b3/0x350
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920473]  worker_thread+0x53/0x3e0
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920474]  ? process_one_work+0x350/0x350
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920475]  kthread+0x118/0x140
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920476]  ? __kthread_bind_mask+0x60/0x60
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920477]  ret_from_fork+0x1f/0x30
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920480] task:kworker/0:3     state:D stack:    0 pid:14902 ppid:     2 flags:0x00004000
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920486] Workqueue: events iscsi_target_do_login_rx [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920486] Call Trace:
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920488]  __schedule+0x282/0x870
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920489]  ? usleep_range+0x90/0x90
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920493]  schedule+0x46/0xb0
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920494]  schedule_timeout+0x107/0x150
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920498]  ? iscsi_decode_text_input+0x2fc/0x810 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920499]  ? __prepare_to_swait+0x4f/0x70
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920500]  __wait_for_common+0xb7/0x170
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920504]  iscsi_check_for_session_reinstatement+0x1ef/0x290 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920512]  iscsi_target_do_login+0x2e4/0x580 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920517]  iscsi_target_do_login_rx+0x2c9/0x3f0 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920522]  ? iscsi_target_restore_sock_callbacks+0xc0/0xc0 [iscsi_target_mod]
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920522]  process_one_work+0x1b3/0x350
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920523]  worker_thread+0x53/0x3e0
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920524]  ? process_one_work+0x350/0x350
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920525]  kthread+0x118/0x140
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920526]  ? __kthread_bind_mask+0x60/0x60
Feb  2 01:04:56 deb-iscsi-target kernel: [752417.920527]  ret_from_fork+0x1f/0x30
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750412] task:kworker/0:1     state:D stack:    0 pid:14896 ppid:     2 flags:0x00004000
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750457] Workqueue: events target_tmr_work [target_core_mod]
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750461] Call Trace:
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750468]  __schedule+0x282/0x870
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750470]  ? usleep_range+0x90/0x90
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750471]  schedule+0x46/0xb0
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750472]  schedule_timeout+0x107/0x150
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750476]  ? __prepare_to_swait+0x4f/0x70
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750477]  __wait_for_common+0xb7/0x170
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750484]  target_put_cmd_and_wait+0x6d/0xb0 [target_core_mod]
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750492]  core_tmr_abort_task.cold+0x1a8/0x26a [target_core_mod]
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750498]  target_tmr_work+0xa3/0xf0 [target_core_mod]
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750500]  process_one_work+0x1b3/0x350
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750506]  worker_thread+0x53/0x3e0
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750506]  ? process_one_work+0x350/0x350
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750508]  kthread+0x118/0x140
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750509]  ? __kthread_bind_mask+0x60/0x60
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750512]  ret_from_fork+0x1f/0x30
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750516] task:kworker/0:0     state:D stack:    0 pid:14901 ppid:     2 flags:0x00004000
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750521] Workqueue: events target_tmr_work [target_core_mod]
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750522] Call Trace:
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750524]  __schedule+0x282/0x870
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750525]  ? usleep_range+0x90/0x90
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750526]  schedule+0x46/0xb0
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750527]  schedule_timeout+0x107/0x150
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750528]  ? __prepare_to_swait+0x4f/0x70
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750529]  __wait_for_common+0xb7/0x170
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750534]  target_put_cmd_and_wait+0x6d/0xb0 [target_core_mod]
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750540]  core_tmr_abort_task.cold+0x1a8/0x26a [target_core_mod]
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750545]  target_tmr_work+0xa3/0xf0 [target_core_mod]
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750546]  process_one_work+0x1b3/0x350
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750547]  worker_thread+0x53/0x3e0
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750548]  ? process_one_work+0x350/0x350
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750549]  kthread+0x118/0x140
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750550]  ? __kthread_bind_mask+0x60/0x60
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750551]  ret_from_fork+0x1f/0x30
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750553] task:kworker/0:3     state:D stack:    0 pid:14902 ppid:     2 flags:0x00004000
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750565] Workqueue: events iscsi_target_do_login_rx [iscsi_target_mod]
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750566] Call Trace:
Feb  2 01:06:57 deb-iscsi-target kernel: [752538.750567]  __schedule+0x282/0x870
maurizio-lombardi commented 1 year ago

@hbhutkar

We are also facing similar issue for debian based targetcli and looking for solution to that as well. Would compiling this patch for debian based kernel also work?

Yes, the patch would work on Debian's kernel too. But you will have to patch and compile the Debian kernel by yourself because I have little experience with that distro.

kjavheri commented 1 year ago

@maurizio-lombardi While we tested this as a private patch on our CentOS stream 8 and find it to be working, can you let me know whether this can be done as upstream patch or any kernel patch? Since this appears to fix the problem, could we get a upstream patch that could resolve this via an update? Is there any plan that an upcoming release fixes this?

kjavheri commented 1 year ago

Hi @maurizio-lombardi any comments or thoughts on my previous query?

maurizio-lombardi commented 1 year ago

@maurizio-lombardi While we tested this as a private patch on our CentOS stream 8 and find it to be working, can you let me know whether this can be done as upstream patch or any kernel patch? Since this appears to fix the problem, could we get a upstream patch that could resolve this via an update? Is there any plan that an upcoming release fixes this?

Hello, some days ago I asked the original developer of the patch if he plans to send the patch for review and inclusion but he didn't reply back. Let me ping him again.

maurizio-lombardi commented 1 year ago

@kjavheri The patch is going to be merged soon in the mainline Linux kernel: https://patchwork.kernel.org/project/target-devel/patch/20230319015620.96006-10-michael.christie@oracle.com/

kjavheri commented 1 year ago

Thank you so much for the reply and help @maurizio-lombardi.

kjavheri commented 1 year ago

Hi @maurizio-lombardi, The state of the patch to merge in mainline Linux kernel is still 'Accepted'. Do you know if there is any timeline when this will be available?

gonzoleeman commented 1 year ago

On May 1, 2023, at 11:14 PM, kjavheri @.***> wrote:

Hi @maurizio-lombardi https://github.com/maurizio-lombardi, The state of the patch to merge in mainline Linux kernel is still 'Accepted'. Do you know if there is any timeline when this will be available?

It looks like this patch was merged upstream at the end of March.

Of course if you’re using a distribution, then you’ll either have to wait until they backport this, or you will have to do it yourself. — Lee Duncan

cjuniorfox commented 1 year ago

I'm facing the same issue. Debian 12

$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:    12
Codename:   bookworm
$ uname -a
6.1.0-11-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.38-4 (2023-08-08) x86_64 GNU/Linux

The ISCSI targets stop working and I can't recover it until a system reboots. I tried to restart "targetclid" service. Not resolved. Tried removing the portal through targetcli for re-adding again after, but the utility hangs forever:

/iscsi/iqn.20...8f2bb/tpg1> portals/ delete 0.0.0.0 3260
$ dmesg
[ 1209.465365] task:iscsi_ttx       state:D stack:0     pid:3399  ppid:2      flags:0x00004004
[ 1209.465374] Call Trace:
[ 1209.465378]  <TASK>
[ 1209.465383]  __schedule+0x351/0xa20
[ 1209.465391]  schedule+0x5d/0xe0
[ 1209.465398]  schedule_timeout+0x94/0x150
[ 1209.465406]  ? __bpf_trace_tick_stop+0x10/0x10
[ 1209.465415]  wait_for_completion_timeout+0x7f/0x170
[ 1209.465425]  __transport_wait_for_tasks+0xb3/0x100 [target_core_mod]
[ 1209.465472]  transport_generic_free_cmd+0x12c/0x1a0 [target_core_mod]
[ 1209.465518]  iscsit_free_cmd+0x53/0xc0 [iscsi_target_mod]
[ 1209.465555]  iscsit_close_connection+0x583/0x980 [iscsi_target_mod]
[ 1209.465591]  ? iscsit_get_rx_pdu+0xfc0/0xfc0 [iscsi_target_mod]
[ 1209.465624]  iscsit_take_action_for_connection_exit+0x7c/0x100 [iscsi_target_mod]
[ 1209.465660]  iscsi_target_tx_thread+0x16a/0x230 [iscsi_target_mod]
[ 1209.465694]  ? dequeue_task_stop+0x70/0x70
[ 1209.465703]  kthread+0xe9/0x110
[ 1209.465710]  ? kthread_complete_and_exit+0x20/0x20
[ 1209.465718]  ret_from_fork+0x22/0x30
[ 1209.465728]  </TASK>
cjuniorfox commented 1 year ago
$ dmesg --follow
[ 1209.465365] task:iscsi_ttx       state:D stack:0     pid:3399  ppid:2      flags:0x00004004
[ 1209.465374] Call Trace:
[ 1209.465378]  <TASK>
[ 1209.465383]  __schedule+0x351/0xa20
[ 1209.465391]  schedule+0x5d/0xe0
[ 1209.465398]  schedule_timeout+0x94/0x150
[ 1209.465406]  ? __bpf_trace_tick_stop+0x10/0x10
[ 1209.465415]  wait_for_completion_timeout+0x7f/0x170
[ 1209.465425]  __transport_wait_for_tasks+0xb3/0x100 [target_core_mod]
[ 1209.465472]  transport_generic_free_cmd+0x12c/0x1a0 [target_core_mod]
[ 1209.465518]  iscsit_free_cmd+0x53/0xc0 [iscsi_target_mod]
[ 1209.465555]  iscsit_close_connection+0x583/0x980 [iscsi_target_mod]
[ 1209.465591]  ? iscsit_get_rx_pdu+0xfc0/0xfc0 [iscsi_target_mod]
[ 1209.465624]  iscsit_take_action_for_connection_exit+0x7c/0x100 [iscsi_target_mod]
[ 1209.465660]  iscsi_target_tx_thread+0x16a/0x230 [iscsi_target_mod]
[ 1209.465694]  ? dequeue_task_stop+0x70/0x70
[ 1209.465703]  kthread+0xe9/0x110
[ 1209.465710]  ? kthread_complete_and_exit+0x20/0x20
[ 1209.465718]  ret_from_fork+0x22/0x30
[ 1209.465728]  </TASK>
$ ps 3399
    PID TTY      STAT   TIME COMMAND
   3399 ?        D      0:01 [iscsi_ttx]

The process got into an uninterruptible sleep state and can't be killed. So I only able to restart the computer entirely

maurizio-lombardi commented 1 year ago

@cjuniorfox The patch that should fix this issue has been merged in the 6.4 kernel, your debian system is running the 6.1 kernel so you would need to merge it yourself or ask a debian maintainer to backport it.

maurizio-lombardi commented 1 year ago

I think it's clear now that it was a kernel's problem and that it has been fixed, if a particular distro backports the patch or not is outside of my control, so I close this issue.