SCST-project / scst

SCST is a SCSI target software stack that allows exporting any block device or file via iSCSI, FC or RDMA (SRP).
http://scst.sourceforge.net

SCST crashes with kernel versions > 6.1.91. Are the XFS/iomap changes in Linux kernel 6.1.92 causing the problem? #250

Open Mike-Speedcracker opened 4 weeks ago

Mike-Speedcracker commented 4 weeks ago

Hi there,

I've been using SCST for a long time, so first of all, thanks for this nice piece of software.

I'm writing because, since the changes in kernel 6.1.92, I've been having trouble with SCST. I think the kernel dumps shown below are caused by these commits in kernel 6.1.92: e811fec51c66a0056459daa1ac834aea7d8d98f5, ea67e73129fceffd40b9193da93544c34d81b9c2, 54a37e5d07478358dcbf6e73b6c7e40e50a6f375, 580f40b4c956f38e83f66ebed4d81bbe4a7d82fb, 12339ec6fe4d41e69a81a13ca5e1c443fbe5bcba... and so on.

With kernel versions up to and including 6.1.91, everything works fine. Every kernel newer than 6.1.91 throws a kernel dump.
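The boundary described above can be checked mechanically on a given machine. The following is a minimal sketch, assuming the release string has the form seen in the backtrace (`6.1.106_LFS_FILE01`). The names `parse_release` and `is_affected` are hypothetical, 6.1.91 is treated as the last good release (following the issue title), and restricting the check to the 6.1.y series is an assumption, since no other series is mentioned in this report.

```python
import re

# Last 6.1.y release reported as working in this issue (assumption: the
# boundary is exactly 6.1.91, as the issue title states).
LAST_GOOD = (6, 1, 91)


def parse_release(release):
    """Extract (major, minor, patch) from a release string such as
    '6.1.106_LFS_FILE01'; a missing patch level is treated as 0."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_affected(release):
    """True for 6.1.y kernels newer than 6.1.91. Other series are not
    covered by the report, so they are reported as not affected."""
    version = parse_release(release)
    return version[:2] == LAST_GOOD[:2] and version > LAST_GOOD


print(is_affected("6.1.106_LFS_FILE01"))  # release string from the backtrace
print(is_affected("6.1.91"))              # last release reported as working
```

On a live system the argument could come from `platform.release()` in the Python standard library instead of a hard-coded string.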

I'm using XFS as the underlying file system with a 32-bit kernel. SCST is using the modules iscsi_scst, scst_vdisk and scst.

I'm using the latest SCST git release (with the latest commit from 2024-08-19).

Here is the kernel-backtrace:

[Tue Aug 20 00:02:00 2024] ------------[ cut here ]------------
[Tue Aug 20 00:02:00 2024] WARNING: CPU: 5 PID: 2048 at fs/iomap/buffered-io.c:980 iomap_file_buffered_write_punch_delalloc+0x3b0/0x440
[Tue Aug 20 00:02:00 2024] Modules linked in: iscsi_scst(O) scst_vdisk(O) scst(O) dlm quota_v2 quota_tree autofs4 tcp_bbr sch_fq udf crc_itu_t input_leds led_class ses enclosure hid_generic wmi_bmof edac_mce_amd crc32_pclmul usbhid aesni_intel crypto_simd uas hid usb_storage rapl r8169 bnx2 i2c_piix4 mpt3sas ccp i2c_core sha1_generic k10temp pcspkr video wmi backlight
[Tue Aug 20 00:02:00 2024] CPU: 5 PID: 2048 Comm: disk042_0 Tainted: G S O 6.1.106_LFS_FILE01 #1
[Tue Aug 20 00:02:00 2024] Hardware name: To Be Filled By O.E.M. X370 Pro4/X370 Pro4, BIOS P10.08 01/22/2024
[Tue Aug 20 00:02:00 2024] EIP: iomap_file_buffered_write_punch_delalloc+0x3b0/0x440
[Tue Aug 20 00:02:00 2024] Code: 26 00 89 c6 89 d8 e8 4f e3 ed ff f0 ff 4b 1c 75 9c 89 d8 e8 22 0a ef ff 66 90 eb 91 8d b6 00 00 00 00 0f 0b e9 f5 fd ff ff 90 <0f> 0b e9 df fd ff ff 90 0f 0b 8b 45 ec 8b 55 f0 89 f9 39 c6 19 d1
[Tue Aug 20 00:02:00 2024] EAX: a733e000 EBX: 0007f000 ECX: fffffef2 EDX: 0000010e
[Tue Aug 20 00:02:00 2024] ESI: a733f000 EDI: 00000000 EBP: 86df3bd0 ESP: 86df3b80
[Tue Aug 20 00:02:00 2024] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010293
[Tue Aug 20 00:02:00 2024] CR0: 80050033 CR2: 37f87000 CR3: 0185a6e0 CR4: 00350ef0
[Tue Aug 20 00:02:00 2024] Call Trace:
[Tue Aug 20 00:02:00 2024] ? show_regs.cold+0x16/0x1b
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x3b0/0x440
[Tue Aug 20 00:02:00 2024] ? warn+0x87/0xe0
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x3b0/0x440
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x3b0/0x440
[Tue Aug 20 00:02:00 2024] ? report_bug+0xe5/0x170
[Tue Aug 20 00:02:00 2024] ? exc_overflow+0x60/0x60
[Tue Aug 20 00:02:00 2024] ? handle_bug+0x2a/0x50
[Tue Aug 20 00:02:00 2024] ? exc_invalid_op+0x1e/0x70
[Tue Aug 20 00:02:00 2024] ? handle_exception+0x101/0x101
[Tue Aug 20 00:02:00 2024] ? exc_overflow+0x60/0x60
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x3b0/0x440
[Tue Aug 20 00:02:00 2024] ? exc_overflow+0x60/0x60
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x3b0/0x440
[Tue Aug 20 00:02:00 2024] ? xfs_dax_write_iomap_end+0xa0/0xa0
[Tue Aug 20 00:02:00 2024] xfs_buffered_write_iomap_end+0x52/0xc0
[Tue Aug 20 00:02:00 2024] ? xfs_buffered_write_iomap_end+0xc0/0xc0
[Tue Aug 20 00:02:00 2024] iomap_iter+0xce/0x4b0
[Tue Aug 20 00:02:00 2024] ? xfs_dax_write_iomap_end+0xa0/0xa0
[Tue Aug 20 00:02:00 2024] iomap_file_buffered_write+0xa9/0x420
[Tue Aug 20 00:02:00 2024] xfs_file_buffered_write+0x9d/0x2e0
[Tue Aug 20 00:02:00 2024] xfs_file_write_iter+0xc9/0x100
[Tue Aug 20 00:02:00 2024] fileio_exec_async+0x25e/0x3a0 [scst_vdisk]
[Tue Aug 20 00:02:00 2024] fileio_exec_write+0x2ce/0x400 [scst_vdisk]
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0xdd/0xf0
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0xd7/0xf0
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0xd1/0xf0
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0xcb/0xf0
[Tue Aug 20 00:02:00 2024] vdev_do_job+0x36/0xe0 [scst_vdisk]
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0x8f/0xf0
[Tue Aug 20 00:02:00 2024] fileio_exec+0x1f/0x30 [scst_vdisk]
[Tue Aug 20 00:02:00 2024] scst_do_real_exec+0x51/0x130 [scst]
[Tue Aug 20 00:02:00 2024] scst_exec_check_blocking+0xa8/0x220 [scst]
[Tue Aug 20 00:02:00 2024] scst_process_active_cmd+0x200/0x18f0 [scst]
[Tue Aug 20 00:02:00 2024] scst_cmd_thread+0x15c/0x500 [scst]
[Tue Aug 20 00:02:00 2024] ? prepare_to_wait_event+0x160/0x160
[Tue Aug 20 00:02:00 2024] kthread+0xd2/0x100
[Tue Aug 20 00:02:00 2024] ? scst_cmd_done_local+0x90/0x90 [scst]
[Tue Aug 20 00:02:00 2024] ? kthread_complete_and_exit+0x20/0x20
[Tue Aug 20 00:02:00 2024] ret_from_fork+0x1c/0x28
[Tue Aug 20 00:02:00 2024] ---[ end trace 0000000000000000 ]---
[Tue Aug 20 00:02:00 2024] ------------[ cut here ]------------
[Tue Aug 20 00:02:00 2024] WARNING: CPU: 5 PID: 2048 at fs/iomap/buffered-io.c:993 iomap_file_buffered_write_punch_delalloc+0x2f0/0x440
[Tue Aug 20 00:02:00 2024] Modules linked in: iscsi_scst(O) scst_vdisk(O) scst(O) dlm quota_v2 quota_tree autofs4 tcp_bbr sch_fq udf crc_itu_t input_leds led_class ses enclosure hid_generic wmi_bmof edac_mce_amd crc32_pclmul usbhid aesni_intel crypto_simd uas hid usb_storage rapl r8169 bnx2 i2c_piix4 mpt3sas ccp i2c_core sha1_generic k10temp pcspkr video wmi backlight
[Tue Aug 20 00:02:00 2024] CPU: 5 PID: 2048 Comm: disk042_0 Tainted: G S W O 6.1.106_LFS_FILE01 #1
[Tue Aug 20 00:02:00 2024] Hardware name: To Be Filled By O.E.M. X370 Pro4/X370 Pro4, BIOS P10.08 01/22/2024
[Tue Aug 20 00:02:00 2024] EIP: iomap_file_buffered_write_punch_delalloc+0x2f0/0x440
[Tue Aug 20 00:02:00 2024] Code: 8b 7d f0 01 c2 c1 e2 0c c7 45 d8 00 00 00 00 89 55 d4 39 d6 89 f9 83 d9 00 0f 8d 1e ff ff ff 89 75 d4 89 7d d8 e9 13 ff ff ff <0f> 0b 39 45 dc 8b 4d e4 19 d1 0f 8c b8 00 00 00 8b 45 ec 8b 7d dc
[Tue Aug 20 00:02:00 2024] EAX: a733f000 EBX: 00000000 ECX: a733f000 EDX: 00000000
[Tue Aug 20 00:02:00 2024] ESI: a733f000 EDI: 00000000 EBP: 86df3bd0 ESP: 86df3b80
[Tue Aug 20 00:02:00 2024] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010246
[Tue Aug 20 00:02:00 2024] CR0: 80050033 CR2: 37f87000 CR3: 0185a6e0 CR4: 00350ef0
[Tue Aug 20 00:02:00 2024] Call Trace:
[Tue Aug 20 00:02:00 2024] ? show_regs.cold+0x16/0x1b
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x2f0/0x440
[Tue Aug 20 00:02:00 2024] ? warn+0x87/0xe0
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x2f0/0x440
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x2f0/0x440
[Tue Aug 20 00:02:00 2024] ? report_bug+0xe5/0x170
[Tue Aug 20 00:02:00 2024] ? exc_overflow+0x60/0x60
[Tue Aug 20 00:02:00 2024] ? handle_bug+0x2a/0x50
[Tue Aug 20 00:02:00 2024] ? exc_invalid_op+0x1e/0x70
[Tue Aug 20 00:02:00 2024] ? handle_exception+0x101/0x101
[Tue Aug 20 00:02:00 2024] ? exc_overflow+0x60/0x60
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x2f0/0x440
[Tue Aug 20 00:02:00 2024] ? exc_overflow+0x60/0x60
[Tue Aug 20 00:02:00 2024] ? iomap_file_buffered_write_punch_delalloc+0x2f0/0x440
[Tue Aug 20 00:02:00 2024] ? xfs_dax_write_iomap_end+0xa0/0xa0
[Tue Aug 20 00:02:00 2024] xfs_buffered_write_iomap_end+0x52/0xc0
[Tue Aug 20 00:02:00 2024] ? xfs_buffered_write_iomap_end+0xc0/0xc0
[Tue Aug 20 00:02:00 2024] iomap_iter+0xce/0x4b0
[Tue Aug 20 00:02:00 2024] ? xfs_dax_write_iomap_end+0xa0/0xa0
[Tue Aug 20 00:02:00 2024] iomap_file_buffered_write+0xa9/0x420
[Tue Aug 20 00:02:00 2024] xfs_file_buffered_write+0x9d/0x2e0
[Tue Aug 20 00:02:00 2024] xfs_file_write_iter+0xc9/0x100
[Tue Aug 20 00:02:00 2024] fileio_exec_async+0x25e/0x3a0 [scst_vdisk]
[Tue Aug 20 00:02:00 2024] fileio_exec_write+0x2ce/0x400 [scst_vdisk]
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0xdd/0xf0
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0xd7/0xf0
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0xd1/0xf0
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0xcb/0xf0
[Tue Aug 20 00:02:00 2024] vdev_do_job+0x36/0xe0 [scst_vdisk]
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0x8f/0xf0
[Tue Aug 20 00:02:00 2024] fileio_exec+0x1f/0x30 [scst_vdisk]
[Tue Aug 20 00:02:00 2024] scst_do_real_exec+0x51/0x130 [scst]
[Tue Aug 20 00:02:00 2024] scst_exec_check_blocking+0xa8/0x220 [scst]
[Tue Aug 20 00:02:00 2024] scst_process_active_cmd+0x200/0x18f0 [scst]
[Tue Aug 20 00:02:00 2024] scst_cmd_thread+0x15c/0x500 [scst]
[Tue Aug 20 00:02:00 2024] ? prepare_to_wait_event+0x160/0x160
[Tue Aug 20 00:02:00 2024] kthread+0xd2/0x100
[Tue Aug 20 00:02:00 2024] ? scst_cmd_done_local+0x90/0x90 [scst]
[Tue Aug 20 00:02:00 2024] ? kthread_complete_and_exit+0x20/0x20
[Tue Aug 20 00:02:00 2024] ret_from_fork+0x1c/0x28
[Tue Aug 20 00:02:00 2024] ---[ end trace 0000000000000000 ]---
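Both warnings share the same call path, and in a kernel call trace the frames prefixed with `?` are addresses the unwinder found on the stack but could not confirm as part of the call chain. As a rough triage aid, the sketch below extracts the warning sites and the unprefixed frames from such a log. `SAMPLE` is a shortened excerpt of the first warning above, and `summarize` and the regular expressions are hypothetical helpers written for this log format, not part of SCST or any kernel tool.

```python
import re

# Shortened excerpt of the first warning above, one log entry per line.
SAMPLE = """\
[Tue Aug 20 00:02:00 2024] WARNING: CPU: 5 PID: 2048 at fs/iomap/buffered-io.c:980 iomap_file_buffered_write_punch_delalloc+0x3b0/0x440
[Tue Aug 20 00:02:00 2024] Call Trace:
[Tue Aug 20 00:02:00 2024] ? report_bug+0xe5/0x170
[Tue Aug 20 00:02:00 2024] xfs_buffered_write_iomap_end+0x52/0xc0
[Tue Aug 20 00:02:00 2024] iomap_iter+0xce/0x4b0
[Tue Aug 20 00:02:00 2024] xfs_file_write_iter+0xc9/0x100
[Tue Aug 20 00:02:00 2024] fileio_exec_async+0x25e/0x3a0 [scst_vdisk]
[Tue Aug 20 00:02:00 2024] ? switch_to_asm+0xdd/0xf0
[Tue Aug 20 00:02:00 2024] scst_cmd_thread+0x15c/0x500 [scst]
[Tue Aug 20 00:02:00 2024] ---[ end trace 0000000000000000 ]---
"""

TIMESTAMP = re.compile(r"^\[[^\]]*\]\s*")           # leading "[Tue Aug ...] "
WARNING = re.compile(r"WARNING: .* at (\S+:\d+) (\w+)\+")
FRAME = re.compile(r"^([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?$")


def summarize(log_text):
    """Return (sites, frames): the (file:line, function) of every WARNING
    and the (function, module) of each call-trace frame without a '?'."""
    sites, frames, in_trace = [], [], False
    for raw in log_text.splitlines():
        line = TIMESTAMP.sub("", raw).strip()
        hit = WARNING.search(line)
        if hit:
            sites.append((hit.group(1), hit.group(2)))
        elif line == "Call Trace:":
            in_trace = True
        elif line.startswith("---[ end trace"):
            in_trace = False
        elif in_trace and not line.startswith("?"):
            frame = FRAME.match(line)
            if frame:
                # Frames without a "[module]" suffix belong to the core kernel.
                frames.append((frame.group(1), frame.group(2) or "kernel"))
    return sites, frames


sites, frames = summarize(SAMPLE)
for location, function in sites:
    print("warning at", location, "in", function)
for function, module in frames:
    print(f"  {module:10s} {function}")
```

Run against the full log, this lists the SCST frames (`scst`, `scst_vdisk`) up to `fileio_exec_async`, followed by XFS and iomap frames in the core kernel, where the warning itself is raised.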

Now my question is: where should we look for the problem? In the kernel or in the SCST source code?

Thanks in advance and for further investigation, Mike

Mike-Speedcracker commented 1 week ago

Addendum:

I asked ChatGPT what the cause could be by looking at the kernel 6.1.91 and 6.1.92 changelogs. Here is the answer...


After comparing the changelogs for Kernel 6.1.91 and 6.1.92, I noticed several XFS-related changes in 6.1.92, particularly around Copy-on-Write (CoW) and buffer management. These updates could be the root cause of your issue, especially since your setup uses SCST, which interacts with the filesystem. The changes in buffer invalidation handling during CoW operations may be causing conflicts with SCST's I/O operations, leading to the backtrace.

Testing different I/O patterns or temporarily reverting these changes might help narrow down the issue further.
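One way to try different I/O patterns without SCST in the path is to issue plain buffered writes to a file on the XFS mount and watch the kernel log. The sketch below is only an illustration under assumptions: `write_pattern`, the pattern names, and the sizes are hypothetical, the demo writes to a temporary directory rather than the real backing store, and nothing in this thread confirms that any of these patterns triggers the warning. Ordinary `pwrite` calls without `O_DIRECT` go through the filesystem's buffered write path, though not necessarily with the same request shapes SCST produces.

```python
import os
import tempfile


def write_pattern(path, block_size, count, stride, sync_every=0):
    """Write `count` blocks of `block_size` bytes, block i at offset
    i * stride, using buffered pwrite(). Optionally fsync after every
    `sync_every` blocks. Returns the total number of bytes written."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        payload = b"\xa5" * block_size
        written = 0
        for i in range(count):
            written += os.pwrite(fd, payload, i * stride)
            if sync_every and (i + 1) % sync_every == 0:
                os.fsync(fd)
        return written
    finally:
        os.close(fd)


# Hypothetical patterns: (name, block_size, count, stride, sync_every).
PATTERNS = [
    ("sequential-4k", 4096, 64, 4096, 0),
    ("unaligned-4k", 4096, 64, 4097, 0),
    ("sparse-512", 512, 8, 1 << 20, 0),
    ("sequential-4k-fsync", 4096, 64, 4096, 8),
]

if __name__ == "__main__":
    # Assumption: replace the temporary directory with a directory on the
    # XFS file system under test to exercise that file system.
    with tempfile.TemporaryDirectory() as directory:
        for name, block_size, count, stride, sync_every in PATTERNS:
            target = os.path.join(directory, name + ".img")
            total = write_pattern(target, block_size, count, stride, sync_every)
            print(f"{name}: wrote {total} bytes")
```

If a pattern produces the same warning on an affected kernel with the SCST modules unloaded, that would point toward the kernel side; if none does, the result is inconclusive rather than proof that SCST is at fault.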

Perhaps it is interesting? :)

Regards, Mike