Open dvyukov opened 9 years ago
Here are reports with similar stacks. However, the stacks are reversed, so this may be a missed synchronization "via a device". FTR, the csd is issued in __blk_complete_request.
ThreadSanitizer: data-race in llist_reverse_order
Read at 0xffff88047e91a178 of size 8 by thread 3006 on CPU 6:
[<ffffffff8156abbe>] llist_reverse_order+0x2e/0x60 lib/llist.c:97
[<ffffffff81133e0d>] flush_smp_call_function_queue+0x5d/0x1c0 kernel/smp.c:223 (discriminator 3)
[<ffffffff81134c57>] generic_smp_call_function_single_interrupt+0x17/0x80 kernel/smp.c:195
[< inline >] __smp_call_function_single_interrupt arch/x86/kernel/smp.c:309
[<ffffffff8105a4c8>] smp_trace_call_function_single_interrupt+0x58/0x160 arch/x86/kernel/smp.c:324
[<ffffffff81ee54ea>] trace_call_function_single_interrupt+0x8a/0xa0 arch/x86/entry/entry_64.S:811
Previous write at 0xffff88047e91a178 of size 8 by thread 57 on CPU 6:
[<ffffffff8154527f>] cfq_insert_request+0xef/0xc70 block/cfq-iosched.c:4018
[<ffffffff81509ab3>] __elv_add_request+0x293/0x4e0 block/elevator.c:659
[<ffffffff81516387>] blk_flush_plug_list+0x3a7/0x420 block/blk-core.c:3204
[< inline >] blk_schedule_flush_plug include/linux/blkdev.h:1077
[< inline >] sched_submit_work kernel/sched/core.c:3096
[<ffffffff81eddadf>] schedule+0x5f/0x80 kernel/sched/core.c:3103
[< inline >] add_transaction_credits fs/jbd2/transaction.c:207
[<ffffffff813b5eba>] start_this_handle+0x40a/0xa20 fs/jbd2/transaction.c:339
[<ffffffff813b666d>] jbd2__journal_start+0x19d/0x330 fs/jbd2/transaction.c:440
[<ffffffff8138ddca>] __ext4_journal_start_sb+0x9a/0x180 fs/ext4/ext4_jbd2.c:76
[< inline >] __ext4_journal_start fs/ext4/ext4_jbd2.h:312
[<ffffffff8133ff4b>] ext4_writepages+0x5eb/0x15e0 fs/ext4/inode.c:2492
[<ffffffff811e0453>] do_writepages+0x53/0x80 mm/page-writeback.c:2332
[<ffffffff812aac2f>] __writeback_single_inode+0x7f/0x510 fs/fs-writeback.c:1259 (discriminator 3)
[<ffffffff812ab5b8>] writeback_sb_inodes+0x4f8/0x720 fs/fs-writeback.c:1516
[<ffffffff812ab8a4>] __writeback_inodes_wb+0xc4/0x100 fs/fs-writeback.c:1562
[<ffffffff812abc8c>] wb_writeback+0x3ac/0x440 fs/fs-writeback.c:1666
[< inline >] wb_do_writeback fs/fs-writeback.c:1801
[<ffffffff812ac7b4>] wb_workfn+0x214/0x7e0 fs/fs-writeback.c:1852
[<ffffffff810b1d6e>] process_one_work+0x47e/0x930 kernel/workqueue.c:2036
[<ffffffff810b22d0>] worker_thread+0xb0/0x900 kernel/workqueue.c:2170
[<ffffffff810bba40>] kthread+0x150/0x170 kernel/kthread.c:209
[<ffffffff81ee420f>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:529
ThreadSanitizer: data-race in llist_reverse_order
Write at 0xffff88047e000758 of size 8 by thread 763 on CPU 8:
[<ffffffff8156abc9>] llist_reverse_order+0x39/0x60 lib/llist.c:98
[<ffffffff81133e0d>] flush_smp_call_function_queue+0x5d/0x1c0 kernel/smp.c:223 (discriminator 3)
[<ffffffff81134c57>] generic_smp_call_function_single_interrupt+0x17/0x80 kernel/smp.c:195
[< inline >] __smp_call_function_single_interrupt arch/x86/kernel/smp.c:309
[<ffffffff8105a4c8>] smp_trace_call_function_single_interrupt+0x58/0x160 arch/x86/kernel/smp.c:324
[<ffffffff81ee54ea>] trace_call_function_single_interrupt+0x8a/0xa0 arch/x86/entry/entry_64.S:811
Previous read at 0xffff88047e000758 of size 8 by thread 2823 on CPU 0:
[< inline >] cfq_check_fifo block/cfq-iosched.c:2895
[< inline >] cfq_dispatch_request block/cfq-iosched.c:3369
[<ffffffff8154321e>] cfq_dispatch_requests+0x58e/0x1550 block/cfq-iosched.c:3410
[< inline >] __elv_next_request block/blk.h:149
[<ffffffff81515916>] blk_peek_request+0xc6/0x4f0 block/blk-core.c:2284
[<ffffffff8189df8b>] scsi_request_fn+0x6b/0xb20 drivers/scsi/scsi_lib.c:1784
[< inline >] __blk_run_queue_uncond block/blk-core.c:310
[<ffffffff8150c06f>] __blk_run_queue+0x6f/0xa0 block/blk-core.c:328
[<ffffffff8150c0df>] blk_run_queue+0x3f/0x70 block/blk-core.c:360
[<ffffffff81899a9a>] scsi_run_queue+0x51a/0x5f0 drivers/scsi/scsi_lib.c:497
[<ffffffff8189b09b>] scsi_end_request+0x1db/0x300 drivers/scsi/scsi_lib.c:735
[<ffffffff8189ec57>] scsi_io_completion+0x137/0x8f0 drivers/scsi/scsi_lib.c:914
[<ffffffff818911be>] scsi_finish_command+0x15e/0x1f0 drivers/scsi/scsi.c:607
[<ffffffff8189ded2>] scsi_softirq_done+0x182/0x1d0
[<ffffffff8151fc86>] blk_done_softirq+0x136/0x160 block/blk-softirq.c:35
[<ffffffff81091c1e>] __do_softirq+0xbe/0x2f0 kernel/softirq.c:273
[<ffffffff81ee4d6a>] trace_apic_timer_interrupt+0x8a/0xa0 arch/x86/entry/entry_64.S:790
Here are two reports on commit c58422d251bc
__blk_complete_request queues a softirq that unblocks scsi_execute/blk_execute_rq, which wait on a completion. KTSAN misses that synchronization for some reason.
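For illustration, here is a minimal userspace sketch of the kind of happens-before edge a completion provides. It is not the kernel code: the demo_* names and run_completion_demo are hypothetical, and a pthread mutex/condvar pair stands in for the kernel's struct completion. One thread plays the completion path (the softirq in the reports), the other plays the submitter blocked in blk_execute_rq. If a race detector does not treat the signal/wait pair as a release/acquire, it would presumably flag the plain accesses to the request as racing, even though they are ordered.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct demo_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Hypothetical request: a plain (non-atomic) payload plus a completion. */
struct demo_request {
	int result;
	struct demo_completion wait;
};

/* Analogue of complete(): publish "done" and wake the waiter. */
static void demo_complete(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Analogue of wait_for_completion(): block until "done" is set. */
static void demo_wait_for_completion(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Plays the role of the completion path (the softirq in the reports). */
static void *demo_completion_path(void *arg)
{
	struct demo_request *rq = arg;

	rq->result = 42;		/* plain write, before signalling */
	demo_complete(&rq->wait);
	return NULL;
}

/* Plays the role of the submitter that blocks until the request is done. */
int run_completion_demo(void)
{
	struct demo_request rq = { .result = 0 };
	pthread_t thread;
	int rc, seen;

	pthread_mutex_init(&rq.wait.lock, NULL);
	pthread_cond_init(&rq.wait.cond, NULL);
	rq.wait.done = false;

	rc = pthread_create(&thread, NULL, demo_completion_path, &rq);
	assert(rc == 0);

	demo_wait_for_completion(&rq.wait);
	seen = rq.result;		/* plain read, ordered after the write */

	pthread_join(thread, NULL);
	pthread_cond_destroy(&rq.wait.cond);
	pthread_mutex_destroy(&rq.wait.lock);
	return seen;
}
```

Calling run_completion_demo() from a main() (linking with -pthread where required) returns the value written by the completion path. The write and the read touch the same memory with plain accesses, and only the complete/wait pair orders them.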