cameron314 / concurrentqueue

A fast multi-producer, multi-consumer lock-free concurrent queue for C++11
Other
9.53k stars 1.66k forks source link

Free block concurrent to extracting an element from the block #325

Closed graphicsMan closed 1 year ago

graphicsMan commented 1 year ago

I upgraded to the latest mainline version of the moodycamel concurrentqueue library recently. I see TSAN errors around freeing a block inside of (this->parent->add_block_to_free_list(block)) on one thread while another thread is getting an element from the block (element = std::move(el); // NOLINT)

It appears as though the block is being returned to the allocator via std::free concurrent to another thread moving the element out of that block.

One relevant difference appears to be the new RECYCLE_ALLOCATED_BLOCKS setting.

Relevant stack traces look like this:

Write of size 8 at 0x7b4800022ef8 by thread T109:

Previous read of size 8 at 0x7b4800022ef8 by thread T158:

Nothing obvious jumps out at me as buggy. One possibility is that maybe due to memory order, the set_empty decision might occur prior to the element move, which would allow the free of the block even while another thread still could use the block? Completely spitballing.

If helpful, I can try to make a repro case. It's currently possible to repro this with open source code that isn't too terribly complex with a simple replacement of the old moodycamel code with recent version. Let me know what would be helpful.

Additionally, if you are completely sure that this is falsely triggered by TSAN, please let me know that too. I may need to make a wrapper for the concurrentqueue that turns off TSAN for every function I use.

graphicsMan commented 1 year ago

Based on my reading of set_empty, assuming there is not some other non-obvious issue, because the fetch_add is using memory_order_release, the compiler should not allow reordering of the element move to ever happen after the block is freed. This very likely falls into the category of false positives for TSAN.

cameron314 commented 1 year ago

I suspect the same thing, though I haven't had time to double check yet.

In general, tsan produces false positives with low-level lock-free data structures unless they're perfectly annotated, which mine is not. I believe you can also specify a file-glob pattern in a suppression file to ignore warnings coming from using this header.