microsoft / STL

MSVC's implementation of the C++ Standard Library.
Other
10.21k stars 1.51k forks source link

`<shared_mutex>`: debug visualizer #853

Open AlexGuteniev opened 4 years ago

AlexGuteniev commented 4 years ago

Today I helped my colleague to pinpoint deadlock cause in VS2015 project It was 30 seconds debug session. The first 10 seconds were as following: vis

But I think this feature seems to be lost for VS2019 .

What is more concerning, looks like it cannot be implemented for <shared_mutex>, since it is SRWLOCK based.

And I expect that vNext <mutex> will be the same, as I think it should be either SRWLOCK based or atomic wait based.

Is this correct that the goal of performance dominates here, so there will not be extra debug facilities, even in debug build? Any suggestions on deadlock debugging?

vNext note: Resolving this issue will require breaking binary compatibility. We won't be able to accept pull requests for this issue until the vNext branch is available. See #169 for more information.

StephanTLavavej commented 4 years ago

There’s one precedent for storing extra info purely for debug visualization (in regex; not sure if it was a good idea) but I don’t think switching to a completely different implementation in debug mode would be a good idea; that changes the behavior of the program compared to release mode significantly, even given the radical changes that debug mode already triggers (taking locks, etc.).

I haven’t debugged deadlocks but the other maintainers might have suggestions.

AlexGuteniev commented 4 years ago

I'm not looking into switching from SRWLOCK to, say, CRITICAL_SECTION.

I'm looking at following options:

Sure there are other ways to debug deadlocks. But this one - by using visualizer - is extremely productive.

I would still mind extra function call in release. But maybe something like "iterator debug level two" could be used here...

AlexGuteniev commented 4 years ago

Storing owner thread id could also be helpful to diagnose recursion on non-recursive shared_mutex (ans as I know, there's no shared_recursive_mutex, not even in plans). lock could have thrown system_error with resource_deadlock_would_occur, it is permitted to. But it also would require noexcept unstrengthening.

I'm mostly question if adding extra facitities to mutexes that would be active only debug mode is a good idea or not.

AlexGuteniev commented 4 years ago

My current opinion on this:

Mutexes, including recursive/nonrecursive, timed/nontimed, shared/nonshared could include (exclusively) owning thread id member - in both debug and release. For release builds, initialization of that member can be skipped, if it is not used otherwise than for debug helping (such as visualizer, or throwing deadlock exception).


Regarding whether SWRLOCK is the right way to implement mutexes. I'm not sure.

Arguments for implementing mutexes on WaitOnAddress:

Arguments against WaitOnAddress:

I'm not sure which is right decision here.

(However, I'm confident that for call_once "inlining implementation" is strong argument (it is DCL, not lock, so inlining matters in significant way), and the implementation is trivial, that's why #707 )

AdamBucior commented 4 years ago

Mutexes, including recursive/nonrecursive, timed/nontimed, shared/nonshared could include (exclusively) owning thread id member - in both debug and release. For release builds, initialization of that member can be skipped, if it is not used otherwise than for debug helping (such as visualizer, or throwing deadlock exception).

Why add this member in release mode if it's only used in debug mode?

AlexGuteniev commented 4 years ago

Why add this member in release mode if it's only used in debug mode?

To minimize had of misconfiguration when debug and release are used together. Maybe it is not needed, and detect_mismatch is enough.

AdamBucior commented 4 years ago

To minimize had of misconfiguration when debug and release are used together.

Release mode should focus on performance and not on trying to be ABI-safe with debug mode. Mixing debug and release modes is already not safe and shouldn't be if it would affect release mode performance.

StephanTLavavej commented 4 years ago

I agree with @AdamBucior - we've blocked mixing release and debug for many years with: https://github.com/microsoft/STL/blob/212de15c590010c961c0e7bb4025f6fec89866c8/stl/inc/yvals.h#L142-L152 Storing a thread ID in debug mode, which isn't otherwise used by the implementation, seems like a fairly reasonable cost to make certain debugging scenarios significantly easier. Marking this as a vNext issue.