Closed benaadams closed 4 months ago
Should this type not be sealed
?
Also, should there be some in-box analyzers shipped alongside? I can think of:
Should this type not be
sealed
?
Yep added, thanks
I'm not clear on the extent of this proposal.
You're proposing that Monitor would special-case this with its existing object-based methods? But only some of Monitor's methods?
I'm not clear on the extent of this proposal.
I'm proposing a type that doesn't need to concern itself with the state of the syncblock so it will only be a thinlock and not worry about handling the flags BIT_SBLK_IS_HASH_OR_SYNCBLKINDEX
, BIT_SBLK_IS_HASHCODE
, BIT_SBLK_SPIN_LOCK
etc
So no special treatment anywhere in the runtime it's just setup so that it'll never have a stored hashcode and the syncblock will only ever be used for locking? Though it'd be a pretty niche case where you have a lock on something that has been hashed i'd have thought.
I'm proposing a type that doesn't need to concern itself with the state of the syncblock so it will only be a thinlock and not worry about handling the flags
I understand. But how are you proposing interacting with it? Still via Monitor?
Ah, get you point. Have added other methods.
Would still need to go via monitor currently to work with the lock
statement?
Though it'd be a pretty niche case where you have a lock on something that has been hashed i'd have thought.
Yes; but the handling for the niche case isn't without overhead. So the idea is to introduce a type that drops this overhead by being explicit in only being a lock.
Would still need to go via monitor currently to work with the lock statement?
Today, yes, though likely C# could be augmented to recognize the type in the future.
So the idea is to introduce a type that drops this overhead by being explicit in only being a lock.
A potentially much more impactful (and experimental) variation of this would be to actually only support locking on such a type and eliminate the memory overhead incurred for every other object in the system. Until code fully transitioned, locking on other objects would obviously be more expensive as the runtime would need to employ some kind of mapping to a Lock, e.g. https://github.com/dotnet/corert/blob/d0683071605aed956497506ed1aae4d366be0190/src/System.Private.CoreLib/src/System/Threading/Monitor.cs#L28
/cc @davidwrighton as FYI since we were coincidentally talking about this just the other day.
:) This brings back memories of the early days on the .NET Native project, when we actually implemented this type, and tied it into Monitor (so uses of this Lock
type were really great, and other objects had a perf penalty due to the cost of checking for the Lock
type. We eventually concluded penalizing all existing Monitor.Lock
usage (even slightly) was a poor choice, and reverted back to supporting arbitrary object locking as the first class primitive. One idea I had at the time, which I never explored in detail was the interesting fact that in CoreCLR, there is an empty and unused pointer sized field in every System.Object instance. In theory, if optimizing Monitor performance really only needs a single non-gc tracked pointer field, and most locking occurs on instances of the exact System.Object type, we could use that instead. This would have the same drawbacks a an integrated Lock object for reducing the performance of locking on other object types, but we've been telling customers for years to lock on System.Object instances. Perhaps we could find out if that would be a worthwhile optimization.
If we want to have a lock object type not integrated into Monitor
, then there is quite a bit of work required to make sure it integrates with the diagnostics infrastructure and the threading model correctly. (Monitor is an interruptible lock as it uses a COM interruptible wait so that STA frameworks such as WinForms and WPF work correctly. Any other lock that is intended for general use will also need to follow those strictures.)
in CoreCLR, there is an empty and unused pointer sized field in every System.Object instance
Why? What prevents us from just removing that?
interesting fact that in CoreCLR, there is an empty and unused pointer sized field in every System.Object instance.
Is this due to the min size of 12 bytes?
This would have the same drawbacks a an integrated Lock object for reducing the performance of locking on other object types, but we've been telling customers for years to lock on System.Object instances.
What would be performance the drawback? Due to an extra _lock.GetType() == typeof(object)
check to determine the code path?
.NET Native project, when we actually implemented this type, and tied it into Monitor
This implementation is still there .
In CoreCLR, the lock can take two paths: The thinlock path that just flips the bit in the object header is used as long as there is no contention that requires waiting or other special situations. Once the thin lock hits the contention, the syncblock (a piece of unmanaged memory) is allocated for the object by the VM, and all lock operations go through the syncblock. Going through syncblock has extra overhead - there are several indirections required to get to syncblock.
In CoreRT, the thinlock path is same as in CoreCLR (the thin lock path was not there in the early .NET Native days). Once the lock hits contention, one of these Lock
objects gets attached to the object via ConditionalWeakTable. The Lock
object + ConditionalWeakTable
is basically a replacement for syncblock as far as Monitor is concerned. I believe that the same strategy would work for CoreCLR too.
min size of 12/24 bytes
Here is where the min size comes into play: The current GC implementation needs to be able replace any chunk of free space with a special fake array to keep the heap walk-able. The object minimum size is a minimum size of this fake array. If we were to get rid of the free space on System.Object, we would need to introduce a special fake object that would be used for free space where the fake array does not fit, and all places in the GC that check for the free space would need to check for both these. Some of these checks are in hot loops that would take a perf hit.
The technically most challenging part for getting rid of object header is dealing with object hashcodes (ie RuntimeHelpers.GetHashCode
) during GC object relocation. Many other components depend on these hashcodes, e.g. ConditionalWeakTable. Until we have satisfactory solution for that, the rest does not matter.
What about making the Lock
type a struct, analogous to SpinLock
? Perhaps it could save some allocations, and it could potentially be integrated into the lock
statement while keeping it separate from locking on arbitrary objects with lock (ref _lock)
or even just lock (_lock)
with implied passing by ref particularly for the Lock
type.
The current implementation of NativeAOT locking is here:
System.Threading.Lock
that supports all scenarios expected from a recursive mutex.
This is a hybrid lock that will do a blocking wait after a bit of spinning.
While short-held locks benefit from spinning, for longer-held locks spinning is wasteful and more eager sleeping saves resources. The Lock will detect these cases and self-adjust based on the past history of the lock.NativeAOT managed runtime often bypasses the thin lock stage and just uses Lock
object directly. That makes sense for long-lived locks or locks that will inevitably inflate.
I think the NativeAOT Lock
could work nicely as a first class Lock type.
I have low expectations and am happy with a Lock
type that inherits from Object
and throws on GetHashCode
Currently every class is sorta a lock, but there is no type that expresses the intention, so when you want a vanilla lock you do
object _lock = new object();
Which seems weird; and incongruous with .NET where every api is spelled out very clearly and expresses its intent in naming. That clarity in apis is one of the things .NET is very good at, however its missing for the fundamental lock type, perhaps because everything is a lock?
i.e. to a new .NET developer wanting a lock; the answer isn't "use a lock" but "use any object" which isn't an obvious answer
Currently every class is sorta a lock, but there is no type that expresses the intention, so when you want a vanilla lock you do
object _lock = new object();
Tbf, I think as part of the early early design of the CLR, it was thought that locking on this
would be the standard thing to do, which necessitated the ability to make every class instance a lock. However, I guess since the languages didn't restrict the argument to be only this
, they ended up with the ability to use other object instances as locks, including objects that already used lock (this)
making it easy to deadlock. After that, it became the wisdom that locks should only be taken on owned objects completely invisible to outside consumers, which is where we are today.
Tbf, I think as part of the early early design of the CLR, it was thought that locking on
this
would be the standard thing to do, which necessitated the ability to make every class instance a lock. However, I guess since the languages didn't restrict the argument to be onlythis
, they ended up with the ability to use other object instances as locks, including objects that already usedlock (this)
making it easy to deadlock. After that, it became the wisdom that locks should only be taken on owned objects completely invisible to outside consumers, which is where we are today.
Sure and I'm not saying remove the standard locking mechanism; it is handy when you don't want an additional allocation, just introduce a type that conveys intention rather than using object
for when you want an explicit lock object
I think that the name Lock
is problematic for this class. Declaring a Lock
field is OK:
private readonly Lock _lock = new();
But declaring a local variable is a compile-time error:
Lock lock = new(); // Syntax error
The lock
is a C# keyword. And having to bypass this restriction by naming the variable @lock
is a bummer.
The
lock
is a C# keyword. And having to bypass this restriction by naming the variable@lock
is a bummer.
Would be uncommon? When would you define a local lock? Would only serve a purpose if you where also passing it as a parameter to methods that would execute in parallel otherwise it would have no function, as what is there to lock?
@benaadams it's indeed not very common. Last time I had a legitimate reason to use a local locker was two years ago, for synchronizing the localFinally
delegate of a Parallel.ForEach
loop:
public double RectangularIntegration(double xp, double xk, int n)
{
object locker = new();
double result = 0.0;
// Initializing the Partitioner and the ParallelOptions is omitted
Parallel.ForEach(partitioner, options, () => 0.0, (range, state, accumulator) =>
{
// Calculations that update the accumulator are omitted
return accumulator;
}, accumulator =>
{
lock (locker) { result += accumulator; }
});
return result;
}
In retrospect I could have used the partitioner
or the options
as the locker, instead of instantiating an object dedicated for locking. Actually I can find only 2-3 cases the last 4 years that I needed a local locker, and there was no other object available to use for this purpose.
@theodorzoulias @benaadams Keep in mind that not all code styles use a prefix for private fields. So the lock
name is problematic even outside of the context of local variables.
@theodorzoulias @benaadams Keep in mind that not all code styles use a prefix for private fields. So the
lock
name is problematic even outside of the context of local variables.
You know you can call the variable or field what ever you want? Is what you might name a variable a consideration what you might name the class? 🤔
@benaadams I am looking at the C# keywords, and I can find only one non-contextual keyword that has the same name with a .NET Class (differing in the capitalization of the first letter), excluding synonyms like decimal
-Decimal
, and the static class Volatile
. It's the keyword delegate
. Making the lock
the second occurrence of such a resemblance would continue a bad trend IMHO. I have already trouble naming my Delegate
variables!
locker, gate, sync, xxxLock etc - there are lots of options how to name a variable and there is no rule that variable should be named as a 'className'.
Personally I find xxxLock where xxx - some context to be the best choice because when you're using a lock you're using it to achieve 'something', not just because you want a lock for a locks sake.
Is it collection expansion lock, is it read lock, is it a write lock etc?
@En3Tho there is no rule that a List<T>
variable must be named list
, but imagine if the compiler would punish you if you tried to do so. It wouldn't be nice. But as a thought experiment let's say that the C# compiler allowed us to declare a Lock
variable with the name lock
. What would you thing about the code below?
lock (lock) result += accumulator;
Doesn't it look strange?
@theodorzoulias Well, if this Lock type could get a special treatment and compiler would allow 'lock' as variable name I would rather see something like
lock { result += accumulator }
Or
using lock; //if Lock is IDisposable
result += accumulator
Otherwise I would rather just call it accumulatorLock, accLock or writeLock Something like that.
As for your concrete example I guess it leans to the side of vars, asyncs and awaits with all that contextual keyword complexity. I would rather let keyword be keywords.
@En3Tho seeing the using lock;
, which btw reminds me of Scott Hanselman's using (await mutex.LockAsync())
, makes me wonder if anyone has suggested adding in the C# a lock
declaration syntax similar to C# 8's using declaration syntax: lock locker; /* Unlock at the end of the current scope */
. I guess not, because it's generally desirable to release a lock as early as possible.
The uses of the Lock class in CoreLib pretty much all don't use the lock directly but instead use a holder:
The pattern is:
using (LockHolder.Hold(_lock))
{
}
Before the introduction of LockHolder
, the pattern was more clumsy:
_lock.Acquire();
try
{
}
finally
{
_lock.Release();
}
So maybe the API proposal should come with a wrapper as well.
using lock;
feels odd because we're not disposing the lock - it would still be usable after the dispose.
So maybe the API proposal should come with a wrapper as well.
If the new lock type is integrated with the lock
keyword it could work similarly to a holder. The compiler could generate similar code when it knows the type. Would there be other benefits to offer a holder type?
locker, gate, sync, xxxLock etc
MutexSlim
may be another option, as there are already other 'slim' types of synchronization objects, though I'm not against Lock
if the new lock type is integrated with the lock keyword it could work similarly to a holder
Rather than teach lock about this one additional type, I'd want us to consider a route where it's pattern-based, just like other C# keywords. Then we'd expose the right surface area on all of our relevant types to integrate via that pattern, e.g. SemaphoreSlim, Mutex, etc.
MutexSlim may be another option, as there are already other 'slim' types of synchronization objects, though I'm not against Lock
I'd prefer just Lock or something else similarly simple. If this is to be the recommended way to write new code that locks, we should choose the simplest/most obvious name possible.
(Based purely on the proposed surface area, though, there will continue to be uses for locking on arbitrary objects, not just for reasons of using some already allocated object (which is very common in our higher-performance code), but also for situations where Monitor is used for its eventing support.)
I'd want us to consider a route where it's pattern-based, just like other C# keywords. Then we'd expose the right surface area on all of our relevant types to integrate via that pattern, e.g. SemaphoreSlim, Mutex, etc.
It would be a breaking change. For example, lock (semaphoreSlim) { ... }
works today. Pattern-based would change what it does.
Pattern-based would change what it does.
If for the existing classes we changed the type itself. But if we're talking anyway about adding AsHolder-like methods, the returned type could have the pattern, and I think there's something to be said for using lock instead of using
.
For existing structs, like SpinLock, you'd be horribly broken if using it with lock today, and we could take that "breaking change".
For new lock types, there's no breaking change.
I think we should consider it even if we ultimately decide against it. The biggest and real downside is a difference in behavior based on whether the pattern is exposed.
It would be a breaking change. For example,
lock (semaphoreSlim) { ... }
works today. Pattern-based would change what it does.
What if the way to consume in the language was something like using lock (patternedLockObject) { ... }
? That's something that shouldn't break any existing usages of lock
, does it?
I think we should consider it even if we ultimately decide against it. The biggest and real downside is a difference in behavior based on whether the pattern is exposed.
I agree that we should consider this. FWIW, pattern-based Dispose had similar problem and we have decided to limit it to ref-structs only to avoid the potentially breaking behavior changes in existing code.
I am not sure lock (semaphoreSlim) { ... }
would be a good idea.
lock
makes sense only with critical sections or mutex-like primitives where access to something is owned by a thread. I can think only of SpinLock
and Lock
.
Other synchronization types like Semaphores, Events, EventHandles, do not seem to be good candidates for lock
pattern, since there is no "ownership" and waiter and releaser are often different threads.
lock makes sense only with critical sections or mutex-like primitives where access to something is owned by a thread.
Not really; it doesn't need to be owned by a thread, just have enter/exit semantics. For a semaphore, it's very common to use it as a gate for allowing a limited number of logical threads of execution in a protected region at any one time, with a (non-recursive) mutex just being the special-case of that limited number == 1. So it's quite common to see code like:
semaphore.Wait();
try
{
... // region where only N operations are allowed at a time
}
finally
{
sempahore.Release();
}
In fact, SemaphoreSlim
is our go-to lock right now for async, using that same pattern, e.g.
private readonly SemaphoreSlim _lock = new SemaphoreSlim(1, 1);
...
await _lock.WaitAsync();
try
{
... // protected region
}
finally
{
_lock.Release();
}
Binary semaphore can be used as a mutex. That may not be enough reason to make lock
statement an extension point.
Or maybe it is...
lock
that allows N threads into the critical section may be a bit unusual.
I would definitely prefer being able to do lock(_myLock){...}
with instances of Lock
If that does not work out, I think something like using(_myLock.Access){...}
would be ok too.
I think we should consider it even if we ultimately decide against it. The biggest and real downside is a difference in behavior based on whether the pattern is exposed.
Agree. I think we could speculate on ways to make the break less impactful. Also this is a common theme that comes up with improvements we want. Maybe it will help us find a way to work past other scenarios where potential breaks are a significant reason why we can't make progress on an idea.
For instance the idea of letting foreach (var x in col)
auto-lower to for (int i = 0; i < col.Count; i++)
for collections that implement certain patterns. Will likely improve perf, reduce cost on collection developers, etc ... It works great in vast majority of cases but becomes a break in certain corners like List<T>
where there are modifications during enumerations. Seems similar to the problem we're having here. How can we get the intended behavior for the majority of cases while finding a way to not break existing behaviors people depend on.
I agree the pattern based approach is the right direction for the reasons @stephentoub mentioned. Furthermore I think that pattern will end up kicking in if either C# 12 or higher is enabled or you're targeting net8.0
or higher. This is similar to how new ref
rules took affect when we did ref
fields. Once you move to a runtime that is featuring them, then there is no way to disable it. That means that lock()
on instances of Lock
always works the same way. Don't have to worry about LangVersion
changing behavior.
From there though I think it's about whether the pattern is:
lock
pattern. Objects that meet the pattern but don't want to participate have some opt out mechanism (like adding an attribute)Feel like (1) would lead to a number of unintentional breaks. Seems like inevitably we'd find customers that defined the lock pattern we chose on objects they also put into a lock
.
Feel like (2) is safer because it would be limited to types that explicitly want to opt into that pattern. Yes there are probably a few customers who wrote a lock (semaphoreSlim)
that would see a behavior change. But my intuition is that would be lower than (1). There are probably only a handful of types we'd want in this pattern so the cost of opt-in via attribute would be very manageable
That means that lock() on instances of Lock always works the same way.
That may require to do something about casts like (object)_myLock
. Perhaps make it a warning?
Regarding something like this where locks may be passed around as objects:
private static readonly Lock _lock = new Lock();
private static void Foo()
{
ThreadPool.QueueUserWorkItem(locker =>
{
lock (locker)
{
// ...
}
}, _lock);
}
In a pattern-based approach that only the compiler knows about, the above would use Monitor
and not Lock
, and it would not be sufficient to just change the type of the lock object to Lock
to get the expected behavior. In order to get consistent behavior, the entry point for lock(obj)
where obj
is an object
would also have to know about all the class types involved with special behaviors (and check for them at run-time). If the discrepancy in cases like above is acceptable, then that would also be great because we wouldn't have to add any overhead to the current cases where lock
is used with an object
. Thoughts?
That may require to do something about casts like (object)_myLock
I'm 50/50 on this.
Consider that with all breaking changes we need to give customers an out. A way of getting the semantics they had before the breaking change came into play. For example some customer who took a bet on lock(sempahoreSlim)
. Need to provide a simple, reliable way that gives them the old semantics. Assuming we take the pattern based approach the easiest way is lock((object)semaphoreSlim)
.
For the Lock
type specifically there is no back compat concern. It's a new scenario. Seems feasible to warn there.
public override int GetHashCode() => throw new NotSupportedException();
Is it necessary for GetHashCode()
to throw?
Should we add the following overload in addition to the one that takes TimeSpan
?
namespace System.Threading
{
public sealed class Lock
{
public bool TryEnter(int millisecondsTimeout);
}
}
The TimeSpan
overload would convert it to an int millisecond value anyway, and it may slightly ease transitioning from Monitor
to Lock
.
Similarly:
public bool IsEntered();
public override int GetHashCode() => throw new NotSupportedException();
Is it necessary for
GetHashCode()
to throw?
Originally I was just envisioning it as a direct inheritor of object
so not allowing hashcode would mean it remained a fast lock via object header
Originally I was just envisioning it as a direct inheritor of
object
so not allowing hashcode would mean it remained a fast lock via object header
I think Lock
should have its own separate implementation that wouldn't use sync blocks at all.
Here's a proposal that was discussed offline for a pattern-based approach that would enable types to define custom behaviors for entering and exiting a lock when the type is used with the lock
keyword.
Example:
class Lock
{
public LockHolder EnterLockWithHolder();
public struct LockHolder
{
public void Dispose();
}
}
Requirements for matching the pattern:
Lock
in this example) has an accessible instance method EnterLockWithHolder()
. Extension methods don't qualify for the pattern.EnterLockWithHolder()
returns a type (LockHolder
in this example) that has an accessible instance method void Dispose()
lock
keyword is determined by looking at the type of the expression, such that in the example lock(_lock)
would match the pattern but lock((object)_lock)
would notThe pattern would also enable structs like SpinLock
to opt in with lock(_spinLock)
.
A few alternatives were considered, this seemed like a reasonable approach.
With that, the Lock
type proposal may look like the edited proposal in the OP (added a few methods and notes).
I figure the remaining questions could be discussed in the API review, so marking as ready for review for now. Let me know if there are any objections.
Background and Motivation
Locking on any class has overhead from the dual role of the syncblock as both lock field and hashcode et al. (e.g. https://github.com/dotnet/runtime/issues/34800)
Adding a first class lock type that didn't allow alternative uses and only acted as a lock would allow for a simpler and faster lock as well as be less ambiguous on type and purpose in source code.
API proposal
[Edit] by @kouvel based on https://github.com/dotnet/runtime/issues/34812#issuecomment-1485837123 and https://github.com/dotnet/csharplang/issues/7104
API Usage
Change in usage
Becomes the clearer
[Edit] by @kouvel
Usage example 1
Usage example 2
Usage example 3
After the
lock
statement integration described in https://github.com/dotnet/runtime/issues/34812#issuecomment-1485837123, the following would useLock.EnterScope
andScope.Dispose
instead of usingMonitor
.TryEnterScope usage example
The following:
Could be slightly simplified to:
The latter could also help to make the exit code path less expensive by avoiding a second thread-static lookup for the current thread ID.
Alternative Designs
Scope
could be a regular struct instead of aref struct
. Aref struct
helps to make the exit code path less expensive and can improve performance in some cases, as it guarantees thatScope.Dispose
is called on the same thread asEnterScope
, and the path that exits the lock would be able to avoid an extra thread-static lookup for the current thread ID. Similarly forTryScope
.TryEnterScope
methods, an alternative is to return a nullable struct. It seems more readable to check forWasEntered
than to check for null or theHasValue
property to see if the lock was entered. Also aref struct
can't be nullable currently, and a regular struct would not benefit from a potentially less expensive exit path.IsHeld
instead ofIsHeldByCurrentThread
. The former is simpler, though may be a bit ambiguous, and may be confusing againstSpinLock.IsHeld
, where it meansIsHeldByAnyThread
.Lock
type a struct instead. This may save some allocations in favor of increasing the size of the containing type. Structs can be a bit more cumbersome to work with, and the benefits may not be large enough to be worthwhile.Risks