Open jkotas opened 3 years ago
More context https://github.com/dotnet/designs/pull/189#discussion_r594683843 cc @GrabYourPitchforks
The one feature that I really wish the analyzer supported is "allow copies to read-only locations". This configuration would allow non-mutating operations to run against snapshot copies, which broadens applicability of [NonCopyable]
from strictly non-copyable types to more general usage on mutable value types to avoid mutating a copy that will not be observed.
allow copies to read-only location
It would not work for ValueStringBuilder
and other similar cases where the struct holds manually managed resource.
The one feature that I really wish the analyzer supported is "allow copies to read-only locations"
That wouldn't solve the ValueStringBuilder problem, though. At least, not if it uses ArrayPool.
Even without ArrayPool, overwrites would be visible on the copy (writing through into arrays), but appends wouldn't (changing fields).
I don't doubt that readonly copies can make sense in some cases, but they wouldn't solve this one, and feel like they're a weird knifes-edge in general.
It would not work for
ValueStringBuilder
and other similar cases where the struct holds manually managed resource.
Note that my comment was more for wishing we could do this:
[NonCopyable(AllowCopyToReadOnlyLocation = true)]
Types like ValueStringBuilder
would use the default value false
, but many value types which only need to be non-copyable during an initial mutable phase could leverage this for improved usability.
On a totally separate note, this feature is going to be extremely difficult to implement without the compiler exposing some sort of an API specifically to analyze value copies. Even with the conservative approach used by RS0042 (fail on any unrecognized construct) we've found a significant number of holes.
CC. @jaredpar. Would this conflict with or be superseded by any work being done on the language side?
I'm not aware of LDM work on this, but will let @jaredpar provide a more definitive answer. I stopped pushing on this as a language feature primarily due to questions surrounding the value of https://github.com/dotnet/runtime/issues/50389#issuecomment-809662254 (the compiler would not be able to provide this value through built-in language features, but an analyzer could account for it as long as an analyzer API for value copies existed).
I believe the ideal handling here would be a general analyzer callback for any copy in IL (without any consideration for the type being copied), and the analyzer for this feature would filter as necessary.
Coulpe of questions:
NonCopyable
or ValueStringBuilder
that you want to enforce with this analyzer? Couple that seem to apply:
[NonCopyable] struct
as a field of standard struct
async
method or iteratorvsb
represents a [NonCopyable] struct
: SomeMethod(ref vsb, ref vsb)
. [NonCopyable]
types forced to be ref struct
or can they be simple struct
? If the latter then that implies tearing is okay (is that the case)? On a totally separate note, this feature is going to be extremely difficult to implement without the compiler exposing some sort of an API specifically to analyze value copies.
I agree. There are a lot of subtle cases where the compiler does or does not copy a value. It's possible to write conservative analyzers which flag more copies than exist but are unlikely to miss any. It's hard to write one that exactly matches the behavior of the compiler though.
It's also important to know if we care about invisible copies. There are several places where the compiler will introduce copies in synthesized members that are invisible to the customer but do constitute copies of the values (consider the places where the compiler introduces thunks to meet some metadata contract and hence has to pass values through the thunk). Do this matter for this feature? My assumption is no but wanted to check.
⚠️ This comment is written from the perspective of all the reference types I've been making NonCopyable, along with the RS0042 (Do not copy value) analyzer. It may on may not apply to ValueStringBuilder
.
Disallowed as generic arguments
This is still a known hole in RS0042
Disallow [NonCopyable] struct as a field of standard struct
This was one of the holes we recently fixed in RS0042 (https://github.com/dotnet/roslyn-analyzers/pull/4798). It's also one of the things that makes https://github.com/dotnet/runtime/issues/50389#issuecomment-809662254 particularly desirable.
Disallow as locals in an async method or iterator
This should be safe right?
etc ...
Boxing is the main one. Originally I wrote RS0042 to disallow boxing, but later I realized that boxing is a safe option (when it's moved to a box) and unboxing is the problematic operation. A boxed value type can be accessed using Unsafe.Unbox
or through a virtual call.
Generally non-copyable types end up desiring to have move semantics. Essentially copying is bad but move is okay. Do you all desire this property here and if so how do you all define what constitutes a "move". Consider that returning a local from a function is likely a safe "move" operation.
For all the cases I've encountered to date, implicit move semantics were desirable (i.e. a copy is only a problem if the location is read after that copy). If a type is not movable, there are many additional constraints (e.g. cannot store as a field of a reference type unless that reference type is pinned, and cannot capture a local into a reference type).
Are multiple mutable references okay here?
💭 This should be a safe operation.
Are [NonCopyable] types forced to be ref struct or can they be simple struct?
For the types I've been working with, making the types a ref struct
is extremely undesirable. This breaks the ability to use the types in state machines, including async code.
Also related: https://github.com/dotnet/csharplang/issues/2372
Disallowed as generic arguments
I imagine an exception could be designed here, similar to how people want an exception to passing ref struct
s as a type argument:
public void M<[NonCopyableSafe] T>(ref T val)
{
// compiler checks that all uses of T don't make "forbidden" copies,
// then callers are allowed to substitute a non-copyable struct type
}
Of course, this can always be done later and isn't strictly necessary for an MVP. I'm just throwing some 🍝.
Would this work for e.g. SpinLock
which might be ok to pass byref
but definitely don't want to copy it as then the (now) two become unsynchronised https://github.com/dotnet/runtime/blob/82c705102a34c36c5395eb8804109f54c9750d90/src/libraries/System.Private.CoreLib/src/System/Threading/SpinLock.cs#L51-L53
@333fred, how would this play with how you're defining builders for interpolated strings? Presumably some builders (e.g. ones wrapping ArrayPool buffers) would benefit from being annotated as non-copyable, but the design as proposed today actually requires them to be passed around.
Would this work for e.g. SpinLock
Yes. More examples of NonCopyable types in Roslyn: https://github.com/dotnet/roslyn/search?q=%5BNonCopyable%5D
@jaredpar I tried to draft a list of restrictions a while back (see https://gist.github.com/GrabYourPitchforks/7ab6a440100467a82cfe5998cd1e91be). I saw Jan linked to this gist in an earlier comment, but I don't know how much it influenced this proposal or whether there was any kind of value judgment made over it. When you and I talked about this a few weeks back you had pointed out some rough spots.
@333fred, how would this play with how you're defining builders for interpolated strings? Presumably some builders (e.g. ones wrapping ArrayPool buffers) would benefit from being annotated as non-copyable, but the design as proposed today actually requires them to be passed around.
The answer for that will depend on what LDM thinks about builders passed by in
or ref
, I guess.
The answer for that will depend on what LDM thinks about builders passed by in or ref, I guess.
Yes... we should have an opinion / proposal, though.
The answer for that will depend on what LDM thinks about builders passed by in or ref, I guess. Yes... we should have an opinion / proposal, though.
I actually think this is a good motivating example for why we need to think about "move" semantics.
Consider that in
is unlikely to be the answer here. The builders need to be mutated in the calling function, if for nothing else than to call ToString
. That is a mutating function for the builders, ValueStringBuilder
returns items to the array pool, hence it's unlikely ToString
will be readonly
. That means it won't be callable on in
unless we first copy which violates NonCopyable
.
Passing by ref
will work but the design specifically allows for multiple builders to be in play here. That means we're going to have cases where multiple builders are being passed by ref
and the builders are likely to be ref struct
. That gets into territory where it gets easy to run afowl of ref struct
lifetime rules.
The idea of "moving" the value, essentially passing it to the function such that the calling function's copy is destroyed (assigned default
) seems like it is the most friction free option.
The idea of "moving" the value, essentially passing it to the function such that the calling function's copy is destroyed (assigned
default
) seems like it is the most friction free option.
That would mean it couldn't be used for SpinLock
as that would also destroy its locking semantics just as much as copying would.
That would mean it couldn't be used for SpinLock
We may need multiple levels:
SpinLock
)ValueStringBuilder
)I have had a closely related issue. I sometimes internally use a mutable struct to implement "fast list", which is just a wrapper of array, from which references of the elements can be obtained. I remember once I marked such a struct with readonly. As a result, adding elements fails because it cannot modify the field to the newly allocated array in case of initialization and reallocation.
So in some cases, if not all, I would suggest to have a way to declare that a struct should not be used as readonly field. I guess many move-is-ok noncopyable structs should fit in this category.
This is more like a language feature instead of runtime feature.
@benaadams @jkotas Can you explain why move semantics would not work for SpinLock
? Looking at the implementation, it appears that move semantics would be fine.
@sharwell I guess it's because someone might be waiting for the lock, effectively storing a reference to the struct on stack.
@sharwell, if "moving" involves copying and zero'ing out the original, that a) is not atomic / thread-safe and b) still leaves the original field as a default value which is a valid spin lock that's now disassociated from the copy.
@stephentoub A move is defined as the relocation of a value from location A to a different location B, after which point no code will access location A expecting to observe the original value. Atomicity of the move is not guaranteed; code which accesses the location is expected to either provide synchronization for the move or intentionally operate on a value which can be moved atomically. Zeroing the original value is not required for a move to be treated as a move, as any read access to the original location after a move is a semantic error.
So how are you envisioning that would work with a SpinLock:
private SpinLock _lock;
What code do two threads write to successfully use that lock? Today it's:
_lock.Enter();
...
_lock.Exit();
What is it when SpinLock is attributed as being moveable?
It depends entirely on the manner in which the instance is moved. For the case of StrongBox<SpinLock>
(or any SpinLock
stored as a field of a reference type), there is no special handling needed for the case where the compacting GC moves the storage location for the boxed value.
SpinLock is rarely used that way. It's typically a field in and of itself, with threads accessing that field. So how does it work in that case?
Maybe I've misunderstood what you were proposing. The earlier comment was that SpinLock shouldn't be copyable/moveable, and you asked what would be the problem with it being moveable. I'm trying to understand how you envision it being moveable safely.
Non-copyable value types are types where the user's code is responsible for managing the storage location for the value. Likewise, moving a non-copyable value type is an operation implemented in the user's code, with all the caveats faced by the compacting GC for reference types. Application misbehavior following failure to perform a move with correct semantics does not mean a correctly-performed move operation would have the same result.
Put another way: do we have examples of situations where a user is likely to misuse a SpinLock
without a warning if we implement it as non-copyable but movable?
Sam and I spoke offline. The disconnect stems from two completely different definitions of moveable:
@stephentoub thoughts about whether we should aim to do work here in 7.0 (either analyzer or language feature)? Seems there's a fair bit of interest in ValueStringBuilder alone.
@stephentoub thoughts about whether we should aim to do work here in 7.0 (either analyzer or language feature)?
In theory it's a valuable thing to do and would unblock a set of features we've been hesitant to do, making them enough less of a footgun that we'd then move forward with them. In practice the initial work associated with this would be coming up with the actual rules it would implement, and then seeing which of the things we've avoided doing would now be practical/still valuable given those rules. As is my typical want, we'd need to be ready to use it in multiple places in the same release that we add it. That goes whether it's done as a C# language/compiler or analyzer feature. We'd also need to work through any compat concerns with applying the attribute to types we've already shipped, and what level of noise we'd be ok with on existing code.
I've been stung by this twice with SpinLock
First
public readonly SpinLock Locker = new();
Second
public static void ExecuteLocked(this SpinLock locker)
{
// Above should have `ref`
}
Background and Motivation
New high-performance APIs are often exposed as structs to avoid GC heap allocation. Callers of such APIs have to be careful to avoid creating accidental copies of the struct. Failure to do so can lead to correctness or security issues. We need a capability to prevent or detect this class of bugs at build time.
ValueStringBuilder
is an example of an API where creating accidental copy is a potential security bug: https://github.com/dotnet/runtime/issues/25587#issuecomment-662076796Existing implementations of similar analyzers:
Proposed API
Promote https://github.com/ufcpp/NonCopyableAnalyzer developed by @ufcpp into .NET platform API so that it can be used by the platform itself.
Usage Examples
Alternative Designs
Full C# language feature. The difficulty in doing so is described in https://blog.paranoidcoding.com/2019/12/02/borrowing.html .
Risks
Corner cases that are missed by the analyzer, e.g. use of reflection.