dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.55k stars 4.54k forks source link

Detect incorrect usage of 'in' modifier #30535

Open steveharter opened 4 years ago

steveharter commented 4 years ago

When using the in modifier on structs passed as method arguments, subsequent calls to the struct's members may cause a performance penalty because a defensive copy of the struct is made. This defensive copy is made when a mutation could occur, such as when calling a method, accessing a property, or setting a field.

For a non-mutable struct, to prevent the defensive copy the readonly modifier should be added to the struct itself and any fields and non-auto properties.

For a mutable struct, the readonly modifier should be added to the fields\properties\methods that do not mutate. However it may be safer to use the ref modifier instead of in in these cases because:

Based on these I assume the following best practices: 1) Do not mutate a struct in a method where it was passed with in. Instead remove the in and let the copy happen automatically up front or use the ref modifier if that is the desired behavior (see next line below). 2) Pass mutable structs with ref instead of in. If the struct was properly readonly-adorned then in could be used reliably given verification checks (analyzers or compiler warnings). 3) All structs should be adorned with readonly as much as possible so consumers can reliably use the in modifier without concern for perf due to the defensive copy. Note that the in modifier was recently added (C# 7.2) so previously existing structs are not likely adorned.

In summary, the in modifier will be commonly misused because it is not obvious that a defensive copy is done, or even allowed to occur. In addition, any members called that do mutate the struct are "silent" which will often be a bug. In order to properly use in one must:

Thoughts to address this so in can be reliably used on both mutable and immutable structs:

(for lack of a well-known home for this issue, it was created in corefx with area-infrastructure).

ViktorHofer commented 4 years ago

@stephentoub opinions? and do we have an analyzer for the correct usage of ref and in? (if flow analysis allows that)

stephentoub commented 4 years ago

opinions? and do we have an analyzer for the correct usage of ref and in? (if flow analysis allows that)

We do not. The only one I'm aware of that exists today is the one Steve linked to, as discussed here: https://github.com/dotnet/corefx/pull/40114#issuecomment-519580815

Gnbrkm41 commented 4 years ago

related: https://github.com/dotnet/corefx/issues/36586 (regarding adorning existing types with readonly)

ViktorHofer commented 4 years ago

I think adding an analyzer for the correct usage for the in keyword seems right. That said I'm hesitant to add a dependency to ErrorProne.NET (To me it looks like CI isn't working / turned on and the analyzers seem to have some issues on .NET Framework based on some opened issues).

@steveharter what do you suggest to do here? Create such a roslyn analyzer in https://github.com/dotnet/roslyn-analyzers? If so then I suggest we move the issue over.

tannergooding commented 4 years ago

CC. @jaredpar

Warning Waves would be nice, so the compiler can properly warn on this.

danmoseley commented 3 years ago

This seems not 5.0 work.

Sergio0694 commented 3 years ago

Are there any updates on the status of this issue, and/or does this have a chance of making it into .NET 6? The fact that the in modifier can do hidden copies that are very easy to miss is a really inconvenient aspect of that feature (especially because the whole point of in was that it would only be used in performance critical scenarios) and it has been discussed before already, but currently there are still no built-in solutions to remedy that. Re# offers partial support for this mostly when invoking impure methods in readonly fields, but that's not really the same, it doesn't support in parameters and most importantly this is such an important aspect for developers using in that it should really be built-in.

I should also add that on top of being a performance concern, there are scenarios where a hidden copy would actually be breaking. Consider this simple extension method I wrote for the ComPtr<T> type from @tannergooding's TerraFX library:

public static unsafe void** GetVoidAddressOf<T>(this in ComPtr<T> ptr)
    where T : unmanaged
{
    return (void**)Unsafe.AsPointer(ref Unsafe.AsRef(in ptr));
}

Which you'd use like so:

using ComPtr<IFoo> foo = default;

int result = bar->GetFoo(foo.GetVoidAddressOf());

// Check result and throw, and use foo here...

You can see how if the compiler did a shadow copy, the returned void** pointer would be pointing to the shadow copy value instead, which in this case would result in the code just throwing away the result value, foo remaining a null pointer, and also a memory leak for that returned IFoo COM object, that'd never be tracked.

This was just an example (and in this case that code in particular is fine), but I had to ask for confirmation and I had no way to just be sure by looking at the code, because we have no built-in analyzer for that today, so errors are very hard to spot.

I'd say it's more than likely there's plenty of developers out there with lots of (no breaking, but still) shadow copy in their libraries, that are just unaware of what is actually happening in their codegen. Since this feature is specifically an opt-in for developers that are seeking to make their code faster and reduce copies, I'd argue it makes sense to better support this scenario 🙂