dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.08k stars 4.69k forks source link

Erroneous linker error when using DAM attributes + primary constructors #101861

Open captainsafia opened 5 months ago

captainsafia commented 5 months ago

Description

The linker does not respect DynamicallyAccessedMembers attributes on parameters within a primary constructor.

internal sealed class TypeBasedOpenApiDocumentTransformer([DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] Type transformerType) : IOpenApiDocumentTransformer
{
    private readonly ObjectFactory _transformerFactory = ActivatorUtilities.CreateFactory(transformerType, []);
}

Note: this issue appears to repro specifically on the linker configuration used by ASP.NET Core's LinkabilityChecker tool. A more simplified repro on a console app doesn't share the same issue.

Reproduction Steps

  1. git clone --recursive https://github.com/dotnet/aspnetcore
  2. git checkout origin/safia/linker-primconstructors-bug
  3. ./restore.sh
  4. cd src/Tools/LinkabilityChecker
  5. dotnet run
  6. Observe linker exception.

Expected behavior

No linker exception is raised.

Actual behavior

Trim analysis warning IL2077: Microsoft.AspNetCore.OpenApi.TypeBasedOpenApiDocumentTransformer.TypeBasedOpenApiDocumentTransformer(Type):
 'instanceType' argument does not satisfy 'DynamicallyAccessedMemberTypes.PublicConstructors' in call to 
'Microsoft.Extensions.DependencyInjection.ActivatorUtilities.CreateFactory(Type, Type[])'. The field 
'Microsoft.AspNetCore.OpenApi.TypeBasedOpenApiDocumentTransformer.<transformerType>P' does not have matching 
annotations. The source value must declare at least the same requirements as those declared on the target location it is 
assigned to.

Regression?

No response

Known Workarounds

Explicitly defining a transformerType field with the DAM annotations resolves the issue.

Configuration

$ dotnet --info                                                                                                                                     11:38:54
.NET SDK:
 Version:           9.0.100-preview.5.24229.2
 Commit:            d301a122c4
 Workload version:  9.0.100-manifests.90bb8117
 MSBuild version:   17.11.0-preview-24222-11+9cdb3615a

Other information

No response

dotnet-policy-service[bot] commented 5 months ago

Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas See info in area-owners.md if you want to be subscribed.

sbomer commented 5 months ago

Simple repro:

using System;
using System.Diagnostics.CodeAnalysis;

new C(typeof(int)).UseField();

class C([DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] Type t) {
    public void UseField() => Require(t); // warning IL2077

    static void Require([DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] Type t) {}
}

This happens when the primary constructor parameter is turned into a field (for example, when it's accessed from an instance method). The field doesn't carry the annotation from the constructor parameter.

It feels like another case similar to https://github.com/dotnet/roslyn/issues/46646, but the compiler can't assume that the attribute supports AttributeTargets.Field. We may be able to figure out the mapping back to the constructor parameter from the compiler-generated field name, which is '<t>P' in this example.

@agocke @MichalStrehovsky

MichalStrehovsky commented 5 months ago

First of all, I'm surprised that C# doesn't allow this to compile:

class C([field: DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] Type t)
{
}

Unless we get a feature where Roslyn would "flow" these annotations for us in some way to backing fields, the best thing we could do is to allow people to do all this manually:

class C([param, field: DynamicallyAccessedMembers(DynamicallyAccessedMemberTypes.PublicConstructors)] Type t)
{
}

Allowing the param, field part would need new syntax, but is probably a lot cheaper than whatever feature to flow this to compiler-generated members by Roslyn. I don't know how realistic that request really is.

I'm not a fan of reverse engineering Roslyn's mangling for this. I was not a huge fan for lambdas/async state machines either. I don't know if we really want to be on a threadmill to reverse engineer manglings for all the new things C# introduces in the coming years (and whenever the manglings change, supporting the old version of the mangling forever, with the testability nightmares it comes with).

CyrusNajmabadi commented 5 months ago

It's an implementation detail if there is a field at all. And the language is very careful to not make it a requirement that a field be used to back a primary constructor.

MichalStrehovsky commented 5 months ago

It's an implementation detail if there is a field at all. And the language is very careful to not make it a requirement that a field be used to back a primary constructor.

If that's the case, it's not fixable on the trimming analysis side - unless some things around this become contractual. Not being able to use primary constructors when dataflow annotations are needed doesn't seem like such a huge loss IMHO.

agocke commented 5 months ago

I agree with Michal, we should try to move forward with discussions on how to lift the attributes in compiler generated code.

@CyrusNajmabadi this isn’t really a language question. The core problem is that IL carries extra semantics when doing trimming. In this case the compiler is unaware of those semantics, but that breaks an end to end user feature. We need to find some way to indicate to the compiler that these attributes are important at the IL level, similar to MethodImpl or DllImport

jaredpar commented 5 months ago

I agree with Michal, we should try to move forward with discussions on how to lift the attributes in compiler generated code.

Assuming that "lift" here means getting the attributes from the declaration in source to the actual member in metadata. Assuming that ...

I've previously outlined the state of the compiler and not much has changed there. To make progress on this I think we need essentially two items:

  1. Get LDM / compiler team to move how attributes flow out of implementation detail status and to a more rigorously defined feature. I'm hesitant to elevate it to fully supported at say the level of language syntax. Doing so would essentially constrict our ability to evolve code generation. I think we can find a middle ground of guaranteeing where / how attributes flow for a particular version of our code gen. The expectation being that consumers may need to change if we change how code is emitted.
  2. Get a design for how this would work. The compiler is going to flow every single attribute into generated code. Need to have a way to define which attributes are important, which locations need the flow, etc ...

I don't know if we really want to be on a threadmill to reverse engineer manglings for all the new things C# introduces in the coming years (and whenever the manglings change, supporting the old version of the mangling forever, with the testability nightmares it comes with).

I also feel like we're on a bit of a treadmill here. This has come up several times and each time I've pushed it back to the runtime team for help on outlining scenarios, finding a general solution, etc ... This is something the compiler team can't just make up by ourselves. But that is where the conversation generally stops cause we don't get a proposal / scenarios sent back to think through. If you want this to change then the next step would be to get a more comphrehensive set of scenarios written down that we can work through together.

agocke commented 5 months ago

I also feel like we're on a bit of a treadmill here. This has come up several times and each time I've pushed it back to the runtime team for help on outlining scenarios, finding a general solution, etc

This is on me. I deprioritized this work because it wouldn't really help with old binaries, which we will have to consume in perpetuity. I didn't anticipate that there would be more features that behave like lambdas and lift variables, and the benefit would primarily be for new features.

To be clear, I'm not blaming anyone for the current state of things (except myself). The current ambiguity is exactly where I intended it to be. I'm proposing re-evaluating the priority of this project (on my side as well) and giving a heads-up to partners.

CyrusNajmabadi commented 5 months ago

@CyrusNajmabadi this isn’t really a language question. The core problem is that IL carries extra semantics when doing trimming. In this case the compiler is unaware of those semantics, but that breaks an end to end user feature. We need to find some way to indicate to the compiler that these attributes are important at the IL level, similar to MethodImpl or DllImport

My general concern though is that this locks the compiler into a particular emit strategy afaict. It would be codifying that primary constructor parameters must be fields, right? Perhaps that is ok. But i remember us working very hard to ensure that that was not something you could/should ever expect a guarantee about.

I get the concern here about trimming. I'm just trying to reconcile the desire between:

  1. you cannot be guaranteed to get a particular impl, and
  2. you need to be guaranteed to get a particular impl that trimming can depend on.
agocke commented 5 months ago

It would be codifying that primary constructor parameters must be fields, right?

Nope. It would require that if a field is generated, that field must propagate the trimming attributes from the parameter. Effectively, this is saying that the attributes are a part of the type. They must be preserved through compiler transformations to ensure correctness. It's not really different from ensuring that, if a type parameter is alpha-renamed, the constraints must be preserved.

jaredpar commented 5 months ago

The problems I'm seeing from these discussions is essentially two fold:

  1. The compiler does not always preserve attributes in source when those constructs are translated to metadata. This is problematic when the source construct is part of a compiler feature that does significant rewriting: closures, state machines, primary ctors, etc ...
  2. The compiler makes no guarantees to the metadata representation of certain features: primary ctors, closures, etc ... That leads to the trimming team reverse engineering our strategy.

Nope. It would require that if a field is generated, that field must propagate the trimming attributes from the parameter.

I don't think this is quite right. There is a bit of an implicit assumption in that statement the way primary ctors are implemented are fields. The spec though doesn't require that. We could for horror sake implement primary ctors using a ConditiontalWeakTable. At which point the trimmers reverse engineering efforts are likely .. harder.

My general concern though is that this locks the compiler into a particular emit strategy afaict.

I tend to think more along these lines. Like it or not there is a bit of a contract between the teams on how emit works today. Compiler can say it's an implementation detail but I also know if we changed how lambdas were emitted it would have an impact on the trimmer.

I don't want to get into a place where we're rigidly defining how the compiler emits metadata here. Essentially anything where it would prevent us from exploring new emit strategies. Think for the particular cases here though we can find some middle ground to work with.

agocke commented 5 months ago

Think for the particular cases here though we can find some middle ground to work with.

Yeah, that's right. I think the story is kind of in-between. Trimming doesn't require one particular IL representation, but it does only support a limited set of options today. So the compiler can choose some options and not others.

At the same time, this doesn't really concern me from a high-level perspective. To put on my former Roslyn hat, as a compiler front-end, Roslyn is always going to be limited by what the backend supports/requires. Producing a cohesive cross-cutting feature will require adjustment on both sides.

In fact, I think this is the realization of a lot of what we've been working towards: the capability to evolve the compiler and runtime over time. For the first few releases this mostly consisted of the language designing a feature first, and the runtime evolving support for that feature. This is the other way around: the runtime designing a feature and the compiler evolving support for that feature. As long as we keep discussion going, I think this is the ideal way for us to do bottom-up product development.