dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.92k stars 4.02k forks source link

[Proposal] add supersede modifier to enable more tool generated code scenarios. #5292

Closed mattwar closed 7 years ago

mattwar commented 9 years ago

Some C# language features exist primarily for tooling scenarios. Designers, build time tools and other code generators produce source code that compiles together with and augments user written code. Partial classes and partial method are two of these.

With partial classes, a tool can generate source files that make additional declarations for a class, adding members, interfaces and attributes on top of what the user wrote.

Partial methods allow tools to declare signatures for methods that don’t actually exist, but that can be supplied elsewhere by the user. In this way the tool is free to generate code invoking this partial method without requiring the method to have been fully supplied. They are limited to private void returning methods so that they can vanish during compilation if the user has not supplied a full declaration with a body for the partial method.

In this way, partial methods are a clever trick enabling tool generated code to interact with user written code in more than just the simple additive way that partial classes provide. However, it is a very narrow trick. It allows the user to enhance the behavior of the tool generated code at well-defined spots by providing method implementations, but it does not allow the tool generated code to enhance the behavior of the user written code.

For example, imagine I wanted a tool to generate code that adds entry and exit logging to my user written code (without actually changing my source files or post processing the binary.) I could pervert the use of partial methods to approximate this, by inverting how they are normally used. I could pre-establish an agreement between my code and the tool (probably established by the tool), and liberally sprinkle invocations to logging methods throughout my code base, and then let the tool generate the bodies.

// user written code
public partial class MyClass
{
    private partial void log_entry(string name);
    private partial void log_exit(string name);

     public void Method1() 
     {
          log_entry(“Method1”); 
          // something totally awesome here
          log_exit(“Method1”);
     }
}

// tool generated code
public partial class MyClass
{
    private partial void log_entry(string name)
    {
         Console.WriteLine(“entered {0}”, name);
    }
    private partial void log_exit(string name);
    {
        Console.WriteLIne(“exited {0}”, name);
    }
}

Then I could choose to run the tool or not, so that the calls to log_entry and log_exit either had an implementation at runtime or not.

But I would have to manually place calls to these methods everywhere throughout my code. And I’d have to do this over and over again for each additional tool that needed to interact with my code. A very tedious prospect, and one better suited for tools generating code than my tired fingers.

A better solution would be one that allowed the generated code to enhance the user written code directly, by replacing it or overriding it, and placing all the burden of declaration into the tool generated code.

Proposal: Superseding members

The supersedes feature enables just this. It allows class members to override members defined in the same class, by superseding their declaration with a new declaration using the supersede declaration modifier.

// user written code
public partial class MyClass
{
      public void Method1()
      {
          // something totally awesome here
      }
}

// tool generated code
public partial class MyClass
{
      public supersede void Method1()
      {
           Console.WriteLine(“entered Method1”);
           superseded();
           Consolel.WriteLine(“exited Method1”);
      }
}

In this scenario the tool generated Method1 supersedes the user written Method1. All invocations are directed to the tool generated method. The tool generated implementation can then invoke the user written method using the superseded keyword that points at the member that was superseded by this declaration.

Using this technique, I could have a tool generate my boilerplate INotifyPropertyChanged implementations.

// user written code
public partial class MyClass
{
     public int Property { get; set; }
}

// tool generated code
public partial class MyClass  : INotifyPropertyChanged
{
     public event PropertyChangedEventHandler PropertyChanged;

     public supersede int Property
     {
          get { return superseded; }
          set
          {
              if (superseded != value)
              {
                    superseded = value;
                    var ev = this.PropertyChanged;
                    if (ev != null) ev(this, new PropertyChangedEventArgs(“Property”));
              }
          }
     } 
}

Multiple Generators

Of course, since the primary scenario for this feature is tool generated code and there is likely going to be times where there is more than one tool wanting to supersede a user written member, some mechanism should exist that allows one bit of tool generated code to supersede another bit of tool generated code without requiring them to know about each other.

Proposal: superseding superseded members

We can allow more than one superseded member declaration to exist at the same time. One of the superseding members directly supersedes the original member. Another superseding member then supersedes the prior superseding member, and so on. Thus a chain of superseding members exists similar to how multiple overrides might exist on the same member in an inheritance hierarchy. In fact, it is somewhat helpful to think of superseding as directly comparable to overriding. If you imagine a series of overrides of a member in a vertical inheritance hierarchy, you could image superseding as kind of a horizontal overriding (between partial class declarations).

// user written code
public partial class MyClass
{
      public void Method1()
      {
          // something totally awesome here
      }
}

// tool #1 generated code
public partial class MyClass
{
      public supersede void Method1()
      {
           // do something interesting
           superseded();
      }
}

// tool #2 generated code
public partial class MyClass
{
      public supersede void Method1()
      {
           // do something else interesting
           superseded();
      }
}

Order of supersession

The only problem we have is determine which superseding member is the first, the next and so on. The order is going to matter, because it determines which gets called first and that can affect behavior of the entire composition.

Unlike member overrides that occur in derived classes that specifically call out the names of the classes they are deriving from, the supersede syntax doesn’t directly identify the precise member that is being superseded. And that’s a good thing, because if it did, then code generating tools would absolutely need to know about each other, so they could reference each other’s declarations.

Instead, we need a different technique (other than naming) to identify ordering between the superseded members.

Proposal: lexical order

We can use lexical order to determine the order in which members are superseded. (For multiple source files, consider the files concatenated in the order supplied to the compiler.)

  1. The member without the supersede modifier is the base implementation.
  2. The superseding member that is earliest in lexical order is the first to supersede the base.
  3. The superseding member that is next in lexical order supersedes the previous superseding member, and so on.

In this way, no additional syntax needs to be invented to support having multiple superseding members for a single base member. The natural order of their declaration determine the order for superseding.

(This does not consider how the generated source files are given their order.)

HaloFour commented 8 years ago

@IanRingrose As the entire point of this feature is to provide AOP/type-provider functionality to automatically perform a lot of what would be boilerplate requiring new boilerplate to even allow the boilerplate would just be a lot of unnecessary noise. The developer is already opting into using said code generator by adding it to their project and following its conventions for decorating the target classes.

IanRingrose commented 8 years ago

Then way require the class to be marked partial at all?

It may be best just to require an Attribute to control AOP, with the Attribute enabling AOP on all code at or below the level it is put on.

The workflow of compile, run code generator, recompile (with generated code) also need considering. This has to work with ALL C# compilers not just roslyn. The code generator also must be able to be written in a way that works with all C# compilers.

HaloFour commented 8 years ago

@IanRingrose

Indeed, it could be argued that requiring the partial modifier on the class is also extraneous, which is why #6953 exists.

It may be best just to require an Attribute to control AOP, with the Attribute enabling AOP on all code at or below the level it is put on.

I expect most generators will rely on attributes. Requiring another attribute seems, again, like useless boilerplate. As a developer I have the choice to simply not add the generator to my project.

The workflow of compile, run code generator, recompile (with generated code) also need considering.

IIRC the generator runs as a part of the compilation? It should be a single operation.

This has to work with ALL C# compilers not just roslyn. The code generator also must be able to be written in a way that works with all C# compilers.

Sure, as long as all other C# compilers fork from Roslyn then that will be possible. Beyond that, it's simply a ludicrous expectation given that generators, like analyzers, will definitely rely heavily on Roslyn APIs and ASTs.

IanRingrose commented 8 years ago

Then just define a Roslyn API to replace a method, without any need for adding a new keyword to C# at all. (Require that the Roslyn compiler MUST ALWAYS output a information line if a "add in" modifies its output. Think of a virus that spreads by using AOP....)

HaloFour commented 8 years ago

@IanRingrose

There isn't an API to "replace" the method, that's just source emitted to a partial class. You can type that by hand if you wanted to. The Roslyn API, and the guts of the generator, is the analyzer which scans the code to figure out if/what additional source compilation units to inject into the current compilation.

HaloFour commented 8 years ago

@IanRingrose

Let me back that up. There is a Roslyn API, as described here:

https://github.com/dotnet/roslyn/blob/features/generators/docs/features/generators.md

That API very intentionally just allows the generator to inject additional source into the current compilation. That source may decorate existing members through the use of replace but that's only one possible scenario.

I expect that Roslyn will output details regarding the generators that were included as a part of the compilation, and tools will support seeing the generated source.

danm36 commented 8 years ago

With regards to order of superseding when multiple replacements are used. One potential method to help alleviate this could be an attribute that hints to the compiler desired order, so something like:

//User code
public void DoSomething(int x, int y)
{
    //Some code
}

//Tool code 1
[ReplaceOrder(EReplaceOrder.Early)] //Replaced as soon as possible
public replace void DoSomething(int x, int y)
{
    logger.log("DoSomething() Started");
    x *= 2;
    original(x, y);
    logger.log("DoSomething() Ended");
}

//Tool code 2
[ReplaceOrder(EReplaceOrder.Late)] //Replaced as late as possible
public replace void DoSomething(int x, int y)
{
    BeginProfiling();
    original(x, y);
    EndProfiling();
}

//Resulting code would be
public void DoSomething(int x, int y)
{
    BeginProfiling();
    logger.log("DoSomething() Started");
    x *= 2;
    //Some code
    logger.log("DoSomething() Ended");
    EndProfiling();
}

Then, any methods that share the same ReplaceOrder value - however it's implemented - are evaluated and compiled in the usual lexical order. This would act purely as a hinting system so the methods that critically must wrap around all content can request that they are the last (Or nearly the last) method to be considered with replacement, while methods that perform any critical last second processing can request that they are considered for replacement first or at least as early as possible. Methods without the attribute would simply be considered 'Normal' and compile between Early and Late.

Being an attribute would negate the need to add more key words and if a compiler - for some reason - implemented superseding but not the attribute, the code could still compile fine anyway assuming the user adds a blank placeholder attribute. My use of an enum is just for example, although I feel it might be better than a numeric ordering system as multiple tools could start fighting for the highest or lowest number so they can ensure their own order (Something zindex in CSS often shows). Using an enum, they can't try to force a specific order with extremely large numbers, but they can try to get an order that suits them best.

MgSam commented 8 years ago

@danm36 You need a lot more fine grained control than just an enum that specifies early, late, etc. If you have 3 generators, you need to be able to easily order them, and insert a fourth into the order at a later time if necessary. I can't think of any better mechanism for this than real numbers.

TonyValenti commented 8 years ago

I've been thinking that The order should be something that is specified by the project. Just as references are per project, I think that if you order that generators and source code modifiers are run at should be specified at a project level as well.

On Tuesday, June 28, 2016, danm36 notifications@github.com wrote:

With regards to order of superseding when multiple tools are in use. One potential method to help alleviate this could be an attribute that hints to the compiler desired order, so something like:

//User code public void DoSomething(int x, int y) { //Some code }

//Tool code 1 [ReplaceOrder(EReplaceOrder.Early)] //Replaced as soon as possible public replace void DoSomething(int x, int y) { logger.log("DoSomething() Started"); x *= 2; original(x, y); logger.log("DoSomething() Ended"); }

//Tool code 2 [ReplaceOrder(EReplaceOrder.Late)] //Replaced as late as possible public replace void DoSomething(int x, int y) { BeginProfiling(); original(x, y); EndProfiling(); }

//Resulting code would be public void DoSomething(int x, int y) { BeginProfiling(); logger.log("DoSomething() Started"); x *= 2; //Some code logger.log("DoSomething() Ended"); EndProfiling(); }

Then, any methods that share the same ReplaceOrder value - however it's implemented - are evaluated and compiled in the usual lexical order. This would act purely as a hinting system so the methods that critically must wrap around all content can request that they are the last (Or nearly the last) method to be considered with replacement, while methods that perform any critical last second processing can request that they are considered for replacement first or at least as early as possible. Methods without the attribute would simply be considered 'Normal' and compile between Early and Late.

Being an attribute would negate the need to add more key words and if a compiler - for some reason - implemented superseding but not the attribute, the code could still compile fine anyway assuming the user adds a blank placeholder attribute. My use of an enum is just for example, although I feel it might be better than a numeric ordering system as multiple tools could start fighting for the highest or lowest number so they can ensure their own order (Something zindex in CSS often shows). Using an enum, they can't try to force a specific order with extremely large numbers, but they can try to get an order that suits them best.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dotnet/roslyn/issues/5292#issuecomment-229039421, or mute the thread https://github.com/notifications/unsubscribe/AM-qVkvs9_bon47LoK20gxoB5O5NSZv0ks5qQRjfgaJpZM4F-yoq .

Tony Valenti

IanRingrose commented 8 years ago

Given that project files often get messed up on code merges etc, it seem high risk to depend on them for something important.

danm36 commented 8 years ago

@MgSam Then perhaps support both, with Early == -5, Normal = 0 and Late == 5 (Or some other values with enough space between for integer replacements that are earlier than 'Late' or later than 'Early'), so that tools can choose to be loosely ordered via an enum or strictly ordered via an integer. Once ordered via this attribute, methods that share the same order value are ordered per their file name and location within the file (Or some other ordering algorithm).

IanRingrose commented 8 years ago

Once ordered via this attribute, methods that share the same order value should give an error. Anything else will make debugging too hard, as rename a file should not change the meaning of code.

danm36 commented 8 years ago

Instead of an error, perhaps just a warning (And only if they were ordered by an integer)? The initial use case for this was primarily for tool generated code, and - depending on the tool - it could be impossible to fix as the code and/or attribute could be regenerated by the tool on the next compile. Locking out a user from compiling their code because two of their code generators use the same order value seems like a bad move. With a warning, user written code can be flagged if there is an order conflict, but tool generated code an disable the warning for their section as the tool is aware that there could be conflicts with other tool generated code. Code without the order attribute, or using one of the 'loose' order enums would be exempt from the warning as they have indicated no need for strict replacement ordering.

HaloFour commented 8 years ago

I agree, it probably shouldn't matter if multiple replace members happened to be marked with the same priority. I assume that most generators won't care and would share the same "normal" priority, or just omit the attribute altogether, which should produce the same effect. The question is how to then ensure that the members are replaced in a deterministic order, which I would argue should be in the same order as the analyzers/generators are run, which should be in the same order as the analyzers/generators are referenced in the project.

mattwar commented 8 years ago

The intent is to have the generators run in the order they are specified to the compiler, against declarations in lexical order (with documents also ordered as they are specified to the compiler.)

The trouble is going to be communicating that order through the tools like VS solution explorer which orders documents by name, and the mechanism that adds documents to project files which may be arbitrary. Of course, by editing the project file by hand you can easily manipulate document and generator order.

TonyValenti commented 8 years ago

I think that, perhaps, it would be better if it was exposed as a tab in the project properties, not in the tree itself.

On Tuesday, June 28, 2016, Matt Warren notifications@github.com wrote:

The intent is to have the generators run in the order they are specified to the compiler, against declarations in lexical order (with documents also ordered as they are specified to the compiler.)

The trouble is going to be communicating that order through the tools like VS solution explorer which orders documents by name, and the mechanism that adds documents to project files which may be arbitrary. Of course, by editing the project file by hand you can easily manipulate document and generator order.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dotnet/roslyn/issues/5292#issuecomment-229117575, or mute the thread https://github.com/notifications/unsubscribe/AM-qVgmRLy63nNDr4CPAtd5Vc5PnKzW6ks5qQVcbgaJpZM4F-yoq .

Tony Valenti

MgSam commented 8 years ago

@mattwar I don't think that is a good solution. You shouldn't have two identical sets of code files that behave differently depending on the order of compilation. There's nothing that behaves like that currently in the language (as far as I'm aware).

Also, as someone writing code I shouldn't have to look beyond the file I'm editing to wonder what the behavior of the code will be. Except in a few corner cases (overflow checking) that's true today.

Is there a reason you guys don't want to explicitly specify the order using a number? It's worked well for PostSharp, it seems odd that you'd deviate significantly from an established, successful product that already adds this sort of capability to C#.

mattwar commented 8 years ago

@mgsam There are two different issue here. One is determinism, since order of application affects the output how does the compiler choose the order that the generators are run and the order of the declarations they are run against. The second is user ability to adjust that order. Having generators declare their order in some way is a means to let the user (or author in this case) adjust the order. But there has to be an order in the first place for there to be determinism.

maxtoroq commented 8 years ago

This is similar to import precedence in XSLT, except in XSLT source files are explicitly linked together via xsl:import, so the order problem doesn't exist.

The idea of using an attribute to define the order is not bad, but it should be at the partial class level, otherwise organization can get confusing and difficult to follow if different partials use different priorities for different members

Please don't use original, it has a different meaning in XSLT 3.0.

EricGarnier commented 7 years ago

Hi all, why not use an attribut on the generated side, to indicate which member is superseded? In this way, we don't have to add any new keyword in C#. And the original method and the one generated by the tool can have different name.

Something like this :

// user written code
public partial class MyClass
{
      public void Method1()
      {
          // something totally awesome here
      }
}

// tool generated code
public partial class MyClass
{
      [Supersede(nameof(Method1))]
      public void ToolGeneratedName()
      {
           Console.WriteLine(“entered Method1”);
           Method1();
           Consolel.WriteLine(“exited Method1”);
      }
}

The "Supersede" attribut signals to the compiler that this method will replace the original "Method1". Inside the method decorated with this attribut, the name "Method1" refers to the user written code, and outside, to the method generated by the tool.

We can have an order on the attribut to help the compiler sort out the order of superseding.

This can be implement either in the compiler or in an post compilation process.

vladd commented 7 years ago

@EricGarnier Well, the question is whether our goal is to avoid inclusion of new keywords. First of all, what are we going to do if Method1 is overloaded? Are we going to invent some special syntax for this case or ask the compiler to choose the method with the same signature? What is the method is an indexer or explicit interface implementation? Next. with the attribute, the reader just sees some unintelligible name, so we are kind of lying to the reader. What if ToolGeneratedName is actually used in the class and makes ambiguity? Should the parsing rules depend so strictly on just one attribute? Must the IDE and all the tools know that the generated name is bogus, and may be ignored when e.g. the user creates another function with the duplicated name but without the attribute? From my point of view your proposal contradicts the rule-of-thumb "attributes can be roughly ignored when one reads the code"; you propose a very deep change of the language expressive means just for avoiding a single new keyword.

pr-yemibedu commented 7 years ago

Decomposing the original to return new types or rearrange parameters would feel better through a limited macro system. For tooling, just getting extra code slices is enough. Here are keyword names I like. There probably should generator entries in the .xxproj file read (FIFO) and if not then msbuild should get an explicit listing of files for compilation. Attributes and source magic number ordering is too constraining.

// user written code
public partial class MyClass
{
      public augment string Property1 {get;} = "awesome";

      public augment void Method1()
      {
          // something totally awesome here
      }
}

// tool #1 generated code
public partial class MyClass
{
      public transform void Method1()
      {
           // do something interesting
           primary();
      }
}

// tool #2 generated code
public partial class MyClass
{
      public transform string Property1()
      { get =>{
           // do something else interesting
           return primary;}
      }
}
gafter commented 7 years ago

This proposal is now being tracked at https://github.com/dotnet/csharplang/issues/107