dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

[Req] Public api for compile-time codegen? #2205

Closed: ig-sinicyn closed this issue 8 years ago

ig-sinicyn commented 9 years ago

Hi!

The metaprogramming feature has been discussed multiple times already (e.g. #98 and #2136), but all of those discussions have so far ended with a "too big to fit" resolution.

At the same time there are a lot of real-world codegen tasks, such as

Each of these uses its own code generation approach, but all of them work fine: nothing fails and no one gets hurt. So, let's start with the assumption that adding one more way to apply some build-time magic will not break anything.

Now, there are a lot of issues that are not related to C# syntax and are too specific to be supported directly by the C# compiler. To name a few: #1677 (support for INPC), dependency properties, #105 (easy Equals() implementation) and so on. It seems that allowing custom code rewriters to be plugged in is the only solution that can safely be done on the C# side.
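For instance, here is a minimal sketch of such a rewriter for the INPC case, built on the existing CSharpSyntaxRewriter API. The [NotifyPropertyChanged] attribute, the OnPropertyChanged helper and the backing-field naming are made up for illustration; only the plug-in point that would run this rewriter during the build is missing today.

using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Expands auto-properties marked with a (made-up) [NotifyPropertyChanged]
// attribute into INPC-raising properties. Declaring the backing field and the
// OnPropertyChanged method is left out for brevity.
class NotifyPropertyChangedRewriter : CSharpSyntaxRewriter
{
    public override SyntaxNode VisitPropertyDeclaration(PropertyDeclarationSyntax node)
    {
        var marked = node.AttributeLists
            .SelectMany(list => list.Attributes)
            .Any(a => a.Name.ToString().Contains("NotifyPropertyChanged"));
        if (!marked)
            return base.VisitPropertyDeclaration(node);

        var type = node.Type.ToString();
        var name = node.Identifier.Text;
        var field = "_" + char.ToLowerInvariant(name[0]) + name.Substring(1);

        // ParseMemberDeclaration gives back a full-fidelity MemberDeclarationSyntax.
        var expanded = SyntaxFactory.ParseMemberDeclaration($@"
public {type} {name}
{{
    get {{ return {field}; }}
    set
    {{
        {field} = value;
        OnPropertyChanged(nameof({name}));
    }}
}}");
        return expanded.WithTriviaFrom(node);
    }
}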

All of these issues fit perfectly within the following restrictions:

In summary:

  1. Most of the "metaprogramming, please!" requests can be covered with PostSharp-like code rewriters.
  2. The biggest part of the work is done already. The Roslyn API is here; there's just no easy way to plug in custom rewriters.
  3. Adding support for custom rewriters should not be a fundamental problem. Roslyn supports custom code analyzers already (see the sketch after this list), so why not allow rewriters too?
  4. There will be no versioning issues. Each assembly could use its own set of code rewriters, and replacing one implementation with another will not break dependent code (at least as long as rewriters do not mess with the assembly's public API).
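For reference, this is the shape of the analyzer plug-in model that exists today: a class discovered through an assembly-level attribute and driven by the compilation. The DEMO001 rule below is a placeholder; the point is only that a rewriter hook could plausibly reuse the same discovery mechanism, since no such hook exists in Roslyn today.

using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public class SampleAnalyzer : DiagnosticAnalyzer
{
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        id: "DEMO001",
        title: "Sample rule",
        messageFormat: "Sample message",
        category: "Usage",
        defaultSeverity: DiagnosticSeverity.Info,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        // Analyzers can only report diagnostics; they cannot change the tree.
        // That read-only restriction is exactly what this request asks to lift.
        context.RegisterSyntaxNodeAction(
            ctx => { /* inspect ctx.Node, report via ctx.ReportDiagnostic */ },
            SyntaxKind.PropertyDeclaration);
    }
}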

So, what have I missed? :)

AdamSpeight2008 commented 9 years ago

I think it should be easier to express the code, rather than building the syntax tree node by node. The compiler itself could be used to help (see #174).

ig-sinicyn commented 9 years ago

@AdamSpeight2008 Well, I'm not sure that adding macros is the right way to go.

In my opinion it is too big and controversial a feature to be fitted easily into C#. Also, the main benefit of macros - no need to construct the syntax tree manually - is perfectly covered by

// A property is a member declaration, not an expression, so ParseMemberDeclaration is the right entry point.
var stubCode = SyntaxFactory.ParseMemberDeclaration(@"
public StubType StubProperty
{
  get { return stubField; }
  set
  {
    stubField = value;
    OnPropertyChanged(StubPropertyName);
  }
}");

All you have to do is replace the stubs with actual nodes (a sketch of that step follows a few lines below). And if you're too lazy, it can be done using string interpolation:

// In an interpolated verbatim string the literal braces have to be escaped as {{ and }}.
var stubCode = SyntaxFactory.ParseMemberDeclaration($@"
public {typeName} {propertyName}
{{
  get {{ return {fieldName}; }}
  set
  {{
    {fieldName} = value;
    OnPropertyChanged(nameof({propertyName}));
  }}
}}");

That's all. If there are more real benefits to the macro approach, I'd be glad to hear about them.
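For completeness, here is a sketch of the "replace the stubs with actual nodes" step mentioned above. The stub-to-actual rename map is illustrative, and everything else is existing Roslyn API (ParseMemberDeclaration, DescendantTokens, ReplaceTokens):

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Parse the stubbed member once...
var stubCode = SyntaxFactory.ParseMemberDeclaration(@"
public StubType StubProperty
{
  get { return stubField; }
  set
  {
    stubField = value;
    OnPropertyChanged(StubPropertyName);
  }
}");

// ...then swap every stub identifier token for the actual name.
var renames = new Dictionary<string, string>
{
    ["StubType"] = "string",
    ["StubProperty"] = "Name",
    ["stubField"] = "_name",
    ["StubPropertyName"] = "NamePropertyName",
};

var generated = stubCode.ReplaceTokens(
    stubCode.DescendantTokens()
            .Where(t => t.IsKind(SyntaxKind.IdentifierToken) && renames.ContainsKey(t.Text)),
    (original, _) => SyntaxFactory.Identifier(renames[original.Text])
                                  .WithLeadingTrivia(original.LeadingTrivia)
                                  .WithTrailingTrivia(original.TrailingTrivia));

Console.WriteLine(generated.ToFullString());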

Now, here is what's wrong with macros:

  1. It's viral. Adding macros will significantly change the way code is written, like LINQ, await and generics did. So there will be no chance to fail, throw it out and redo it all over again. It's going to be an extremely hard task, and 'extremely' is an understatement here. Look at Nemerle, the 'macros-all-around' language. Despite a very good, clean and versatile design, in Nemerle 2.0 they had to redo the entire macro system from scratch. Now it is based on a PEG grammar, and it turns out that the N2.0 core is evolving into a DSL generator on steroids. Too big a change for a general-purpose language, I guess. Sadly, there's not much public information about Nemerle in English. As far as I know, the core of the Nemerle community is Russian-speaking developers and the Nemerle 2.0 design is at an early stage for now, so I'm afraid there is no detailed description in English.
  2. It's viral, part 2. With macros/templates it's too easy to introduce new template types and language constructs that have to be used in dependent assemblies too. So, welcome to version hell, collector's edition. Yes, I know, theoretically no one will do that. In practice, they will :(
  3. It's complex. Using rewriters guarantees that the minimal scope is a type member. With macros you can tweak anything, starting from a single expression. This means circular dependencies, the order of application matters, surprisingly odd side effects like this, and so on.
  4. It's too easy to use (and therefore to misuse). Look at the Roslyn feature request discussions. The specification for the "seems-to-be-simple" string interpolation feature was significantly changed at least twice. Here's the v5 edition of the nameof operator. The "simpler" params IEnumerable and digit separators were not implemented at all. Yes, designing a new language feature is hard. Despite all efforts there will always be some corner cases where the proposed design does not fit. Now imagine that you have to support code filled with all the badly designed features you may wish for. What, "no one will do that" again? Well, why then spend time on a feature the majority will not be able to use the right way?
  5. It's costly. I'm talking about the cost of adding and supporting macros, not about using a macro that was written by someone else. You would need to invent a totally new set of use cases, there would be significant changes to the Roslyn infrastructure, and tooling support would take a lot of effort, too. Compared with code rewriters, it's like "plain switch" vs "full-featured pattern matching".

As a summary: code rewriters offer an almost ideal balance between required developer qualification, supported scenarios and cost of support. In my opinion there's no place for macro-driven metaprogramming in C#. However, I would love to hear if I'm wrong.

ilmax commented 9 years ago

I would like to upvote this proposal. It seems like a natural addition to Roslyn and could allow a broader set of features without changing the language itself. There are tons of possible use cases: performance improvements for serialization and AOP, just to name a few.

AdamSpeight2008 commented 9 years ago

@ig-sinicyn wrote: Also, the main benefit of macros - no need to construct the syntax tree manually - is perfectly covered by

var stubCode = SyntaxFactory.ParseMemberDeclaration(@"
public StubType StubProperty
{
  get { return stubField; }
  set
  {
    stubField = value;
    OnPropertyChanged(StubPropertyName);
  }
}");

All you have to do is replace the stubs with actual nodes. And if you're too lazy, it can be done using string interpolation.

You're right, you can do it using a string and then converting it at runtime. Therein lies an issue: you don't get compile-time type checking of that section of code. My proposal/idea is to have compile-time checked templates (like C++).

ig-sinicyn commented 9 years ago

@AdamSpeight2008 Emm... I'm not talking about runtime :)

SyntaxFactory.ParseMemberDeclaration() is a common shortcut to get a correct syntax tree without doing something like this. Once you have a valid AST, it's relatively easy to replace the code you need to rewrite with it. Here's a Roslyn code fix sample using the same approach.

So, the code above is meant to be executed at compile time, as an additional compilation step. No runtime magic at all.
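A sketch of what that additional step could look like, assuming the hypothetical plug-in point simply hands modified syntax trees back to the compilation before emit. Everything below is existing Roslyn API; only the code that would call it from the build is missing:

using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

static class CompileTimeRewriting
{
    // Runs a rewriter over every tree and feeds the rewritten trees back
    // into the compilation, so that emit sees the generated code.
    public static CSharpCompilation ApplyRewriter(
        CSharpCompilation compilation, CSharpSyntaxRewriter rewriter)
    {
        foreach (var tree in compilation.SyntaxTrees)
        {
            var root = tree.GetRoot();
            var newRoot = rewriter.Visit(root);
            if (newRoot != root)
            {
                compilation = compilation.ReplaceSyntaxTree(
                    tree, tree.WithRootAndOptions(newRoot, tree.Options));
            }
        }
        return compilation;
    }
}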

If you want to prove that a rewriter/macro is working correctly, you have to write tests anyway. There are always some errors that cannot be caught by the compiler.
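As a sketch of what such a test could look like (xUnit here is just an assumption, and NotifyPropertyChangedRewriter is the hypothetical rewriter sketched earlier in the thread):

using Microsoft.CodeAnalysis.CSharp;
using Xunit;

public class RewriterTests
{
    [Fact]
    public void ExpandsMarkedAutoProperty()
    {
        var source = @"
class Person
{
    [NotifyPropertyChanged]
    public string Name { get; set; }
}";
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();

        var rewritten = new NotifyPropertyChangedRewriter().Visit(root);

        // The marked auto-property should have been expanded into an
        // INPC-raising property.
        Assert.Contains("OnPropertyChanged(nameof(Name))", rewritten.ToFullString());
    }
}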

AdamSpeight2008 commented 9 years ago

@ig-sinicyn The example from the first link would look something like this using a template function.

template syntaxnode FormWithTicker( string ns , string fn )
{
  namespace %{ns}%
  {
    public class %{fn}%() : System.Windows.Forms.Form
    {
      public System.Windows.Forms.Timer Ticker
      {
        get; set;
      }

      [STAThread]
      public void Main()
      {

      }
    }
  }
}

The compiler would/should check at compile time that the structure of the code is correct.
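The proposed template syntax does not exist, but a rough approximation of that structural check with today's API is to expand the template text (string interpolation below stands in for template instantiation) and ask the parser for diagnostics. The ns/fn values are illustrative:

using System;
using Microsoft.CodeAnalysis.CSharp;

// Expand the template by hand and check that the result parses cleanly.
var ns = "Demo";
var fn = "FormWithTicker";
var expanded = $@"
namespace {ns}
{{
    public class {fn} : System.Windows.Forms.Form
    {{
        public System.Windows.Forms.Timer Ticker {{ get; set; }}
    }}
}}";

var tree = CSharpSyntaxTree.ParseText(expanded);
foreach (var diagnostic in tree.GetDiagnostics())
    Console.WriteLine(diagnostic);   // no output means the structure is valid C#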

ig-sinicyn commented 9 years ago

Well, from my point of view it is useless to check the template itself.

For example, template code can derive from a type not referenced by the template library. It could derive from another template. It could use mixin-style template combinations. Or it could use members from a type passed as a template argument. And so on and so on.

So, the only way to check a template is to apply it to real code. And at that point there is no difference between macros, code templates and a classical code rewriter.

Also, did you notice that your template could easily be replaced with an ordinary base class? Having multiple ways to do something usually means that there's something wrong with the language design. One more point for code rewriters :)

It seems the discussion is starting to look like a "macros vs AOP" holy war, so I'll stop for now. Thanks, and let's wait for comments from someone on the Roslyn team :)

russpowers commented 9 years ago

I've written a working implementation of Roslyn compiler plugins similar to this, along with attribute macros as described. Take a look at #3248.

gafter commented 8 years ago

In progress in #5561.