dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Source Generators: expose a SyntaxGenerator #43821

Open canton7 opened 4 years ago

canton7 commented 4 years ago

It would be good to have access to a SyntaxGenerator on the SourceGeneratorContext, or provide another means of obtaining one.

There are some useful tools on SyntaxGenerator which it would be handy not to have to re-implement, and it should also help with creating language-agnostic syntax generators. Working with SyntaxNodes also gives access to tools like .NormalizeWhitespace().


Specifically, I'm investigating writing a source generator for RestEase. I need to generate an implementation of a user's interface, where that interface might be in a referenced assembly I can't access the source for. Because the generated type might ultimately be used by a late-bound language like IronPython, I need to copy default parameter values from the interface onto the generated implementation. SyntaxGenerator.TypedConstantExpression contains a lot of very useful code here, which I really don't want to re-implement (and indeed doing so is really quite hard, as it makes use of a lot of internal code).

olivier-spinelli commented 4 years ago

I've been generating source code for a while now, and I eventually got rid of the "standard" AST or formal code representation in favor of a much more relaxed API that is really simple.

Abstractions are available here: https://github.com/Invenietis/CK-CodeGen/tree/master/CK.CodeGen.Abstractions (100% comment coverage).

Key points are:

Using this is both easy and powerful.

chsienki commented 4 years ago

This is an interesting idea. We're currently still exploring how best to enable generators to actually create the source. SyntaxFactory could be an interesting option.

jmarolf commented 4 years ago

To clarify for folks, I believe SyntaxFactory should be useable today in source generators. Would be very curious if there was something there that didn't just work.

SyntaxGenerator lives in the workspace layer and would need to be ported down to the compiler layer.
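
For readers unfamiliar with the layering, here is a minimal sketch of how SyntaxGenerator is obtained today outside a source generator. It assumes a plain console project that references the Microsoft.CodeAnalysis.CSharp.Workspaces package; the class name is hypothetical, and nothing here implies that loading the workspace layer inside a generator is supported.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Editing;

static class SyntaxGeneratorDemo
{
    static void Main()
    {
        // GetGenerator needs a Workspace, which is why it sits in the
        // workspace layer rather than the compiler layer.
        using (var workspace = new AdhocWorkspace())
        {
            SyntaxGenerator generator =
                SyntaxGenerator.GetGenerator(workspace, LanguageNames.CSharp);

            // Language-agnostic construction of: Console.WriteLine("hello world");
            SyntaxNode target = generator.MemberAccessExpression(
                generator.IdentifierName("Console"), "WriteLine");
            SyntaxNode call = generator.InvocationExpression(
                target, generator.LiteralExpression("hello world"));
            SyntaxNode statement = generator.ExpressionStatement(call);

            Console.WriteLine(statement.NormalizeWhitespace().ToFullString());
        }
    }
}
```

Passing LanguageNames.VisualBasic instead (with the Visual Basic workspaces package referenced) is what would make the same construction code emit Visual Basic syntax.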

canton7 commented 4 years ago

Purely from a user's perspective, it seems that since we're generating code, having the ability to use the tools already in Microsoft.CodeAnalysis.Editing would, on the face of it, make sense. I understand that reality often gets in the way though...

Even if the SourceGeneratorContext doesn't provide a SyntaxGenerator, then at least providing an overload of SyntaxGenerator.GetGenerator that we can call would be great, given the objects available on SourceGeneratorContext. (From browsing the source, I can't see a reason why SyntaxGenerator.GetGenerator needs anything more than the language, which is available on the Compilation, but it's very likely that I'm missing something).

There's loads of internal goodness for generating SyntaxNodes in Roslyn, and SyntaxGenerator is the gateway to it for third-party analyzers and source generators.

CyrusNajmabadi commented 4 years ago

Note: SyntaxGenerator has a lot of problems, and isn't even used for a lot of our actual IDE features (we have another entirely internal service we use for that purpose). If this becomes part of the compiler API we need to do a lot to make it sufficient for that purpose.

CyrusNajmabadi commented 4 years ago

> From browsing the source, I can't see a reason why SyntaxGenerator.GetGenerator needs anything more than the language, which is available on the Compilation, but it's very likely that I'm missing something.

It's unfortunately not that simple. The internal impl depends a lot on pieces of the Roslyn Workspace API. For example, it can tie into Workspace Formatting/Simplification/Using-Adding/Code-Cleanup/etc.

That's kinda the idea of SyntaxGenerator. You want a simple way to create syntax that looks right. i.e. it generates code like:

Console.WriteLine(foo.Where(x => x >= 21))

Not code like:

global::System.Console.WriteLine(
  global::System.Linq.Enumerable.Where<global::UserNamespace.User>(
    (global::System.Collections.Generic.IEnumerable<global::UserNamespace.User>)foo,
    (global::System.Func<global::UserNamespace.User, global::System.Boolean>)(
        (global::System.Int32 x) => x >= 21)));

The latter is what you get when you don't have all those IDE services that properly make the code look like what you would want.

olivier-spinelli commented 4 years ago

Thank you @CyrusNajmabadi for this excellent example! This is exactly the kind of trouble that made me say goodbye to the AST approach and investigate a string-based API that may seem naïve but is really effective.

CyrusNajmabadi commented 4 years ago

ASTs can work fine :) We use them entirely in the IDE. They certainly give the richest experience (where all the right intuitive stuff happens from the user's perspective). The main issue here is that 'generation' of the AST is not sufficient on its own if you want the experience to be good.

olivier-spinelli commented 4 years ago

Totally agree! But this comes at a cost.

Ask a developer to write code that "produces a Main that says Hello to the World!".

1 - With a string-based approach.
2 - With an AST.

Then ask him to say Hello to the first argument of the Main(). The string is soooo easy. And so safe! The goal is to produce source code: it will be parsed and compiled. Any (stupid) error will be caught early. No pain. No risk.

(That's just my thoughts. And my experience.)

CyrusNajmabadi commented 4 years ago

> The string is soooo easy.

But it often isn't. Say you spit out Console.WriteLine("hello world"). Does that work? What if there's no using System;? What if there's no using and this compiles because Console bound to some type in the user project?

string-building works for very trivial cases, but is very non-resilient to real-world issues that arise when you want your syntax generator to be usable in the wide variety of real world projects you will face out there :-/

olivier-spinelli commented 4 years ago

I thought like you... before I confronted this with... reality.

If Console.WriteLine("hello world") misses its "System", then either:
1 - Decide that codeWriter.EnsureUsing( "System" ) is fine and will not lead to any ambiguity for others.
2 - Or take no risk and write "System.Console.Write" instead of "Console.Write".

I know it may seem naïve, weak, unsafe, but it appears to eventually be effective, easy and safe (one reason being that, once again, the whole output is fully parsed and has to be 'correct' - if it's not, rewrite! -, another is the resiliency to target language evolutions).

My point is that for "real" real-world code (close to the final application), this is globally better than an AST approach. I won't try to convince you more than that, don't worry. It's just my experience that I wanted to share.

CyrusNajmabadi commented 4 years ago

> Or take no risk and write "System.Console.Write" instead of "Console.Write".

That's still risky. "System" can easily bind to a different source symbol in the context of the user project :)

> will not lead to any ambiguity for others.

Very hard to determine that :)

> My point is that for "real" real-world code (close to the final application), this is globally better than an AST approach

I think my point is that it's the opposite. It may be fine in some domains. However, for libraries it's really not ok since you can't assume that all your clients will be safe with how you're generating this stuff.
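
To make the binding hazard concrete, here is a minimal sketch of a string-based emitter that sidesteps it with the global:: alias, the same qualification style as the verbose example earlier in the thread. HelloWorldEmitter and Emit are hypothetical names, not part of any Roslyn API.

```csharp
using System;

static class HelloWorldEmitter
{
    // Hypothetical helper: returns C# source for a class whose Main greets
    // its first argument. The emitted call is written as
    // global::System.Console so that lookup starts at the global namespace,
    // even if the consuming project declares its own System or Console.
    public static string Emit(string className)
    {
        return
            "internal static class " + className + "\n" +
            "{\n" +
            "    private static void Main(string[] args)\n" +
            "    {\n" +
            "        global::System.Console.WriteLine(\"Hello, \" + args[0]);\n" +
            "    }\n" +
            "}\n";
    }

    static void Main()
    {
        // Print the generated source so it can be inspected.
        Console.WriteLine(Emit("Program"));
    }
}
```

The cost is the one already described above: the output is safe to bind but does not look like code a person would write, because nothing simplifies the qualified names afterwards.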

olivier-spinelli commented 4 years ago

> "System" can easily bind to a different source symbol in the context of the user project

Ouch! Good luck with that! ;-)

We somehow agree:

Let's code at the right place!

cezarypiatek commented 3 years ago

Hi, are there any plans for solving this issue? Any recommended workaround in the meantime?

CyrusNajmabadi commented 3 years ago

There is no plan in the short term to do anything in this space.

sharwell commented 3 years ago

@cezarypiatek The current recommended workaround is to use one of the following approaches:

  1. Implement your source generator using some sort of string building or text templating strategy (not really the target goal, but happens to be common when a source generator is created by porting some existing build tooling to the new APIs)
  2. Use SyntaxFactory, either directly or by creating helper methods for common cases within it
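
A minimal sketch of the second approach, assuming a project that references the Microsoft.CodeAnalysis.CSharp package. The class name is hypothetical; inside a real generator the resulting text would be handed to the generator context rather than printed.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

static class SyntaxFactoryDemo
{
    static void Main()
    {
        // Callee: global::System.Console.WriteLine
        ExpressionSyntax callee = SyntaxFactory.MemberAccessExpression(
            SyntaxKind.SimpleMemberAccessExpression,
            SyntaxFactory.ParseName("global::System.Console"),
            SyntaxFactory.IdentifierName("WriteLine"));

        // Single argument: the string literal "hello world"
        ArgumentSyntax argument = SyntaxFactory.Argument(
            SyntaxFactory.LiteralExpression(
                SyntaxKind.StringLiteralExpression,
                SyntaxFactory.Literal("hello world")));

        // Invocation wrapped in an expression statement
        StatementSyntax statement = SyntaxFactory.ExpressionStatement(
            SyntaxFactory.InvocationExpression(
                callee,
                SyntaxFactory.ArgumentList(
                    SyntaxFactory.SingletonSeparatedList(argument))));

        // NormalizeWhitespace inserts conventional spacing before printing.
        Console.WriteLine(statement.NormalizeWhitespace().ToFullString());
    }
}
```

Unlike SyntaxGenerator, this is C#-specific and needs one factory call per node, which is why wrapping common shapes in helper methods tends to pay off.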

cezarypiatek commented 3 years ago

@sharwell thanks for your reply. I have an existing code generator https://github.com/cezarypiatek/MappingGenerator that heavily uses SyntaxGenerator, and I was counting on migrating it easily to this new Roslyn API by wrapping it in a SourceGenerator. The lack of SyntaxGenerator is a real blocker for me. Replacing it with SyntaxFactory requires a lot of work, and the outcome will probably not be as readable as it is now, just like @CyrusNajmabadi presented in one of the examples in this discussion. It's a little bit ironic that the SourceGenerator API is missing SyntaxGenerator.