dotnet / runtimelab

This repo is for experimentation and exploring new ideas that may or may not make it into the main dotnet/runtime repo.
MIT License

Add SyntaxDynamo project #2526

Closed by kotlarmilos 4 months ago

kotlarmilos commented 4 months ago

Description

The SyntaxDynamo project provides an API for generating C# source code. The code is modular and stateless. Unit tests verify the syntax of the generated code.

stephen-hawley commented 4 months ago

Dynamo was a cute name that I picked for this assembly. It's because historically a dynamo was touted as a "better (electrical) generator". Since I knew from the start that BTfS was going to need a lot of code generation, I asked what Xamarin used in Sharpie and was told "Console.WriteLine". I decided that I needed a "better (code) generator" hence Dynamo. We don't need this name. I'd be perfectly happy for it to be named "CodeGeneration" or "SourceGeneration" or "SourceCodeGeneration" - something that describes exactly what it is.

Browsing the code, I see a number of idioms that we can also improve because of changes in C# since Dynamo started. Many of the concrete types feature between 4 and 6 properties that are all declared "get/private set" and initialized in the constructor. Instead, these properties can be declared { get; private set; } = initializer, which will clean up the code quite a bit.

Neither of these is particularly crucial, but both will make things better as we move forward.

Also, you should merge in changes from the current BTfS, since in the last few PRs I added a number of convenience methods to make it easier to add/query optional types, pointer types, explicit unwrapping, etc.
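
As an illustration of the property idiom described above, here is a minimal before/after sketch. The type and property names are hypothetical and are not taken from the SyntaxDynamo sources; the sketch only covers properties whose initial value does not depend on constructor arguments.

```csharp
using System.Collections.Generic;

// Hypothetical type for illustration only; not part of SyntaxDynamo.
// Older idiom: properties assigned in the constructor.
class ParameterListBefore
{
    public ParameterListBefore()
    {
        Names = new List<string>();
        IsVariadic = false;
    }

    public List<string> Names { get; private set; }
    public bool IsVariadic { get; private set; }
}

// Same shape using auto-property initializers; the constructor is no longer needed.
class ParameterListAfter
{
    public List<string> Names { get; private set; } = new List<string>();
    public bool IsVariadic { get; private set; } = false;
}
```

Both classes start out in the same state; the second simply keeps each default next to the property it belongs to.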

AaronRobinsonMSFT commented 4 months ago

The SyntaxDynamo project provides an API for generating C# source code.

Why not the Roslyn APIs? They are wordy for sure, so I get not wanting to use them. Other source generators take another approach and emit source directly, similar to what is here. Is this something they can use? If not, do we need 5k+ LOC for this sort of infrastructure?

/cc @stephentoub

stephentoub commented 4 months ago

I agree with @AaronRobinsonMSFT. The recommended approach is to just write out the source directly, typically with a StringBuilder or StringWriter or IndentedTextWriter or something similar. You can also use the Roslyn APIs directly. But we shouldn't be creating an entirely new object model for writing out C#: if an object model is desired, it should use Roslyn's, otherwise, it should just write out the text directly.
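
For readers unfamiliar with the direct approach mentioned above, here is a minimal sketch that writes C# text through `System.CodeDom.Compiler.IndentedTextWriter`. The `DirectEmitter` class, its parameters, and the generated declaration are assumptions made for illustration; this is not code from any of the generators discussed in this thread.

```csharp
using System;
using System.CodeDom.Compiler;
using System.IO;

// Hypothetical helper for illustration only.
static class DirectEmitter
{
    // Writes a small static class containing one DllImport declaration.
    public static string EmitBinding(string className, string methodName, string library)
    {
        var buffer = new StringWriter();
        var writer = new IndentedTextWriter(buffer, "    ");

        writer.WriteLine("using System.Runtime.InteropServices;");
        writer.WriteLine();
        writer.WriteLine($"internal static class {className}");
        writer.WriteLine("{");
        writer.Indent++;   // everything inside the class body is indented one level
        writer.WriteLine($"[DllImport(\"{library}\")]");
        writer.WriteLine($"internal static extern void {methodName}();");
        writer.Indent--;
        writer.WriteLine("}");

        writer.Flush();
        return buffer.ToString();
    }
}
```

Calling `DirectEmitter.EmitBinding("Native", "hello", "libHello")` returns the generated source as a string; the writer tracks indentation, but nothing checks that the emitted text is valid C#.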

stephen-hawley commented 4 months ago

Can Roslyn's source generators generate syntactically correct Swift? If not, then using Roslyn means that we still have to have this code, and code generation is done with two completely different idioms.

Also using Console.WriteLine for writing code is a non-starter. BTfS generates a LOT of code and using an object model for doing it has eliminated vast swathes of errors by making them impossible from the get-go and makes it easier to write fairly complex correct code.

stephentoub commented 4 months ago

Can Roslyn's source generators generate syntactically correct Swift?

I'm not following... the description of the PR states "The SyntaxDynamo project provides an API for generating C# source code"... is it generating C# source code, Swift source code, or both?

stephentoub commented 4 months ago

Also using Console.WriteLine for writing code is a non-starter. BTfS generates a LOT of code and using an object model for doing it has eliminated vast swathes of errors by making them impossible from the get-go and makes it easier to write fairly complex correct code.

Not Console.WriteLine, to be clear. And, for reference, most of our source generators can emit a lot of code and haven't had trouble doing so correctly by just writing out the source; in many situations the code for the generator is cleaner, simpler, and significantly faster emitting the source directly rather than using an object model to do so.

Note that the interop source generator was on an object model plan, but is switching over to directly emitting C# source. https://github.com/dotnet/runtime/issues/95882

kotlarmilos commented 4 months ago

I'm not following... the description of the PR states "The SyntaxDynamo project provides an API for generating C# source code"... is it generating C# source code, Swift source code, or both?

At this stage of the project, the goal is to reduce complexity of bindings and generate C# code only. Therefore, the SyntaxDynamo project does not include support for Swift source code generation. However, there may be a subset of projections that cannot be projected without Swift wrappers.

One approach is to ask users to write "adapters" for such scenarios. Another approach is to improve the tooling to generate Swift wrappers, similar to the functionality in the Binding Tools for Swift (BTfS).

AaronRobinsonMSFT commented 4 months ago

One approach is to ask users to write "adapters" for such scenarios.

Until we see a need for more, I would go with this approach. Generating Swift source means we need to do compiler discovery and have an implicit tool chain dependency. This support may be warranted at some point but for our planned v1, I don't see the need for the complexity.

stephen-hawley commented 4 months ago

I'm not following... the description of the PR states "The SyntaxDynamo project provides an API for generating C# source code"... is it generating C# source code, Swift source code, or both?

At this stage of the project, the goal is to reduce complexity of bindings and generate C# code only. Therefore, the SyntaxDynamo project does not include support for Swift source code generation. However, there may be a subset of projections that cannot be projected without Swift wrappers.

One approach is to ask users to write "adapters" for such scenarios. Another approach is to improve the tooling to generate Swift wrappers, similar to the functionality in the Binding Tools for Swift (BTfS).

I don't think we're seeing eye-to-eye here and I want to fully understand what you're saying. Can you explain how a user would write a class in C# that will implement a Swift protocol in such a way that the object can be consumed by Swift?

kotlarmilos commented 4 months ago

I don't think we're seeing eye-to-eye here and I want to fully understand what you're saying. Can you explain how a user would write a class in C# that will implement a Swift protocol in such a way that the object can be consumed by Swift?

You're correct that there are certain cases where Swift wrappers are necessary, whether generated by the tool or written by the user. It's crucial to consider all scenarios when making design decisions, which is exactly what you're doing. However, if our goal is to ship an MVP in .NET 9, it's important to take small, incremental steps and adopt a lean approach. I don't believe we need to address all cases in the initial implementation.

Moreover, when we look at other cross-platform frameworks, many of them rely on ObjC compatibility, which doesn't support features like generics and associated types in Swift. We need to acknowledge that we may not be able to resolve certain scenarios and make trade-offs. This doesn't mean that at a certain stage we won't introduce it. At this stage, we have to ensure that we don't make wrong design decisions and lock out some features.

kotlarmilos commented 4 months ago

I agree with @AaronRobinsonMSFT. The recommended approach is to just write out the source directly, typically with a StringBuilder or StringWriter or IndentedTextWriter or something similar. You can also use the Roslyn APIs directly. But we shouldn't be creating an entirely new object model for writing out C#: if an object model is desired, it should use Roslyn's, otherwise, it should just write out the text directly.

We're discussing two distinct concepts here: code generation and code emitting. Code generation needs to be non-linear, even for simple P/Invoke scenarios where inserting a new using statement may be necessary mid-process. An object model like the one in SyntaxDynamo appears to be a good fit for this purpose. Code emitting can be handled separately using the Roslyn API, or even better, using a string-based approach.

Let's review the requirements. We want to keep the possibility of writing Swift code open, although it won't be included at this stage. By leveraging SyntaxDynamo and implementing a set of interfaces for C# code emitting, we can gain a strong representation of the bindings separated from code emitting.

This lets us postpone the decision until a later stage, when we know more about the use cases, and allows code generation and code emitting to be implemented and maintained independently.
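
To make the "non-linear" point above concrete, here is a minimal sketch in which `using` directives discovered mid-process are collected separately from the members and only written out at emit time. `CompilationUnitSketch` and its members are hypothetical names for illustration; they are not part of SyntaxDynamo or Roslyn.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration type; not part of SyntaxDynamo or Roslyn.
sealed class CompilationUnitSketch
{
    // Usings are kept sorted and de-duplicated, independent of insertion order.
    private readonly SortedSet<string> _usings = new(StringComparer.Ordinal);
    private readonly List<string> _members = new();

    public void AddUsing(string ns) => _usings.Add(ns);
    public void AddMember(string source) => _members.Add(source);

    // Emitting is a separate, linear pass over what was collected.
    public string Emit()
    {
        var sb = new StringBuilder();
        foreach (var ns in _usings) sb.AppendLine($"using {ns};");
        if (_usings.Count > 0) sb.AppendLine();
        foreach (var member in _members) sb.AppendLine(member);
        return sb.ToString();
    }
}
```

A caller can add a member first and register `System.Runtime.InteropServices` afterwards; the directive still ends up at the top of the emitted text. Whether this much structure is enough, or a fuller object model is warranted, is the question being debated in this thread.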

stephentoub commented 4 months ago

I do not want yet another sprawling custom object model representing C# coming into dotnet/runtime (we already have one, btw, in System.CodeDom, separate from the official, maintained one that Roslyn has). If you want to experiment with it in dotnet/runtimelab, I don't have huge objections, but I would expect to object to any attempt to then migrate that over for real.

kotlarmilos commented 4 months ago

How is this accomplished in other projects that generate C# source code? What, in your opinion, would be a good pattern to follow? I remember that @jkoritzinsky was talking about a shared interop source generator.

stephentoub commented 4 months ago

How is this accomplished in other projects that generate C# source code?

The regex source generator writes out C# directly: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Text.RegularExpressions/gen/RegexGenerator.Emitter.cs

The JSON source generator writes out C# directly: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Text.Json/gen/JsonSourceGenerator.Emitter.cs

The logging source generator writes out C# directly: https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Logging.Abstractions/gen/LoggerMessageGenerator.Emitter.cs

The options source generator writes out C# directly: https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Options/gen/Emitter.cs

The config source generator writes out C# directly: https://github.com/dotnet/runtime/blob/main/src/libraries/Microsoft.Extensions.Configuration.Binder/gen/ConfigurationBindingGenerator.Emitter.cs

etc

kotlarmilos commented 4 months ago

Thanks! Let's frame this. Both SyntaxFactory and CodeDom are object model frameworks that can represent C#. As the projection tooling would benefit from an object model, we can use one of them in combination with an "indented text writer" to gain speed and memory footprint improvements. Is that correct?

If so, then SyntaxDynamo can be considered an object model framework for representing both C# and Swift. In this case, if we decide to proceed with a built-in object model framework (i.e. SyntaxFactory or CodeDom), we'll need to maintain a separate one for Swift if required, since there isn't a corresponding implementation for C#.

With that in mind, I believe it's important to use any object model framework to support non-linear code generation and to avoid building the tooling from scratch. Maintaining different versions of object model frameworks for C# and Swift presents a tradeoff, but it's not necessarily a poor design choice.

jkoritzinsky commented 4 months ago

How is this accomplished in other projects that generate C# source code?

For the interop source generators, we currently use the Roslyn object model, but for performance reasons we plan to move to a very thin object model (statement, expression, name, type) that's backed directly by strings. The Microsoft.Interop.SourceGeneration library provides our shared logic.
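
As a rough picture of what a "very thin object model backed by strings" could look like, here is a minimal sketch. The `TypeName`, `Expression`, and `Statement` types below are hypothetical and written only for illustration; the actual design planned for the interop generators is not shown in this thread.

```csharp
using System;

// Hypothetical wrappers for illustration; each one is just a typed view over a string.
readonly record struct TypeName(string Text);

readonly record struct Expression(string Text)
{
    // Builds a call expression from already-rendered argument expressions.
    public static Expression Call(string method, params Expression[] args) =>
        new($"{method}({string.Join(", ", Array.ConvertAll(args, a => a.Text))})");
}

readonly record struct Statement(string Text)
{
    // Builds a local declaration with an initializer.
    public static Statement DeclareLocal(TypeName type, string name, Expression value) =>
        new($"{type.Text} {name} = {value.Text};");
}
```

The wrappers prevent a statement from being passed where an expression is expected, while the content stays plain text that can be written out without building a syntax tree.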

I don't think we're seeing eye-to-eye here and I want to fully understand what you're saying. Can you explain how a user would write a class in C# that will implement a Swift protocol in such a way that the object can be consumed by Swift?

The Swift metadata layout is specified in the Swift spec. Our generation tooling should generate a struct that represents this metadata in memory (type metadata, witness tables, and the like), matching the Swift standard so that Swift can consume it. That way, we don't need to generate any Swift code, as we can provide what Swift expects without using Swift.

kotlarmilos commented 4 months ago

Thank you all for the valuable feedback. I've implemented it in https://github.com/dotnet/runtimelab/pull/2525. Please note that this is an MVP based on BTfS but with simplifications (parser and emitter).

The main reason for introducing these simplifications is to try to achieve CryptoKit dev templates in this release cycle. Incorporating comprehensive components such as a swiftInterface parser or object-model emitters would require more time and should be accompanied by feedback from early customers so that they can be designed and reviewed effectively.

Please direct further feedback to the mentioned PR.