dotnet / fsharp

The F# compiler, F# core library, F# language service, and F# tooling integration for Visual Studio
https://dotnet.microsoft.com/languages/fsharp
MIT License

Tooling :: Interop story for .NET libraries using C# source generators for high performance #14300

Open T-Gro opened 1 year ago

T-Gro commented 1 year ago

I want to open a discussion on consuming .NET libraries built with C# source generator support: https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/source-generators-overview .

With every .NET version, more APIs use it, and it is part of the reason for the unprecedented performance boosts of the .NET platform across various computing tasks. A few examples include:

And I think it is only a matter of time before database libraries or structured loggers (incl. telemetry) have it as well.

I do not believe that F# needs its own clone of source generators; there are concepts like type providers or Myriad which allow accomplishing similar goals. Therefore I did not continue this as a comment on https://github.com/fsharp/fslang-suggestions/issues/864 , which I believe has different aspirations.

I do want to reopen the discussion from the .NET library consumption perspective, especially around libraries/frameworks that are massively backed and invested in (like aspnetcore), that are expensive to replicate, and that are part of the big performance wins. F# is described as a "succinct, robust and performant language for .NET", and the ability to consume the fastest of .NET's libraries IMO goes with that motto.

The latest resolution I could find on the topic is "use C# project" https://github.com/fsharp/fslang-suggestions/issues/864#issuecomment-903267638 , which makes sense in the short term (< 3 years) perspective.

However, the broader the usage of C# code gen in well-established libraries, the more slices have to be made in a project to separate the F# pieces (where F# programmers want to write) from the C# parts (needed simply because the libraries require them). Note that this might spread across multiple layers of the application, so it is not just "1 F# and 1 C# project" but rather an interleaved sandwich, depending on the level of the stack a library targets. The cognitive complexity of reading a project like this (which has turned into a solution by now) is higher, and the typical Solution Explorer display does not make the dependency order visible at first glance.

Which brings me to the tooling topic: what can we do better to support a smooth workflow using such libraries in a project that would otherwise want to be F#-only?

There is an older suggestion about mixed projects, https://github.com/fsharp/fslang-suggestions/issues/117 , which was correctly resolved as being a tooling issue and not a change to the F# language itself.

My current view is that the user-facing side of this feature could look like embedding a single standalone C# file into the middle of an F# project. That C# file would have access to all project/package dependencies (this is where the source gen stuff is) and to the F# files before it, but NOT to the files after it, and it would only be accessible to F# files coming after it. If this eliminates any worries, I think this would be handy even if always restricted to 1-C#-file scenarios only.

From the IDE side, I could imagine this being a "lightweight project within a project", with a .cs file living inside the .fsproj and the F# compiler knowing to split the project into multiple compilation units, invoke Roslyn underneath, and put the results together in the right order.

I will wait for someone more knowledgeable to assess whether merging the produced C# and F# IL together is even theoretically possible, or whether these would have to be independent .dlls in the output.

This is an XXL item

KathleenDollard commented 1 year ago

This is a really cool idea.

I wonder if a first step in this could be to force the .cs files to be first, but that is largely due to a specific concern, so I will explain it.

Due to how Roslyn source gen works, you cannot separate source gen from compilation. Because of this, I believe we will have to hydrate a C# project, and that in VS that would have to be hydrated in the workspace and remain part of the design time build for the project. From the VS perspective, I think that the C# project would have to appear real. Happy to have those wiser on VS correct me.

Project and package dependencies are giving me a bit of a headache right now, so let's just assume that is solvable.

Since F# is order dependent in the project file, I do not have my head around a way to do this with partial dependencies - the ones that would be available part way through the F# compilation. That is why I suggested that at least for a first cut, we have the C# dependencies isolated. If that makes the feature worthless, let's know that up front.
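The "C# files first" restriction for a first cut can be sketched as a simple check over the project's ordered file list. This is only an illustration: `isCSharp` and `splitCSharpFirst` are hypothetical names, not part of any existing tooling, and the sketch assumes a file's language can be inferred from its extension.

```fsharp
open System
open System.IO

/// True when the path has a .cs extension (case-insensitive).
let isCSharp (path: string) =
    Path.GetExtension(path).Equals(".cs", StringComparison.OrdinalIgnoreCase)

/// Hypothetical first-cut rule: every .cs file must precede every other file.
/// Returns Ok (csFiles, remainingFiles) when the rule holds, otherwise
/// Error with the first .cs file that appears after a non-.cs file.
let splitCSharpFirst (files: string list) =
    let csFiles = files |> List.takeWhile isCSharp
    let rest = files |> List.skipWhile isCSharp
    match rest |> List.tryFind isCSharp with
    | Some late -> Error late
    | None -> Ok(csFiles, rest)
```

Under this rule the project always reduces to at most two compilation units, an isolated C# one followed by the F# one, which sidesteps the question of partial dependencies at the cost of the C# code never seeing any F# code.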

The alternative, and maybe this is what you have in mind, is that the F# compiler would maintain a separate transient C# project for every non-sequential set of .cs files. I do not see how this could work, because the C# project would view the F# project via project dependencies and a) that would be a circular reference, which is not supported, and b) what F# project is available halfway through an F# compilation?

I look forward to feedback on this!

Adding @chsienki in hopes he can look at this question from a different perspective.

PS. Of the examples listed, some seem relatively timeless, such that we could do the work to provide the same code in a different way, with a potentially different gesture in F#: JSON and RegEx. gRPC is probably pretty stable too, but there are many areas that are not. I agree that if we do nothing, F# will be disadvantaged in these scenarios, and I am interested in how important getting the performance in the C# way is to people. Could we work with the community to find more F# answers for the scenarios that matter, learning from the work in C#? Happily, source generators make it easy to see what C# is doing to gain performance (or as easy as possible in the case of RegEx ;-)

vzarytovskii commented 1 year ago

One problem is that many C# code generators rely on partial assemblies/types, which means we either have to:

That said, I think we should take TOP libraries which use source gen and see their use-cases, and what's needed from us to support those.

T-Gro commented 1 year ago

Indeed, the solution I had in mind was splitting the user-visible project and doing a separate compilation unit for each block, treating a change of language as a switch into a new unit. So in this case, there would be 5 (!) compilation units, each referencing its predecessors as separate assemblies. That would also mean that in the context of an F# project, the .cs files would NOT see each other bidirectionally, and visibility would follow the project order as it does with F# files.

After those 5 separate compilation units are done, it would of course be good to put them back together into a single .dll. Whether that is doable, I do not know (e.g. a .dll created this way, containing output from two different compilers, could create issues somewhere down the road when consumed).

It might look crazy to do 5 different compilation units, but in the end this is what users do today when separating those into projects manually.
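The splitting rule described above can be sketched in a few lines of F#. This is a minimal illustration, not a proposal for the actual compiler implementation: the `CompilationUnit` record, `languageOf`, and `splitIntoUnits` are hypothetical names, the file names in the usage example are made up, and the sketch assumes that anything in the project that is not a .cs file is F#.

```fsharp
open System.IO

/// Source language of a file, inferred from its extension.
type Language =
    | FSharp
    | CSharp

/// Hypothetical model of one compilation unit: a run of consecutive
/// same-language files plus the indices of the units it may reference.
type CompilationUnit =
    { Index: int
      Language: Language
      Files: string list
      References: int list }

let languageOf (path: string) =
    match Path.GetExtension(path).ToLowerInvariant() with
    | ".cs" -> CSharp
    | _ -> FSharp // assumption: everything else in the project is F#

/// Split the ordered file list into units, starting a new unit at every
/// change of language. Each unit references all of its predecessors only.
let splitIntoUnits (files: string list) : CompilationUnit list =
    let folder (acc: (Language * string list) list) (file: string) =
        let lang = languageOf file
        match acc with
        | (l, fs) :: rest when l = lang -> (l, file :: fs) :: rest
        | _ -> (lang, [ file ]) :: acc

    files
    |> List.fold folder []
    |> List.rev
    |> List.mapi (fun i (lang, fs) ->
        { Index = i
          Language = lang
          Files = List.rev fs
          References = [ 0 .. i - 1 ] })

// Usage with made-up file names: three F# blocks interleaved with two C# files.
let units =
    splitIntoUnits [ "Domain.fs"; "Json.cs"; "Services.fs"; "Regexes.cs"; "Program.fs" ]

for u in units do
    printfn "unit %d (%A): %A references %A" u.Index u.Language u.Files u.References
```

Because a unit only ever references units with a lower index, the two .cs files in the usage example never see each other bidirectionally, which mirrors the ordering rule F# files already follow.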

image

En3Tho commented 1 year ago

One way might be to embed a piece of C# code in F#. Many simple but useful things like LibraryImportGenerator or RegexGenerator only use a single partial method and an attribute to flag source generation. I guess they could be a goal for a start?

F# code ...
```csharp // like an md for example
public partial class FastRegex
{
    [GeneratedRegex("WowF#")]
    public static partial Regex MyCoolRegex();
}
``` //

if FastRegex.MyCoolRegex().IsMatch(...) then

Pros:

  1. It sorta has a natural fit to F#, in the sense that code above won't know about MyCoolRegex and code below will (at least this is the idea).
  2. You don't need to make a dedicated file for this.

Cons:

  1. Looks out of place.
  2. All the ceremony with files is still there: need to think about how to extract this code bit to a dedicated file, pass it to Roslyn, import it back, place breakpoints etc., and also how to restrict accessibility.
  3. Need strict rules about where such code can be placed (I guess inside a namespace only, or inside a module, but namespace feels easier to do).

Another option is trying to revive F# -> Roslyn interop. But as @vzarytovskii stated, F# would need to support "partial" at least.

With the new "file" modifier I believe some of the complexity is gone, because generators do not need to scan assemblies for similar type names, resolve conflicts etc. This might be easier to do now.

Pros:

  1. Do not need .cs files at all (at this stage at least), feels very natural to F#

Cons:

  1. Need to create both export and import to Roslyn / from Roslyn: export F# AST => C# AST for SG, wait for SG, import C# AST => F# AST (virtually or in other ways)

type Regex with
    [<GeneratedRegex("WowF#")>]
    static member MyCoolRegex() = partial // hypothetical keyword?

if Regex.MyCoolRegex().IsMatch(...) ...

The main idea behind these ideas is trying to make generated stuff visible to the code right below it, so as not to introduce a "hard" split in code and logic. I believe this might be one of the hardest things?

vzarytovskii commented 1 year ago

One of the options is trying to revive F# -> Roslyn interop.

This would be an extremely fragile solution and would require constant changes to adapt to all Roslyn changes.

dsyme commented 1 year ago

This is a big topic, and I like your framing @T-Gro. Above all it's very important to approach anything in this space from the perspective of "how are we going to implement this", including in the IDE. Anything here requires very deep changes to how compilation and analysis proceed and needs very close attention to detail.

On the whole I'm going to stay out of this directly - it's important, but not my battle :) I'll jot a few notes which might be useful.

One advantage of using an extensibility point is that the code generator and Roslyn compilation would be held "at arm's length", hosted in the TP. Further, you could version that component separately. In principle you could alternatively design a different extensibility point that achieved a similar thing. I've got a feeling it would look a lot like generative TPs.

Anyway, on the whole I'd recommend having a good think about factoring things this way. That is, via an extension point, rather than direct integration. Maybe F#-for-.NET would then come with a RoslynSourceGenerator TP thing with all the build logic automagically hooked up. Maybe not. But decoupling may be very valuable here.

If you did go down the route of extending the existing TP mechanism, other good things could potentially drop out, e.g.

Some general comments - I personally think F#'s future existence is firmly rooted in being both a JavaScript language and a .NET language - and we should assess everything we do from this perspective. We must also focus on F#'s own existence as its own set of libraries and ecosystem, rather than always being downstream from .NET change and churn - most of which is now frankly treating .NET as a single-language ecosystem.

To put that in perspective, in the past 90% of our efforts have been to interoperate with .NET assets. While that's been great for properly-designed truly cross-language core libraries, it's often not turned out to be very fruitful for anything that involves complex compilation (e.g. IDE tooling using code generation, likewise database and service generators). We can burn a lot of time and energy to interoperate with these libraries, and doing so can suck us into very deep dependencies on C# both technically and culturally. So I recommend looking for an approach to this that is fundamentally F#-first, where what you want drops out as an instance of a more generic capability.

En3Tho commented 1 year ago

One of the problems I recently hit when trying to make a wrapper around Blazor (as thin as it could possibly be) is that it's currently impossible to inline AST or make a type partial. There is Myriad, and it's a good tool I guess, but it suffers from this limitation too. I can actually imagine partial modules. It should be the option with the least amount of limitations and obstacles. Not sure about partial types though. @dsyme can you please share whether you have ever given thought to partial modules/types?

To illustrate the situation: Consider we have a component like this:

type HelloWorldFSharp() =
    inherit ComponentBase()

    [<Parameter; EditorRequired>]
    member val Name = "" with get, set

    [<Parameter>]
    member val Name2 = "F#" with get, set

    override this.BuildRenderTree(builder) =
        builder.Render(blazor {
            h1 {
                $"Hello, {this.Name} from {this.Name2}!"
            }
        })

The obstacle is that Name and Name2 are set via RenderTreeBuilder, meaning they cannot simply be assigned directly as Name = ... and Name2 = ... So I've decided that codegen is the best thing I can do here:

[<AutoOpen>]
module HelloWorldFSharp__Import =
    open FSharpComponents
    open System
    open System.Runtime.CompilerServices // for IsReadOnly

    type [<Struct; IsReadOnly>] HelloWorldFSharp__Import(builder: BlazorBuilderCore) =

        member this.Name2 with set(value: String) =
            builder.AddAttribute("Name2", value)

        interface IComponentImport with
            member _.Builder = builder

    type HelloWorldFSharp with
        static member inline Render(builder: BlazorBuilderCore, name: String) =
            builder.OpenComponent<HelloWorldFSharp>()
            builder.AddAttribute("Name", name)
            HelloWorldFSharp__Import(builder)

And then import:

type Importer() =
    inherit ComponentBase()
    override this.BuildRenderTree(builder) =
        builder.Render(blazor {
            fun b -> HelloWorldFSharp.Render(b, "C#", Name2 = "VB")
        })

The problem is that this generated import and the Render extension should live right below the HelloWorldFSharp type. Currently that is impossible unless you write this thingy by hand.