fsharp / fslang-suggestions

The place to make suggestions, discuss and vote on F# language and core library features

Support Source Generators #864

Open praeclarum opened 4 years ago

praeclarum commented 4 years ago

Support Source Generators

Add support similar to C# Source Generators

The idea is to execute the compiler in two passes:

  1. Pass 1: Parse and type check the project code (the type check may be optional as it will contain errors)
  2. Send that information to Source Generators that output new code files or syntax trees
  3. Pass 2: Combine all code, type check, emit
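
As a rough illustration of the two-pass flow described above, here is a toy F# sketch. It is not how the F# compiler works: every name in it (`SourceFile`, `GeneratorContext`, `SourceGenerator`, `pass1`, `pass2`, `compile`, `listFiles`) is hypothetical, and parsing and type checking are reduced to placeholders.

```fsharp
// Toy model only: none of these types exist in the F# compiler.
type SourceFile = { Path: string; Text: string }

// What pass 1 hands to the generators (here, just the list of files).
type GeneratorContext = { ParsedFiles: SourceFile list }

// A generator maps the pass-1 view of the project to extra source files.
type SourceGenerator = GeneratorContext -> SourceFile list

// Pass 1: a real compiler would parse (and maybe type check) here.
let pass1 (files: SourceFile list) : GeneratorContext =
    { ParsedFiles = files }

// Pass 2: combine everything; type checking and emit would follow.
// Placing generated files first is an arbitrary choice for this sketch.
let pass2 (original: SourceFile list) (generated: SourceFile list) : SourceFile list =
    generated @ original

let compile (generators: SourceGenerator list) (files: SourceFile list) : SourceFile list =
    let ctx = pass1 files
    let generated = generators |> List.collect (fun g -> g ctx)
    pass2 files generated

// Example generator: emits one file listing the paths it was given.
let listFiles : SourceGenerator =
    fun ctx ->
        let names = ctx.ParsedFiles |> List.map (fun f -> f.Path) |> String.concat ", "
        [ { Path = "Generated.fs"; Text = sprintf "// saw: %s" names } ]

let result = compile [ listFiles ] [ { Path = "Program.fs"; Text = "printfn \"hi\"" } ]
// result contains Generated.fs followed by Program.fs
```

Where generated files should sit in F#'s ordered file list is one of the design questions a real implementation would have to answer; the sketch does not try to.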

The existing ways of approaching this problem in F# are:

  1. TypeProviders which take specialized knowledge to author.
  2. Custom build steps to emit code

Pros and Cons

The advantage of making this adjustment to F# is an easy form of metaprogramming. It's basically all the benefits of type providers without the complexity.

The disadvantages are the repetition of a feature and the compiler performance penalty of executing the type checker twice when this feature is used.

Extra information

Estimated cost (XS, S, M, L, XL, XXL): M (depending on what data is passed to the generators)


7sharp9 commented 3 years ago

@voronoipotato If you dig around you could find the intrinsic branch and resurrect it, maybe it could be adapted to allow intrinsics within an assembly like I mentioned, that would be sweet.

https://github.com/dotnet/fsharp/pull/882/files

dsyme commented 3 years ago

Now that the first version of C# Source Generators has shipped, some of the .NET ecosystem can start adopting them, which is great.

I would find it helpful if people started to link examples where

  1. they encounter C# source generators in practice in nuget packages they wish to use
  2. where there's really no good easy replacement for the technique
  3. where they think the use of source generators will "stand the test of time" and is not just use-the-latest-C#-feature cargo cult

baronfel commented 3 years ago

The two concrete use cases that immediately jump out to me are:

Ciantic commented 3 years ago

Given that most of the Rust ecosystem is knitted together by macros (which are of similar value), calling it a cargo cult is a bit insulting. (Also clever, because that's how I use most of my source generation: cargo run!)

7sharp9 commented 3 years ago

Source generators are useful for filling in holes around boilerplate code that you can encounter when you don't want to wait for a batteries-included FSharp.Core. They are really good for DSLs and general helpers. I developed Myriad to help fill in these areas: it's much easier to generate an AST fragment for boilerplate than to go through an RFC and wait for a new version of F#. I'm not saying that isn't important, but fast turnaround in a project is very important. By fleshing out Myriad further I'm hoping it will be useful in this scenario.

On Tue, 12 Jan 2021 at 20:52, Onur Gumus notifications@github.com wrote:

@dsyme https://github.com/dsyme, I believe the freedom of speech and ideas are questionable in this thread in particular when one uses the word "macro", So I wouldn't keep my hopes high to get useful feedback.


mrange commented 3 years ago

Lenses, for example, have quite a large amount of boilerplate code.

I realize this is an unpopular opinion but if F# had T4 support + some kind of partial classes/methods I'd be happy as a clam.

I realize I have type providers in F#, but they don't support the workflow I like to have: code generation in the same project I am working in, short iteration loops from an idea to actual generated code, and having the generated code as "normal code" rather than expression trees.

PS. Listening in on quite a few twitch streams when they talk source generators they never mention T4. I think T4 is considered a bad word, like monads, and mustn't be mentioned to not scare away folks :-)
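
To make the lens boilerplate mentioned above concrete, here is a minimal hand-written sketch. The `Lens` type, the `compose` helper, and the `Address`/`Person` records are assumptions made up for this illustration and are not taken from any particular library; the point is only that each record field needs its own near-identical definition, which is the kind of code a generator could emit.

```fsharp
// A minimal lens: a getter paired with a non-mutating setter.
type Lens<'Whole, 'Part> =
    { Get: 'Whole -> 'Part
      Set: 'Part -> 'Whole -> 'Whole }

type Address = { Street: string; City: string }
type Person = { Name: string; Address: Address }

// One hand-written lens per field: this is the boilerplate.
module AddressLenses =
    let street : Lens<Address, string> =
        { Get = (fun a -> a.Street); Set = (fun v a -> { a with Street = v }) }
    let city : Lens<Address, string> =
        { Get = (fun a -> a.City); Set = (fun v a -> { a with City = v }) }

module PersonLenses =
    let name : Lens<Person, string> =
        { Get = (fun p -> p.Name); Set = (fun v p -> { p with Name = v }) }
    let address : Lens<Person, Address> =
        { Get = (fun p -> p.Address); Set = (fun v p -> { p with Address = v }) }

// Compose two lenses to reach a nested field.
let compose (outer: Lens<'a, 'b>) (inner: Lens<'b, 'c>) : Lens<'a, 'c> =
    { Get = outer.Get >> inner.Get
      Set = fun v whole -> outer.Set (inner.Set v (outer.Get whole)) whole }

let personCity = compose PersonLenses.address AddressLenses.city

let alice = { Name = "Alice"; Address = { Street = "1 Main St"; City = "Oslo" } }
let moved = personCity.Set "Bergen" alice
// moved.Address.City is "Bergen"; alice itself is unchanged
```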

7sharp9 commented 3 years ago

I might be doing some streams on F# with Myriad soon. Incidentally, Myriad has record lens generation built in.


baronfel commented 3 years ago

There are issues in prominent projects (Like PInvoke.Net) where the authors are considering moving to a primarily Source Generator-driven workflow, which is concerning because those libraries then become inaccessible to F#.

AraHaan commented 3 years ago

I agree: source generating P/Invoke calls (like MiniDumpWriteDump, for example) is better than manually writing the code and getting it all wrong. That is especially true when you need to maintain a MINIDUMP_TYPE enum manually. It's far better to source generate that enum, because you never know when a newer version of dbghelp will add more values to it and break your mini dumping with error codes, or when the Windows SDK will change the value of some flag while your C# code still carries the old value. Source generating it all avoids those conflicts.

However I do have a nit about source generating MiniDumpWriteDump at the moment and that is because:

davidglassborow commented 3 years ago

A good discussion on source generators from the author of Dapper https://blog.marcgravell.com/2021/05/is-era-of-reflection-heavy-c-libraries.html

pbiggar commented 2 years ago

I've been moving to .NET 6. One of the major use cases of .NET 6 for me is support for AOT compilation of Blazor to WebAssembly. As I understand it, this is currently blocked on F# lacking source generators for System.Text.Json, as reflection cannot be used with AOT compilation of Blazor.

I feel that even if F# does not get source generators, there needs to be some interop story for using .NET APIs which use source generators.

AraHaan commented 2 years ago

I agree. It would suck for F# to require reflection while C# and VB are being advised to move away from it, since source generators improve performance by replacing the need for reflection.

After that, I feel like everything in reflection, other than checking whether a type inherits from another type within a specific assembly and System.Reflection.Assembly, should be deprecated. But even then I think that can be done without reflection too.

dsyme commented 2 years ago

@pbiggar To be honest I've historically been sceptical of attempts to make .NET "AOT static" and especially when this is placed on the critical path for delivery. There are just so many .NET practices that require JIT code generation and/or a modicum of reflection - LINQ, generic virtual methods, a bunch of reflection things, some things in FSharp.Core. I've seen related agendas for .NET static compilation fail to deliver several times - delivering buggy, incomplete versions of things that look like .NET but aren't. In my eyes these shouldn't even be allowed to use the name .NET unless they actually implement the .NET standards (or minimal-expectations) including reflection.

That said, I understand the .NET team are taking another stab at this, I'm not really following the technical details though. But the number of F# scenarios where static compilation to WebAssembly without any runtime codegen or reflection is critical seems small (and the number of C# scenarios where programmers don't end up using some technique or another requiring reflection is also small).

That is, I don't currently see anyone across the F# world signing up to deliver this goal of end-to-end JIT-free, reflection-free execution of real-world F# code using rich libraries - both because it's difficult and rarely necessary. The AOT compiler you're referring to does give some of this - and it's been available on Mono for a while including iOS execution with reflection - but adding the requirement to remove all libraries that use reflection is too onerous I think.

So in this case I just don't think I'd encourage F# users to sign up to that part of the .NET 6 story - or if you do the responsibility is to understand that there are likely to be blocking points like the one you mention along the way, which require either C# code, or using a different reflection-free Json library.

dsyme commented 2 years ago

there needs to be some interop story for using .NET apis which use source generators.

For the foreseeable future that story will be "add a C# project and generate the source there".

charlesroddie commented 2 years ago

@dsyme There are very good reasons for the aot and linker friendliness plans in dotnet, so it's not reasonable to insist on the original spec without considering the benefits and costs of the changes.

To deploy apps that run on end-user systems you want some form of AOT to avoid slow startup and JIT-lags and to avoid deploying code that is trivial to decompile. MonoAOT (for ios/android/wasm) is tolerant of reflection, but there is a lot of work being done on linking which could change this. NativeAOT is coming with much better size/performance characteristics, and this is less tolerant (with an optional reflection-free mode for minimum size).

Reflection is overpowered, giving the ability to break any number of constraints. A lot of typical reasoning about F# code, particularly about types, has the qualification "in the absence of reflection". Therefore removing reflection creates a stronger form of F#, a safer version where constraints expressed by types are respected. Reflection usage outside of compilers is a code smell, and investigation into F# incompatibilities with corert/netnative picked out parts of F# that are badly written.

These two aspects go together: the reason that reflection is performant-AOT-unfriendly and linker-unfriendly is that it limits the ability to reason about code.

Overall the aot/linker moves (of which source generators are a part) are creating a superior form of dotnet, with performance benefits to end users but also enforcement of code constraints. It is a wide-ranging attempt to do dotnet properly and F# code should be the most naturally compatible as it favours type safety and explicitness over "magic".

dsyme commented 2 years ago

so it's not reasonable to insist on the original spec without considering the benefits and costs of the changes.

I do insist on conformance to the established specs for .NET - both in ECMA and certainly since .NET Core. .NET isn't a play thing for its engineers to make what they want it to be on each iteration, it's a set of de-facto and actual standards.

My fundamental belief is that .NET is simply not a statically compiled system (instead it's "majority statically compiled with occasional JIT or slow fallback-code") and anyone who thinks it can be is making a highly expensive category error. The .NET 1.0/2.0 addition of both reflection and generic-with-code-expansion-and-generic-virtual-methods (and the very widespread use of both of these) ensured that it never will be.

Now, maybe "this time it's different" but, to put it another way, the costs are indeed not being properly considered. The costs of breaking established specs are vast, staggering, colossal. Probably about a trillion dollars of IT investment sits on top of .NET in one way or another - much of it on .NET Core. For example, there is zero costing for modifying F# for this, let alone all its libraries, let alone the broader universe of .NET libraries (e.g. the cost of revamping FSharp.Core to be reflection-free, possibly breaking into pieces, annotating it... ) As an aside, while I'm no fan of reflection from a soundness perspective, the theoretical benefits of F# for this kind of work are too theoretical for me to find compelling, given there's no costing here.

To deploy apps that run on end-user systems you want some form of AOT ...

Sure, it's useful, but I still believe it's basically a pipe-dream to ever think you can have the majority of .NET coding be without JIT and reflection. It ain't going to happen. At best you end up with some niche tool for constrained-coding-app-on-device scenarios. However the programming model ends up so compromised (whether attributes, annotations, code-generation-tools, configuration files) beyond hello-world demonstrators that it's likely just some weird version of .NET that relatively few people know. At best, it bifurcates the .NET ecosystem (as we saw with .NET for Silverlight, Windows Phone, .NET Native, whatever ....).

Anyway, I want to emphasise that JIT-free .NET execution hasn't featured in our planning for people employed directly to work on F# at Microsoft.

More positively - from the F# perspective I'd be really grateful if the community lead on this - for example, why not create a tooling RFC under https://github.com/fsharp/fslang-design/tree/main/tooling where you can together document and record the whole matter of F# and jit-free AOT? That would create a single point of reference. A comparable RFC is this one: https://github.com/fsharp/fslang-design/blob/main/tooling/FST-1033-analyzers.md

AraHaan commented 2 years ago

A comparable RFC is this one: https://github.com/fsharp/fslang-design/blob/main/tooling/FST-1033-analyzers.md

🤢 Why not change that design to instead eventually integrate all of the C# code analysis into Roslyn as Microsoft.CodeAnalysis.FSharp, and all the other things that then the F# compiler could itself use and then it could be less expensive to implement source generator support?

Also, I disagree: source generator support should not break anything at all. It only gives a specific set of developers (myself included) another tool in our tool chest. Besides, I do not think people wanted FSharp.Core to stop using reflection; rather, they used it as an example where some of its reflection could be replaced with something more performant. I am an open source developer relying on donations (currently $5/month) to write and ship complete, open source .NET projects (all targeting .NET Standard 2.0) that anyone, even companies, can use so that they do not need to reinvent the wheel, which already saves them time and money (installing a NuGet package takes about 10 seconds nowadays). While I understand your argument that it can cost billions of dollars, look at open source developers like me who rely on donations and do not say in our licenses "To use this open source code you must donate at least $x per month." We don't want to chase our user bases away, because they might be other people like me, making their own libraries that depend on some of my code and that they then ship as packages for others to consume as well.

At this point in time I think the "money" argument is just that: an argument to try to justify not giving someone a tool in their tool chest to optimize for performance. I remember when source generators did not exist at all for Visual Basic and C#, and look at how performance tanked back then: System.Text.Json was over 30x slower than it is now, ASP.NET Core was about 40x slower in some reflection spots, and System.Windows.Forms and WPF had the same issues (all written in C#, with the exception of WPF, which contains some C++ bits). My point is, if that argument had been made when speccing C# and Visual Basic source generators for Roslyn, chances are source generators would never have existed there, those improvements would never have been made, and ASP.NET would be at the same performance level as before source generators became an actual thing. I argue that source generators are being used to generate code that can take the place of reflection in some situations (if not all of them).

Anyway, I want to emphasise that JIT-free .NET execution hasn't featured in our planning for people employed directly to work on F# at Microsoft.

While that may be true, not all .NET developers (even in the @dotnet organization, and maybe also in the @fsharp organization here on GitHub) work for Microsoft. Some of them think that JIT-free .NET execution is a must for some targets. For example, what if you want to use F# to make an iOS application for an older iPhone that explicitly bans any form of JIT? iOS 15 might allow it somewhat, unless you apply to get the app into the Apple App Store, where it's still banned. Some companies face what we non-company-bound programmers face every day when we actually want to ship to the App Store: you can't use reflection anywhere at runtime and no JIT compiler may be invoked if the app is to be accepted. Otherwise your application is denied, with a nonrefundable loss of $100 each time you try to push that application (written in F#) up to the App Store for Apple to look at, when the plan was to use it to fund the whole project. Some open source developers actually do that with C# projects that made it into the App Store thanks to these few things:

After that is all said and done, they eventually push it to the App Store, and Apple is then happy and approves it. The same is done at the company level. (You think none of the iPhone apps in the App Store are made in C#? I know that Git2Go, which was on my iPhone 5s, used libgit2sharp, so that tells you something: they used C#. Other Git-based phone applications are also made in C#; for example, I think the GitHub app for iPhones is made in C# and uses libgit2sharp as well. And ironically @github is owned by @Microsoft.)

I think that applications for iPhones, for example, should not be limited to just C# or Visual Basic .NET but should also include F#, and for real-world applications source generators are a must. Hello-world and foo-bar applications are valid applications too, but they do not benefit from source generation, because they normally do not use any reflection anyway, so nothing is gained whether or not you use source generators there. In fact, that might be why you guys think it's not worth it, but that's not entirely true. Besides, the runtime team for .NET (which works at @microsoft) also thinks that removing reflection from any place in the runtime is a critical performance increase that should be made, because then all users of .NET would be happy with a more performant .NET. Heck, that was the main argument for .NET Core (back before .NET Core was rebranded to .NET) over .NET Framework: .NET Core optimized the .NET Framework APIs further, improving on the performance of .NET Framework 4.x, and gave both non-company-based and company-based programmers a valid reason to jump to .NET Core instead of .NET Framework (where performance is critical because you process a lot of data and need to reduce CPU load without reducing the rate of processing data, so that everyone is happy).

This is why everyone uses things like System.Memory for Span<T>, Memory<T> and the like: less wasted memory, faster memory allocations, and so on. When you combine that with source generators in place of reflection, you can very easily see a 90% improvement. I have sometimes consumed a closed source library written back in the days before System.Memory and source generators were a thing, used Mono.Cecil to decode it to IL, patched it to use System.Memory, and replaced all usages of reflection that called private or internal members that can now be accessed through public versions of those members, and I have actually seen about 90% improvements in execution times.

I would also like to take some time to mention Rick Brewster (creator of Paint.NET), who might agree with me on all of this @rickbrew.

charlesroddie commented 2 years ago

Discussion has moved beyond source generators so I will just make a brief reply to @dsyme with links.

why not create a tooling RFC under https://github.com/fsharp/fslang-design/tree/main/tooling where you can together document and record the whole matter of F# and jit-free AOT

Last writeup from 3 years ago is https://github.com/dotnet/corert/issues/6055#issue-338539392 . NativeAOT will be more compatible now (@kant2002 has tried it recently) and at some point I will do an up to date summary. The largest issue is string functions documented here: https://github.com/fsharp/fslang-suggestions/issues/919 . That will unblock running the test suites (https://github.com/dotnet/fsharp/pull/5340).

F# mostly supports AOT by virtue of the compiler being mostly sensible. I think this can largely be done by the community with some helpfulness from the F# team: willingness to take small behavioural changes, treat the ability to run performant F# apps on user devices as having some importance, not repeat the F#/UWP debacle.

My fundamental belief is that .NET is simply not a statically compiled system instead it's "majority statically compiled with occasional JIT... It's a pipe-dream ... majority of .NET coding be without JIT and reflection... the programming model ends up so compromised

F# apps are already running without JIT and with minimal reflection across iOS, Android, Windows and Mac devices, via MonoAOT and netnative. Very soon they will be running on web browsers via wasm. There are really almost no constraints for writing AOT-friendly apps in F#. The nugets we want to use all work, and one just needs to take care about string functions in practice. You will get much more potential for nuget incompatibilities resulting from source generators.

charlesroddie commented 1 year ago

I would find it helpful if people started to link examples where they encounter C# source generators in practice in nuget packages they wish to use where there's really no good easy replacement for the technique where they think the use of source generators will "stand the test of time"

Regex source generators

Usage example: https://www.meziantou.net/regex-source-generator.htm

// The Source Generator generates the code of the method at compile time
[RegexGenerator("^[a-z]+$", RegexOptions.CultureInvariant, matchTimeoutMilliseconds: 1000)]
private static partial Regex LowercaseLettersRegex();

public static bool IsLowercase(string value)
{
    return LowercaseLettersRegex().IsMatch(value);
}

kasperk81 commented 1 year ago

[EventSource]
[GeneratedRegex] // renamed from [RegexGenerator] in 7.0 rc1
[JSExport]
[JSImport]
[JsonSerializable]
[LibraryImport]
[LoggerMessage]

are provided by runtime libraries, and they avoid aot-incompatible reflection apis.

roboz0r commented 1 year ago

I looked at some of the samples for C# source generators and they seem to be a C# and not a .NET feature as in the sample implementations they are literally doing C# source string concatenation. With that in mind I see it as a fool's errand to attempt to expose source generators without ever having to look at C# code.

The two general cases I see are code augmentation of partial classes (e.g. automatic json serializers) and total generation of values that implement a known type signature (classes / interfaces / functions).

Total Generation from Type Signature

For this case I think we can take some inspiration from the way Fable does native (JavaScript) interop:

// The member name is taken from decorated value, here `myFunction`
[<ImportMember("my-module")>]
let myFunction(x: int): int = jsNative

[<Import("DataManager", from="library/data")>]
type DataManager<'Model> (conf: Config) =
    member _.delete(data: 'Model): Promise<'Model> = jsNative
    member _.insert(data: 'Model): Promise<'Model> = jsNative
    member _.update(data: 'Model): Promise<'Model> = jsNative

jsNative is a special value with no implementation but instead signals to the compiler to replace the implementation with generated native code. In the .NET case, the "native" code would be IL and F# already supports inserting IL as an implementation. Normally it is used extremely sparingly but I see no reason the use couldn't be expanded to inject IL generated from source generators.

I would propose that csNative becomes a similar special value that allows the F# compiler to insert an implementation of IL based on the output of a C# source generator. Hopefully it wouldn't be too difficult to forward the attributes and type signatures to C# (Roslyn?) and subsequently dissect an assembly to insert the IL.

For the LibraryImport case in C#:

[LibraryImport(
    "nativelib",
    EntryPoint = "to_lower",
    StringMarshalling = StringMarshalling.Utf16)]
internal static partial string ToLower(string str);

would become:

[<LibraryImport(
    "nativelib",
    EntryPoint = "to_lower",
    StringMarshalling = StringMarshalling.Utf16)>]
let internal ToLower(str: string):string = csNative

The RegexGenerator case would be similar except that the generated IL is a class that inherits Regex instead of a function:

// The Source Generator generates the code of the method at compile time
[RegexGenerator("^[a-z]+$", RegexOptions.CultureInvariant, matchTimeoutMilliseconds: 1000)]
private static partial Regex LowercaseLettersRegex();

becomes

[<RegexGenerator("^[a-z]+$", RegexOptions.CultureInvariant, matchTimeoutMilliseconds: 1000)>]
let private LowercaseLettersRegex(): Regex = csNative

Code Augmentation of Partial Classes

In this case as @dsyme suggests here we have a ready mechanism with type providers. e.g. for STJ source generation

type internal SourceGenerationContext = CSharpProvider<"""
[JsonSourceGenerationOptions(WriteIndented = true)]
[JsonSerializable(typeof(WeatherForecast))]
internal partial class SourceGenerationContext : JsonSerializerContext
{
}
""">

The challenge here comes where WeatherForecast is defined in the same project as SourceGenerationContext. In this case you would need to implement switching between F# and C# compilation with shared assembly information but for the first-pass implementation I don't think that's necessary.

A much simpler case would be where WeatherForecast is defined in an external assembly so all the type information is already available or where WeatherForecast is also defined in the C# as inner classes:

type WeatherForecastOuter = CSharpProvider<"""
public class WeatherForecastOuter
{
    public class WeatherForecast
    {
        public DateTime Date { get; set; }
        public int TemperatureCelsius { get; set; }
        public string? Summary { get; set; }
    }

    [JsonSourceGenerationOptions(WriteIndented = true)]
    [JsonSerializable(typeof(WeatherForecast))]
    internal partial class SourceGenerationContext : JsonSerializerContext
    {
    }
}
""">

We could even embed C# syntax highlighting and auto complete as is done for html templates with Fable.Lit but that's certainly not required and is probably mostly an editor-level feature rather than a compiler feature.

mrange commented 1 year ago

@roboz0r - I admit I don't fully understand your proposal so I am assuming you already considered it but here comes my question:

With C# source generators the C# classes are generated as partial classes which allow the user to inject behavior using partial methods. So if a source generator generates C# and it's injected into F# as IL how can I use partial methods to inject behavior in the generated code?

roboz0r commented 1 year ago

@mrange The idea is that anything to do with partial methods or classes would have to be included within the CSharpProvider<C# string>. Until F# supports partial in general I don't think it can be any other way and trying to design partial into F# at the same time as source generators seems like a mistake.

I think the order to create something like this would be:

  1. Consuming source generators that involve entirely C# code with no dependencies embedded in an F# project

In this first case the embedded C# is entirely unaware that it lives within an F# project, and F# has no input to what happens in C#-land besides expecting that some IL will pop out eventually. The "Total Generation from Type Signature" case is a bit of a departure from this ideal but considering that type signatures and attributes are entirely static they should be far easier to forward / transpile into C#.

  2. Source generators that take dependencies / type information / partial implementation from separate projects / dlls
  3. Source generators that take dependencies / type information from the current project
  4. Add partial to F# and allow partial implementations to cross the boundary from the current project

2, 3, and 4 could be a long way away and potentially in a different order but if 1 can be completed that gives us a way forward to consume the most obvious use cases for source generators so far without major changes to F#.

jkone27 commented 1 year ago

is this coming soon? Since the big support now for source generators for many things in C#, like e.g. json serialization for example, https://github.com/amis92/csharp-source-generators should be good to have this compatibility in F# too?

The two concrete use cases that immediately jump out to me are:

vzarytovskii commented 1 year ago

is this coming soon? Since the big support now for source generators for many things in C#, like e.g. json serialization for example, https://github.com/amis92/csharp-source-generators should be good to have this compatibility in F# too?

The two concrete use cases that immediately jump out to me are:

No, it hasn't been planned yet, and it is definitely not going to make it into 8.

Don't get me wrong, we want to have it, but it's a huge feature.

Designing it alone will probably take months, since all source generators rely on Roslyn (and C#-only features), and will likely involve us using Roslyn one way or another.

We will also have to sort the VS story out: they're slow enough natively in Roslyn, and we will need to account for that too.

We're not even sure how to approach it, since there are multiple existing flavors as well as new versions which are being designed now.

Latest thoughts are here: https://github.com/dotnet/fsharp/issues/14300