dotnet / csharplang

The official repo for the design of the C# programming language

Champion "Replace/original and code generation extensions" #107

Open gafter opened 7 years ago

gafter commented 7 years ago

See also

gulshan commented 6 years ago

@jahmai A sample AOP template for injecting start and end logs into methods might look like this:

public template class LogTemplate<T> : T
{
    ${
        using System.Linq;

        var type = ``T``.GetNamedTypeSymbol();
        var publicMethods = type.GetMembers().Where(m => m.Kind == SymbolKind.Method && m.DeclaredAccessibility == Accessibility.Public);

        foreach (var method in publicMethods)
        {
    ``
    override $method.FullSignature
    {
        Logger.Log($method.Name + " start");
        base.$method.Name();
        Logger.Log($method.Name + " end");
    }

    ``
        }
    }
}

It would then be used like this:

var myLoggedObject = new LogTemplate<MyClass>();

@LokiMidgard I agree with you that templating/code generation that is not integrated into the language will not be able to do this conveniently. Even with my proposal, the same class cannot be used directly; syntactically, a template class has to act like a generic class under my proposal.

jahmai-ca commented 6 years ago

@gulshan That is not the use case I am interested in. With that approach, you have to heavily modify the original code to inject the template code. See PostSharp and Fody for a preferred way to achieve AOP code injection.
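For contrast, a PostSharp-style aspect leaves the original class untouched. A minimal sketch, assuming PostSharp's `OnMethodBoundaryAspect` API (attribute and argument names are from that library; exact details may vary by version):

```csharp
using System;
using PostSharp.Aspects;

// An aspect that injects start/end logging around any method it is
// applied to, without modifying the method body in source.
[Serializable]
public class LogAspect : OnMethodBoundaryAspect
{
    public override void OnEntry(MethodExecutionArgs args)
    {
        Console.WriteLine(args.Method.Name + " start");
    }

    public override void OnExit(MethodExecutionArgs args)
    {
        Console.WriteLine(args.Method.Name + " end");
    }
}

public class MyClass
{
    [LogAspect]          // the weaver injects the calls at build time
    public void DoWork() { /* ... */ }
}
```

The weaving happens post-compile on the IL, which is precisely the kind of injection the template proposal above cannot express without restructuring the target class.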

I think the big challenge with this issue is it has a number of very interesting use-cases rolled into one feature, which if only partially implemented to satisfy one audience would alienate many.

masaeedu commented 6 years ago

-1 IMO for language-level code generation features. No matter what stilted templating metalanguage you embed into c#, it will be less flexible than being able to operate on the entire assembly using arbitrary code in an arbitrary language. For that purpose we need better support in the build pipeline and compiler apis for injecting and rewriting code.
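For reference, the kind of arbitrary whole-tree rewriting described here is already expressible against today's Roslyn APIs (the `Microsoft.CodeAnalysis.CSharp` package); what is missing is the build-pipeline hook. A minimal sketch with a hypothetical suffix-renaming transform:

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Appends a suffix to every method name in a syntax tree. The transform
// itself is arbitrary C#; wiring it into the compilation is the gap.
class SuffixRewriter : CSharpSyntaxRewriter
{
    public override SyntaxNode VisitMethodDeclaration(MethodDeclarationSyntax node)
        => node.WithIdentifier(
               SyntaxFactory.Identifier(node.Identifier.Text + "_Logged"));
}

class Program
{
    static void Main()
    {
        var tree = CSharpSyntaxTree.ParseText("class C { void M() { } }");
        var newRoot = new SuffixRewriter().Visit(tree.GetRoot());
        // newRoot now contains a method named M_Logged
        System.Console.WriteLine(newRoot.ToFullString());
    }
}
```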


Pzixel commented 6 years ago

@masaeedu see https://github.com/AArnott/CodeGeneration.Roslyn. I have implemented several ideas using this package. It allows you to do what you are describing: inject into the compiler pipeline and do stuff. Unfortunately, you can generate new classes, but you can't modify existing ones.

masaeedu commented 6 years ago

Thanks. I am actually currently using that package myself for things like xml type providers (we discussed it earlier in the thread). The lack of ability to modify existing types is the gap I'm hoping will be filled by improvements in the compiler API and build pipeline.


npolyak commented 6 years ago

I do not understand why this is a problem to implement. Give people the power to modify the language and then cherry-pick the best ideas. You do not have to ensure that the IDE and Roslyn are backward compatible with every idea implemented.

CyrusNajmabadi commented 6 years ago

Give people the power to modify the language and then cherry-pick the best ideas. You do not have to ensure that the IDE and Roslyn are backward compatible with every idea implemented.

This is already possible today. Fork roslyn. You now have the power to modify the language. You can ask for that modification to be accepted back through a PR, and Roslyn can decide if they want to accept those PRs.

jahmai-ca commented 6 years ago

Oh, a comment to remind me I'm subscribed to this issue :)

I came here from #3356 which was raised in June 2015 and closed/forwarded to this issue, which was 2.5 years ago. After 8 months of monitoring this thread without any concrete decisions being made I think I'll excuse myself and accept that PostSharp is the only real AOP solution for C# for the foreseeable future. Thank you all for the discussion.

npolyak commented 6 years ago

Question for Cyrus: I understand that I can create custom analyzers to accept the new features, and perhaps I can even tweak the intellisense, but how do I modify the compiler that VS uses in order to preserve the IDE experience? How do I make my changes available to others, say via a VSIX extension?

mattwar commented 6 years ago

All of Roslyn (C#, VB) is delivered as an extension to VS. It's all there in the repo. You can get daily builds from a MyGet feed, install them on your machine, and you get the incremental compilers and the new IDE experience (for the most part). Sometimes we even publish builds of other branches with more experimental work in progress that you can install too. So it's entirely possible to fork Roslyn, make changes to the compiler and the IDE code, build it all, and use the VSIX that is produced to install your version of C# into VS.

npolyak commented 6 years ago

Good to know, thanks. But I still think it would be less trouble if you provided hooks to allow modification of the syntax tree at the beginning of the compilation process, so that those hooks could also allow changing the intellisense and CodeFixes.

CyrusNajmabadi commented 6 years ago

so that those hooks would also allow changing the intellisense and CodeFixes.

Can you explain how that would work?

npolyak commented 6 years ago

Let me walk it back a bit. I would like a plugin model that would allow modifying the compiler and the intellisense separately. In that case, instead of dealing with the whole of Roslyn, we could deal with a small part of the code. That should be easier to write and easier to distribute.

ufcpp commented 6 years ago

Like these? https://github.com/dotnet/roslyn/pull/23984 https://github.com/dotnet/roslyn/pull/24110

CyrusNajmabadi commented 6 years ago

Hey! Those look familiar! :)

On a serious note, this will give you an idea of how complex it can be to try to do this work. And my work doesn't even try to modify the compiler. On top of that, my work doesn't even have intellisense (i.e. completion) support. It just has coloring, error squiggles, brace matching, and highlighting working. And that alone was 90k lines, just for these small languages :)

CyrusNajmabadi commented 6 years ago

Let me walk it back a bit. I would like a plugin model that would allow modifying the compiler and the intellisense separately.

If you could provide a prototype that showed how this would work, it could be something that could be considered for sure! But it's still unclear to me how this would work. As you can see from the Regex/Json PRs, it's non-trivial to try to integrate into all these subsystems.

gulshan commented 6 years ago

Latest C++ Metaclassess proposal- http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0707r3.pdf

Some excerpts: A metaclass is defined as a constexpr function that transforms a read-only source meta::type to one or more generated target meta::types, and can express constraints, defaults, and more. For example:

namespace std::experimental {
    constexpr void interface(meta::type target, const meta::type source) {
        // we will describe how to write code to:
        //  - apply “public” and “virtual” to member functions by default
        //  - require all member functions be public and virtual
        //  - require no data members, copy functions, or move functions
        //  - generate a pure virtual destructor (if not user-supplied)
    };
}

... These are applied by the default metaclass program that runs the following at the end of the class definition after all other compile-time metaclass code (using __ because this is in the language implementation):

constexpr void __metaclass_finalization(meta::type t) {
    for (auto o : t.variables())
        if (!o.has_access()) o.make_private(); // make data members private by default

    bool __has_declared_dtor = false;
    for (auto f : t.functions()) {
        if (!f.has_access()) f.make_public(); // make functions public by default
        __has_declared_dtor |= f.is_destructor(); // and find the destructor
    }

    if (!__has_declared_dtor) // if no dtor was declared, then
        -> { public: ~this_class() { } } // make it public nonvirtual by default
}

notanaverageman commented 6 years ago

As Visual Studio 2019 is on the horizon, is there a chance that this will be included? Personally, if there would be one big feature in 2019, I wish it would be this one.

xoofx commented 6 years ago

Everything about the experience in a product like VS. That includes, but is not limited to:

@CyrusNajmabadi related to https://github.com/dotnet/roslyn/issues/19505 I would like to genuinely understand why the tooling experience is blocking. So I have a few questions:

  1. IntelliSense. Clearly people will want to generate items that then show up in IntelliSense. How does that happen efficiently?

First, for many of the cases we described in the issue above, where the generated code is private, this is not really an issue (serializers, ORMs...), as we don't need this feature.

But while I know a bit about what it takes to have Intellisense in VS in general, I don't know/understand how it is currently integrated with Roslyn in a way that would make this so hard to add. Let's say we modify the syntax tree by adding a few public fields to a class; I thought Intellisense mainly uses the Roslyn syntax tree to provide its results, no? Could you explain precisely what the problem is for Intellisense to "happen efficiently" here?

  2. Navigation. People will want to navigate to generated code. How do we do that, and what's the right experience overall?

Similar to the previous point. Many cases for generated code don't allow/require this (code is private, you don't want people to navigate to the code).

But for the cases where we would like to have navigation, we have the following potential usages for code generators:

1) A code generator that can generate plain new stuff on the side
2) A code generator that can modify existing code
3) A code generator doing both

For 1), we could create a syntax tree and dump the generated files, read-only, into an obj/generated/ folder. These files would not be used by Roslyn; they are just a dump of what was added to the syntax tree.

For 2), we could modify the existing syntax tree. Let's say, for example, that we want to add a method to an existing type. We would probably do it in a way similar to 1). The difference is that we would have to use a partial class (or a cascade of nested partial classes if the class is deeply nested). Again, the generated code in the obj folder would be read-only, not read by Roslyn, but available for navigation/debugging.

So for both, what exactly is problematic? Is it that when we dump a syntax tree, the emitted code (the syntax tokens, if we had to read them back) is not linked back to the generated syntax tree? (I have never looked at that in Roslyn, so I don't know.) Could you explain why this is not possible, or difficult, or why the experience would not work at all?

  3. Debugging. How do you debug through the code that is generated.

Similar to 2)

Though one case where you don't care is when you modify the content of a method. A typical scenario is AOP, for example design by contract, where you have a bunch of asserts/verifications at the beginning/end of a method. The generated code would not be dumped to disk and would not be navigable. We would not step into it or be able to put a breakpoint there.

  4. Project-System. How do you present generated code in the project system?

A special virtual folder (like the one that we have for "References" or the "External Dependencies" you get for a C++ project) that would list the generated syntax tree files (that would likely be in obj/generated folder). We may want at some point to have "generated" sub folders for the sake of making it easier to read in case of many generated files... but in many cases, you could decide also to have one file per generator...

Anyway, the question is: why is integrating into the Project-System problematic here? Is it a syncing issue? Its dynamic behavior?

  5. Refactoring. How do refactorings work properly in a context where some of the code they're analyzing was generated by previous code, but the changes they make may cause more transformations to happen?

The code generators simply run again. They don't care about what was generated previously. The refactoring could modify the generated syntax tree, but that is not important because it would be overwritten anyway (the files are effectively read-only, so...).

CyrusNajmabadi commented 6 years ago

There are a ton of questions here, too many and too involved to answer all at once. However, I will dive deeper to give you an idea of the issues involved. Let's start with:

Let's say we modify the syntax tree, by adding a few public fields to a class, I thought that Intellisense is mainly using the Syntax tree of Roslyn to provide it, no?

First, IntelliSense (and most of the other services you get inside VS) is not "mainly using the syntax tree". Indeed, by and large, IntelliSense itself only uses the SyntaxTree to get an idea of where 'completion' is being requested inside the code, i.e. is it after a dot, etc. Once that has been determined, the 'heavy lifting' of IntelliSense takes over: determining what is going on in your code semantically. And that involves having an understanding of every syntax tree in your project, all your referenced assemblies, and all the projects your current project references.

This 'semantic analysis' is not at all simple, and depends on a ton of analysis going on to figure out what's happening and then what should be offered. To give you an idea, just consider something as basic as:

var v = from x in y
        ...
        select z.  // <-- here

How does intellisense have any idea what 'z' means such that it can give appropriate results? It actually involves deep semantic analysis of that code, and potentially an enormous amount of code seen previously. For example, say one of the previous lines was let z = this.M() + w.First(c => c.Age > 21). So, semantic analysis will say "aha! 'z' is a variable declared in a previous query clause". But then consider all the work that has to come next. Everything on the right side of the = needs to be figured out. That means you need to figure out what this means (including inheritance). You have to figure out M (maybe it's a method... maybe it's a field of delegate type... maybe it's an extension...). Now you have to figure out what 'w' is and all the complexity that may involve. Then you have to find First. And maybe it's a set of methods. So now you need to involve the entire overload resolution infrastructure to figure out what's going on there. Then you need to take all of that, and + it. And now you at least know the type you can use to provide intellisense off of.

Now, one of the prime ways that we speed all this up is that we take advantage of some things we know very well about the language. For example, we know that if you're just editing inside a method, then that means that all the 'inter-method' info (i.e. information outside the method), cannot change. So, for example, in the above cases, we could figure out and reuse all the data we previously computed about things like "this" and "M()" and so on and so forth on subsequent edits.

Now, consider what happens if you can now generate arbitrary source for any reason based on any edit. In the above example, perhaps the source generator may have spit out another partial class for the class you are contained in. That partial class defined a different 'M' method than the one that would have been found absent the new partial part. This then changes the meaning of that entire line, which gives 'z' a completely different type. In order to realize that, we have to actually run the generator to see the difference it makes, so that we can realize "oh, accurate intellisense here means showing this other set of members."
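A concrete (hypothetical) illustration of the scenario just described, where a generated partial part silently rebinds an existing call site:

```csharp
// Before generation: C has no instance method M, so this.M() binds to
// the extension method below, and 'z' has type int.
public static class CExtensions
{
    public static int M(this C c) => 0;
}

public partial class C
{
    public void Use()
    {
        var z = this.M(); // z : int (binds to the extension method)
    }
}

// A source generator then emits another partial part (hypothetical output):
public partial class C
{
    // Instance members win over extension methods in C# lookup, so the
    // existing this.M() call silently rebinds and 'z' becomes string.
    public string M() => "";
}
```

To notice the rebinding, the IDE has no choice but to actually run the generator after the edit, which is exactly the cost being described.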

CyrusNajmabadi commented 6 years ago

Anyway, question is, why integrating into the Project-System is problematic here?

I didn't say it was problematic. I said:

This work will require substantial resources to get through the tooling side in any sort of sensible and efficient manner. It requires coordination between many different teams as well. As such, while still super great and valuable, it's unclear what the right timeline/delivery avenue would be for it.

To give some idea on this. Roslyn itself has been spending many years (with many devs involved) just to move the existing project system over to https://github.com/dotnet/project-system. This work is still ongoing and will likely take many more years to complete.

Nothing about project systems turns out to be easy in practice. And my only point is that this area would need to be designed and addressed, and would likely take a bunch of work. That work would likely be considered necessary for this overall feature, and thus would have to be taken into account for the total costing.

CyrusNajmabadi commented 6 years ago

Refactoring. How do refactorings work properly in a context where some of the code they're analyzing was generated by previous code, but the changes they make may cause more transformations to happen?

The code generators are running again. They don't care about what was generated previously. The refactoring could modify the generated syntax tree, but it is not important because it would be overwritten anyway (the files are somehow readonly so...)

This doesn't actually address the problem. Again, say that previously the user had written some code (say, something that called a non-existent 'Foo()' method). Source generators saw this, and somehow generated both an interface containing this method, and then spat out a partial type for some other class that now implemented this interface.

As far as the language is concerned, there is a relationship between it, the method in the generated interface, and the real class method that now implements that synthesized interface method. So what happens, if you rename that method? Should the class method be renamed as well?

And then, in general, when you do a refactoring, should the analysis be done on a Compilation where no source-generation has occurred? That's problematic because without source-generation the code may not be close to compilable, with tons of potential errors all over the place preventing accurate analysis. However, if you analyze the code post-generation, you're making assumptions about your analysis that may not hold up because you don't know what will happen once you make your actual changes and then the generators run again.

CyrusNajmabadi commented 6 years ago

Similar to 2)

Though one case where you don't care is when you modify the content of a method. Typical scenario is AOP, for example design by contract where you can have a bunch of assert/verification at the beginning/end of the method. The generated code would not be dumped to the disk and would not be navigable to. We would not step on it or be able to put a breakpoint.

That seems like a pretty bad experience :) I don't know if that sort of experience would be at all acceptable. My expectation would be that users would want to be able to properly set breakpoints even on source-generated code.

Indeed, that's why so much of this has been about source generation. Because it gives you an actual artifact that you can consider the 'truth' that can be used by so much of the tooling. There's no chance that when you run/debug your program that you hit random crap that doesn't exist in your actual source, because you always have the generated-source to use and step through itself.

xoofx commented 6 years ago

Now, one of the prime ways that we speed all this up is that we take advantage of some things we know very well about the language. For example, we know that if you're just editing inside a method, then that means that all the 'inter-method' info (i.e. information outside the method), cannot change. So, for example, in the above cases, we could figure out and reuse all the data we previously computed about things like "this" and "M()" and so on and so forth on subsequent edits.

Now, consider what happens if you can now generate arbitrary source for any reason based on any edit. In the above example, perhaps the source generator may have spit out another partial class for the class you are contained in. That partial class defined a different 'M' method than the one that would have been found absent the new partial part. This then changes the meaning of that entire line, which gives 'z' a completely different type. In order to realize that, we have to actually run the generator to see the difference it makes, so that we can realize "oh, accurate intellisense here means showing this other set of members."

That's a good point (though, many codegen scenarios generate code that is private not accessible/not intellisense-able by the users).

If the generator is part of the compiler pipeline, and we want a fast incremental compilation experience, it would have to take this into account with proper fine-grained notification/scoping of what changed. If there is already an infrastructure for dealing with these incremental changes, we probably know exactly where they are, when they trigger, what their scope is, etc. How many different optimized incremental paths like this do we have (edits inside/outside a method, etc.)?

Also, "running a generator" is not necessarily a huge process, nor is it something that should be invoked blindly on every single change. Typically, for your example above, if you are modifying the internals of a method and the code generator doesn't look at that (as most code generators wouldn't), we should not have to run anything. From the scenarios we know, we could define a good first list of safe triggering points and plug the "run generator" step there, before proceeding to the other analysis passes.

Speaking of which, I would like to understand how it works when editing an async/await method: does the compiler perform the translation to the generated state-machine code before performing analysis? (I mean in the case of VS code editing, of course.)

xoofx commented 6 years ago

This work will require substantial resources to get through the tooling side in any sort of sensible and efficient manner. It requires coordination between many different teams as well. As such, while still super great and valuable, it's unclear what the right timeline/delivery avenue would be for it.

btw, on this matter specifically: why don't you form a dedicated team to solve this code generator issue? One person from Roslyn, one from the VS editor integration, one from the build system, working closely together on this particular problem. That could even be a POC team working for one month to produce a fully functional prototype and evaluate what is and isn't working. Each person would still report back to their respective team, and they could sync up (maximizing the chances of solving this problem while minimizing the problems the changes induce for their respective teams).

orthoxerox commented 6 years ago

That's a good point (though, many codegen scenarios generate code that is private not accessible/not intellisense-able by the users).

So, like when you write empty method bodies and the generator replaces them with appropriate code? But if you are editing the same type for which private members were added, you probably would want to see the generated members, unless we all agree they are unreachable, like autoprop fields.

Perhaps it will make sense to split the whole feature into multiple tiers:

xoofx commented 6 years ago

As far as the language is concerned, there is a relationship between it, the method in the generated interface, and the real class method that now implements that synthesized interface method. So what happens, if you rename that method? Should the class method be renamed as well?

About refactoring: you are describing a use case that almost none of the scenarios discussed here for years actually need (excluding AOP, but even AOP does not go that deep; it needs code that is already compiled, so there is no magic happening in the middle of your method).

And then, in general, when you do a refactoring, should the analysis be done on a Compilation where no source-generation has occurred? That's problematic because without source-generation the code may not be close to compilable, with tons of potential errors all over the place preventing accurate analysis

I'm confused. Source code gen is not a pre-compiler step; it is part of the compiler, just as the async/await state-machine codegen is part of it. You run it just after the "hardcoded" compilation and right before the analysis.

xoofx commented 6 years ago

About debugger:

That seems like a pretty bad experience :) I don't know if that sort of experience would be at all acceptable. My expectation would be that users would want to be able to properly set breakpoints even on source-generated code.

Indeed, that's why so much of this has been about source generation. Because it gives you an actual artifact that you can consider the 'truth' that can be used by so much of the tooling. There's no chance that when you run/debug your program you hit random crap that doesn't exist in your actual source, because you always have the generated source to use and step through itself.

Why? How do you think the experience is today? When we generate IL code afterwards, usually when modifying the body of a method, we let the debug points stay the same. This is what has typically been done in SharpDX: the code is post-patched, but we can perfectly step into it and the debugging experience is fine.

AOP is another case. You don't want your code to be "source generated". This is probably where we are confusing features. We are talking about compile-time code generation. If you generate some checks at the beginning of a method, we don't expect any way to put a breakpoint on something that does not exist in the original source code. That's exactly what happens with async/await: the compiler generates a bunch of state-machine code, but the generated code is not entirely debuggable, right? So what exactly is the problem with that, and why is this experience not good? (We already depend on it, either for async/await or when we do IL post-patching.)
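To make the async/await comparison concrete, the compiler lowers an async method into a state machine roughly like the following (heavily simplified sketch; the real lowering hoists locals, handles exceptions, and caches the builder, and `FetchAsync` here is just a placeholder for any `Task<int>`-returning call):

```csharp
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Original source:
//   async Task<int> GetValueAsync() { var x = await FetchAsync(); return x + 1; }
//
// Conceptual compiler output:
struct GetValueStateMachine : IAsyncStateMachine
{
    public int State;                              // -1 = start, 0 = after the await
    public AsyncTaskMethodBuilder<int> Builder;
    private TaskAwaiter<int> _awaiter;

    public void MoveNext()
    {
        if (State == -1)
        {
            _awaiter = FetchAsync().GetAwaiter();
            if (!_awaiter.IsCompleted)
            {
                State = 0;                         // suspend; MoveNext resumes here
                Builder.AwaitUnsafeOnCompleted(ref _awaiter, ref this);
                return;
            }
        }
        var x = _awaiter.GetResult();
        Builder.SetResult(x + 1);
    }

    public void SetStateMachine(IAsyncStateMachine sm) => Builder.SetStateMachine(sm);

    private static Task<int> FetchAsync() => Task.FromResult(41);
}
```

None of this generated scaffolding appears in the user's source, yet stepping through the original `await` line still works, which is the precedent being appealed to.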

amis92 commented 6 years ago

This probably where we are confusing features here. We are talking about compiler time code generation.

@xoofx If you take a look at the spec linked in OP (https://github.com/dotnet/roslyn/blob/master/docs/features/generators.md), this issue is actually about source generation. :)

xoofx commented 6 years ago

@xoofx If you take a look at the spec linked in OP (https://github.com/dotnet/roslyn/blob/master/docs/features/generators.md), this issue is actually about source generation. :)

Indeed, I'm realizing now that it is slightly different; I jumped into this discussion through a Twitter link related to the issue https://github.com/dotnet/roslyn/issues/19505 😅

If we go into source code only, it will de-facto exclude all codegen scenarios that want to modify existing code as part of the compilation process. I would prefer that we invest our effort in a true, standardized, strong, compiler plugin integration story rather than this.

amis92 commented 6 years ago

If we go into source code only, it will de-facto exclude all codegen scenarios that want to modify existing code as part of the compilation process.

Why? That's exactly what replace/original feature would be for. To enhance and change the existing code.

ltrzesniewski commented 6 years ago

"replace/original" severely limits what you can do. A generalized code generation step in the compiler pipeline would enable much more codegen scenarios.

xoofx commented 6 years ago

Why? That's exactly what replace/original feature would be for. To enhance and change the existing code.

ok, re-reading generators.md. It shares many aspects of how to integrate this into the compiler pipeline, but the source code gen part should not be mandatory. If you are doing AOP by translating attributes into some pre/post-condition code running inside the method, you don't want to expose to the user a different method than the original one (debugging-wise, navigation-wise, etc.).

amis92 commented 6 years ago

@ltrzesniewski

"replace/original" severely limits what you can do. A generalized code generation step in the compiler pipeline would enable much more codegen scenarios.

Can you elaborate? What would it limit? I'm out of imagination.

@xoofx

[...] It should not be something mandatory. If you are doing AOP by translating attributes into some pre/post condition code running inside the method, you don't want to expose to the user a different method than the original one [...]

I don't know. I've had a chance to use Fody and other weavers, and it's always a little... insecure, seeing something happen out of the blue, with no code to anchor it on. I understand that may be the case sometimes (e.g. no-one wants to step into generated auto-property accessors, or see iterators, or async machines in their expanded form), but in the general case, I'd be inclined to say that having the source is better than not having it, taking all possible applications into account.

xoofx commented 6 years ago

I don't know. I've had a chance to use Fody and other weavers, and it's always a little... insecure, seeing something happen out of the blue, with no code to anchor it on. I understand that may be the case sometimes (e.g. no-one wants to step into generated auto-property accessors, or see iterators, or async machines in their expanded form), but in the general case, I'd be inclined to say that having the source is better than not having it, taking all possible applications into account.

I don't disagree with that 😉 considering that it depends on the kind of codegen

ltrzesniewski commented 6 years ago

Can you elaborate? What would it limit? I'm out of imagination.

It requires you to mark points in your code, which the codegen will be able to replace. The proposal lists what you can and can't do.

Compare that to being able to manipulate the syntax tree directly in any way you want.

amis92 commented 6 years ago

@ltrzesniewski The only requirement on the code that will be modified is that the type has to be partial. As far as restrictions go, it's not really a big one. But other than that, only the generated code will need to use replace/original, and that doesn't seem like a big deal to me, although you may disagree.

There are "natural" restrictions of additiveness: you cannot, for example, remove members or change their signatures, which you could certainly do when manipulating the syntax tree directly. Is that the scenario you're interested in?
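For context, a sketch of how the additive restriction plays out under the replace/original prototype design (these keywords were never shipped, and the exact modifier placement may differ from the prototype; `Logger` is a hypothetical helper):

```csharp
// User-written code: the type must be partial so the generator can extend it.
public partial class MyClass
{
    public void DoWork() { /* original body */ }
}

// Generated code: the generator can only wrap DoWork additively.
// It cannot remove the method or change its signature.
public partial class MyClass
{
    public replace void DoWork()
    {
        Logger.Log("DoWork start");
        original();              // invokes the user-written body
        Logger.Log("DoWork end");
    }
}
```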

ltrzesniewski commented 6 years ago

Imagine a scenario where you'd like to automatically implement INotifyPropertyChanged in a WPF application. You'd have to mark almost all of your model classes partial in order to inject the interface implementation. This is an unnecessary burden, and it would also hide "real" partial classes.

In addition, accessing the original method adds one method call, which could end up not being inlined, which could matter in certain cases. But I guess you could "steal" the original AST and copy it over to your replacement method.

I'm not really interested in being able to remove code elements. The main thing I dislike about this proposal is that it provides limited support for code generation, and we may end up being stuck in the future because of these limitations (remember once you add a feature to the language it'll have to be supported forever).
The approach simply doesn't feel clean to me. I'd prefer being able to replace anything I want in the AST at codegen time, because that's a more sustainable approach and it feels right.

jmarolf commented 6 years ago

@xoofx

I would prefer that we invest our effort in a true, standardized, strong, compiler plugin integration story rather than this.

Could you submit a proposal of what this would look like? My understanding is that you essentially want to plug into an "optimization" step in the compiler so you can add optimizations that are specific for your scenarios (like this).

Thaina commented 6 years ago

I want to point out that C#, or really any programming language, has always done some degree of codegen.

Think about named tuples, IEnumerable/yield, or async/await. C# syntax is not always compiled directly into IL; under the hood, we know syntactic sugar is sometimes expanded into much messier generated code.

It just needs to be designed in a way that can be reflected back to the syntax we write, in the same sense that dynamic still reflects back to DynamicObject.TryGetMember.

xoofx commented 6 years ago

Could you submit a proposal of what this would look like? My understanding is that you essentially want to plug into an "optimization" step in the compiler so you can add optimizations that are specific for your scenarios (like this)

Instead of a proposal, I will try to find the time in the coming days to fork from https://github.com/dotnet/roslyn/tree/features/source-generators and prototype things from there, along with the project system and IDE intellisense story, in order to get a better idea of the required design.

I will post an update for this to https://github.com/dotnet/roslyn/issues/19505

jmarolf commented 6 years ago

@xoofx That's even better. Feel free to message me on https://gitter.im/dotnet/roslyn if you need any help (the source-generators branch hasn't been updated since the original prototype, and there has been some drift).

CyrusNajmabadi commented 6 years ago

That's a good point (though, many codegen scenarios generate code that is private/not accessible/not intellisense-able by the users).

Could you clarify that bit? For example, if private members are generated into the containing type, then they would be accessible/intellisense-able by the user. If a new type is being generated, then that type will likely be accessible to the user, etc. Thanks!

CyrusNajmabadi commented 6 years ago

If there is already an infrastructure for dealing with these incremental changes,

Unfortunately, there is not. :) So if we needed to build such a thing, that would add a lot to the cost.

CyrusNajmabadi commented 6 years ago

About "running a generator" is not necessarily something like a huge process nor it is something that we should call blindly on every single changes.

That's a potential option. However, if you don't generate, you risk the user seeing the intermediate state, i.e. where they expect something to be generated, but as far as the IDE is concerned it doesn't exist. This may lead to an unacceptable user experience.

Typically for your example above, if you are modifying the internal of the method

Note: in the example I gave above, the modifications could be happening anywhere. Most notably, they could be happening outside the method, thus changing the entire external 'semantic world' and leading to a lot of excess computation vs. what we have today.

and the code generator doesn't look at that (as most code generators wouldn't), we should not have to run anything. From the scenarios we know, we could define a good initial list of potentially safe trigger points and plug the "running generator" in there before proceeding with other analysis passes.

Yes. Greatly restricting the set of things that can be done would likely be a viable way forward. I think I mentioned that in some previous post (but it's been so long I don't remember at all). However, of course, such points would have to be defined, and we'd have to figure out whether we could implement them effectively.

Speaking of which, I would like to understand how this works when editing an async/await method: does the compiler perform the translation to the intermediate generated state-machine code before performing analysis? (I mean in the case of VS code editing, of course)

No. And the reason for that is simple. The compiler transformation cannot affect the intellisense experience. The compiler could do the current rewrite, or translate down to new IL intrinsics, or something else entirely, and it wouldn't affect what intellisense gets shown.

That is very different from source generation/replacement. In that case, we are not limiting the feature to "only the changes that could not be perceived by the user program". It is very much in scope that the generator be able to make changes that are visible to the rest of the program.

CyrusNajmabadi commented 6 years ago

btw, on this matter specifically: Why don't you form up a team dedicated to solve this code generator issue?

I do not have the authority to "form up a team" :)

I believe that those who do have that authority looked at the situation and the costing from those who investigated it, and decided the costs were too high to do this now vs. everything else that needs to be done. Right now, afaik, everyone is booked solid for the foreseeable future. So doing this work would mean not doing other work. The analysis of which work is most important and most cost-effective has already been done, and that's how we arrived here.

With one person from Roslyn, one person from the VS editor integration, one person from the build-system, working closely together for solving this particular problem

That's a lot of people (and it would likely need even more people). So we're talking about a half-dozen or more people just trying to figure this out. And, in the meantime, not doing the existing work that has been costed and deemed important to do right now. :)

That could even be a POC team working for one month to provide a fully functional prototype and evaluate what is/isn't working

This effort already happened. It's how we discovered how open this space is, and how many different areas there are that are impacted and need significant efforts to design and produce a viable solution :)

The POC was great for small scale stuff. But as people really started to use it, the problems in experience became clearer, and the costs rose greatly.

CyrusNajmabadi commented 6 years ago

Perhaps it will make sense to split the whole feature into multiple tiers:

  1. one would be about generating/wrapping member bodies for existing members and maybe adding invisible private members
  2. another would be something like T4, F# type providers or EDM

Indeed, if this was tiered, then we could set different expectations for the different types of generation.

Note that this is not a strange concept in Roslyn-land. The analyzer infrastructure is itself tiered: for example, you can do 'syntactic analysis' separately from 'semantic analysis' (or 'symbol analysis' or 'operation analysis').
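As a minimal sketch of that tiering in the Roslyn analyzer API (the analyzer name and diagnostic descriptor here are made up for illustration):

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public class TieredAnalyzer : DiagnosticAnalyzer  // hypothetical analyzer
{
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        "DEMO001", "Demo", "Demo message", "Usage",
        DiagnosticSeverity.Info, isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        // Syntactic tier: runs against syntax alone, no semantic model needed.
        context.RegisterSyntaxNodeAction(
            ctx => { /* inspect ctx.Node */ },
            SyntaxKind.InvocationExpression);

        // Semantic tier: runs once symbol information is available.
        context.RegisterSymbolAction(
            ctx => { /* inspect ctx.Symbol */ },
            SymbolKind.NamedType);
    }
}
```

A tiered generator API could draw the same distinction, letting cheap syntax-only generators run without triggering full semantic analysis.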

I would be very open to this myself.

CyrusNajmabadi commented 6 years ago

About refactoring, you are taking a use case that we don't have today for almost none of the various use cases that have been around for years

Yes, but you're asking for this to be part of Roslyn and the core C# (and presumably VB) experience itself. If we provide that, we're not going to just punt on major parts of the experience we already care deeply about.

If you want something that doesn't actually integrate with the services we provide today, then you can build that today without needing roslyn to do anything. Just have a pre-pass tool that takes your source and spits out whatever you want actually compiled. Then just compile that. Bam, done.
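A rough sketch of that pre-pass approach as a plain MSBuild hook (the target name and `my-codegen` tool are made up; nothing here is Roslyn-specific):

```xml
<!-- In the .csproj: run a codegen tool before compilation and
     compile its output alongside the hand-written sources. -->
<Target Name="RunMyCodeGen" BeforeTargets="BeforeCompile">
  <Exec Command="my-codegen --in $(MSBuildProjectDirectory) --out $(IntermediateOutputPath)generated" />
  <ItemGroup>
    <Compile Include="$(IntermediateOutputPath)generated\**\*.cs" />
  </ItemGroup>
</Target>
```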

A reason to pull this into Roslyn is to make it a tight part of our entire system. That means it isn't just something bolted on that allows generation but feels degraded the moment you use anything else in VS. It means it's supported through and through.

CyrusNajmabadi commented 6 years ago

Why? How do you think the experience is today?

If you already have a bad experience today with external tooling, and you're already ok with it, then you don't need Roslyn to do anything :)

The reason to be in Roslyn would be to make this a first-class feature that you could use effectively, without degrading the entire experience around the lifecycle of your code.

We have to generate IL code afterwards; usually, when modifying the body of a method, we let the debug points stay the same. This is what has typically been used in SharpDX: the code is post-patched, but we can perfectly step into it and the debugging experience is fine.

Can you explain that a bit more? Say the generated/injected code itself faults. How does one debug that? Say the generated/injected code has inserted locals into the environment. How does one know their names? How does one look at them in the debugger?

Thanks!

CyrusNajmabadi commented 6 years ago

That's exactly what is happening for async/await: we generate a bunch of machine state code, but the code generated is not entirely debuggable right?

The code generated by the compiler should not fault. If it does, that's bad. However, once you put source generators into the mix, avoiding that will be extremely difficult. As such, being able to figure out what your generated code actually is, so you can figure out why it's crashing, is important.