dotnet / csharplang

The official repo for the design of the C# programming language

Champion "Replace/original and code generation extensions" #107

Open gafter opened 7 years ago

gafter commented 7 years ago

See also

svick commented 7 years ago

@amis92 My worry with that approach is that it might turn out that replace-original won't be actually useful for source generators, or worse, that it might be useful in some other form.

Adding replace-original means adding a feature that is not very useful on its own right now; it potentially litters the language with a feature that might not be useful in the future, or it could limit the future development of source generators.

If you add a feature piece by piece, you have to be very careful with the first few pieces. (That's, for example, why o is (int, int) doesn't work in C# 7.0, even though it's an obvious combination of tuples and type matching: to make sure tuple patterns can be added later.) I'm not confident that can be done with replace-original, based on the current state of source generators.

Athari commented 7 years ago

@svick

My worry with that approach is that it might turn out that replace-original won't be actually useful for source generators, or worse, that it might be useful in some other form.

We have #regions which are considered undesired by VS analysis. We have delegate syntax for closures which is obsoleted by lambda syntax. We have reference tuples with no syntax support obsoleted by value tuples with proper language and framework support. We have a whole damn namespace of non-generic collections.

I don't see how adding barebones replace/original support will be any worse than adding #regions. I don't think there are any scenarios which don't fit into the replace/original+order idea - all questions seem to arise from source generators and tooling. In the worst case, it'll be a feature nobody will ever see - unlike #regions, which plague C# code to this day.

@eyalsk

if they would settle on a design that is only half good and/or cover half of the scenarios, it might not cover the scenarios you need and it might not be possible to change it later

I understand the reasons, but it's getting harder to be passionate about C#. C# version history:

  1. Java that doesn't suck.
  2. Generics, closures, iterators.
  3. LINQ, syntax sugar.
  4. DLR, co/contravariance.
  5. Async/await.
  6. Syntax sugar, syntax sugar, syntax sugar.
  7. Tuples, syntax sugar, syntax sugar.

The switch to syntax-sugar mode, without revolutionary features which change the way I program, happened exactly after the switch to Roslyn, which was promised to do the exact opposite: enable the team to implement revolutionary features faster. All this while Java gets lambdas, ~enumerables~ streams (badly broken thanks to type erasure, checked exceptions and improper boxing, but whatever), and default interface methods. C# is still miles ahead of Java, but it's getting harder and harder to laugh in the face of Java programmers. I miss that. Well, at least Java devs postpone major features as often as C# devs.

And let's be frank, most people don't really care about pattern matching and record types. These are "cool" features to have, but they won't change code much in a lot of codebases. Source generators, macros and metaprogramming in general, on the other hand, completely change the way people program.

If C# doesn't get something revolutionary before Java gets value types and reified generics, it may be a sad day for C#. Let's not allow that to happen.

CyrusNajmabadi commented 7 years ago

@Athari (and everyone else), please keep the discussion on topic. This is not the place to discuss what you do or do not like about what the C# team has been picking to do for language releases. This is the place to discuss: Replace/original and code generation extensions.

These tangents actually hurt the viability of the feature, because they distract from working toward addressing the very real concerns that exist with it, and make people spend time and energy discussing things that do not help resolve those issues.

vbcodec commented 7 years ago

@CyrusNajmabadi

Thanks for previous answer.

Can you explain why replace/original was dropped together with source generators? I think that limited support for replace/original, covering methods/properties and single-level 'inheritance' (without ordering dilemmas), could be very useful, pretty safe for further expansion, and wouldn't be perceived as a 'low-quality unfinished feature'. I know that you have priorities, and that figuring out how to make these fine-grained source generators will take a lot of effort and time (4 - 6 years, for X.0). But replace/original seems an unnecessary victim, and could be implemented pretty quickly.

CyrusNajmabadi commented 7 years ago

Can you explain why replace / original was dropped together with source generators ?

I'm not sure exactly what you're asking. Are you asking:

"Even if you dropped source-generators, why did you also drop replace/original?" or: "Why did you drop all pieces of SourceGenerator work that you were looking at?"

vbcodec commented 7 years ago

I'm not sure exactly what you're asking. Are you asking:

I am asking: "Why did you drop a simple and powerful language feature together with the non-required, optional, complex, and too-demanding tooling?"

CyrusNajmabadi commented 7 years ago

Because we weren't certain that the language feature was correct. We felt like it would be a bad idea to release something now, only to be potentially hamstrung by it a few years down the line if we wanted to do other things.

We also really aren't certain that the language feature is sufficient on its own, or provides enough value to existing customers in the absence of the greater source-generator work.

wwwlicious commented 7 years ago

Just wondering if the problem details or the designs tried/rejected are available to read or discuss anywhere?

I can understand, in an abstract or superficial way, what a minefield of problems this feature creates for the tooling experience, but I think it would be really interesting to read, and a way to gain a deeper understanding.

jahmai-ca commented 7 years ago

So I've been annoyed with PostSharp's lack of netstandard support for the past 6 months (which is just now happening but has issues), so I researched alternatives and started wondering, "Why can't Roslyn do it?". Found my way here by many redirects from other issues.

What I'm reading here is the classic innovator's dilemma. Big companies like Microsoft are too concerned with producing the whole picture up front, long-term support, and grappling with all the what-ifs, rather than shipping an MVP that may not be perfect and then iterating on it. I don't wish to assert that Microsoft is not doing it correctly (it's your team and you'll run your product how you see fit), but with this kind of feature, Microsoft will never be able to keep up with smaller innovators like SharpCrafters if it's not prepared to take risks.

CyrusNajmabadi commented 7 years ago

If you're ok with someone just producing something that doesn't cover "the whole picture up front", then why not use the existing solutions out there? :)

We could definitely produce something with lots and lots of issues. And then what happens next? Do you adopt it, or are the issues too much of a problem for you? If you do adopt it, how do you feel if we continually break you every single week as we work on addressing those issues? What happens if you have invested hugely in it, and we make a change that absolutely tanks performance for you?

If you don't need or want us spending the time to get a real quality solution, then why do you need Roslyn to do something like this in the first place? Why not whip up something yourself and make it available to the community? :)

jahmai-ca commented 7 years ago

Oh I don't need Roslyn to do this - it just seems natural for a compiler to allow code injection rather than post-processing IL, which is why I asked Google the question. I look for solutions like this specifically because I don't have time to solve all the problems in the world, though there are times when it is damned tempting :-)

The other side of the coin is this. In the 5 years it takes you to develop something that answers all those questions, that's 5 years PostSharp, Fody (and maybe some new tool) have had to add new features. When someone looks for a solution to their problem, they may overlook the hard work you did in that time because you're now behind the market, at which point you may look back and ask if it was all worth it.

Speaking personally, all I am really looking for is a way to write some C# code in a source file, and then inject that code into methods / property accessors by use of Attributes. PostSharp does this, but they are taking ages to support netcore and netstandard, and there are sometimes issues in generated IL that need to be addressed with each release, which is why I was looking around. I just commented because reading this thread it really feels like the argument is already lost and no traction will ever happen, and that would be a real shame.

CyrusNajmabadi commented 7 years ago

One of the core purposes of Roslyn is precisely to allow others to build solutions, rather than requiring that every solution come through us. If we wanted all the solutions to just be MS solutions, we could be closed source not giving access to anything :)

The other side of the coin is this. In the 5 years it takes you to develop something that answers all those questions, that's 5 years PostSharp, Fody (and maybe some new tool) have had to add new features.

Great! That sounds terrific. I'm not sure why we'd be against that sort of thing happening. I'd like a thriving ecosystem full of people filling niches with good tools.

PostSharp does this, but they are taking ages to support netcore and netstandard

It's unclear to me how us entering this space would mean that we'd be any better about supporting said platforms. Indeed, if we did as you mentioned, and didn't "cover the whole picture", then it's totally possible that you'd just be in the same boat with any of our solutions.

It really feels like the argument is already lost and no traction will ever happen

All that has been stated is that this is big and we'd like to do something good here. But we also literally have enough work to occupy the next decade or so for every person on this team. As such, we're going to have to decide continuously which set of work makes the cut, and which set does not. At any point in time there will always have to be a set of work which we are going to prioritize ahead of the rest.

Right now, we are in the process of making those plans. All while we also work on things like 7.1. And VS2017Update1 and so on and so forth. If we get to the point that we think we can do something for an upcoming release, we'll def let people know :)

jahmai-ca commented 7 years ago

It's unclear to me how us entering this space would mean that we'd be any better about supporting said platforms. Indeed, if we did as you mentioned, and didn't "cover the whole picture", then it's totally possible that you'd just be in the same boat with any of our solutions.

I made an assumption here that part of the MVP for any solution would be an internal mandate to maintain functional compatibility with these kinds of fundamental changes - but I admit I don't know how Microsoft's internal processes work, so this could be incorrect.

All that has been stated is that this is big and we'd like to do something good here. But we also literally have enough work to occupy the next decade or so for every person on this team. As such, we're going to have to decide continuously which set of work makes the cut, and which set does not. At any point in time there will always have to be a set of work which we are going to prioritize ahead of the rest.

That's understandable and it's a good problem to have :-) It actually sounds a lot like maybe this is not the right problem for the team to be tackling, given there are (mostly) working solutions to these problems already in the ecosystem, and a huge bucket of other stuff we all want out of C#.

CyrusNajmabadi commented 7 years ago

You can't have both:

maintain[ing] functional compatibility with these kinds of fundamental changes

while also not spending a lot of time thinking about:

long term support and grappling with all the what-if's

If you don't think long term, you are going to get something that boxes you into a corner. And you'll end up eventually just having to scrap it and start over. (And that's just another form of fundamental change).

:)

It actually sounds a lot like maybe this is not the right problem for the team to be tackling, given there are (mostly) working solutions to these problems already in the ecosystem, and a huge bucket of other stuff we all want out of C#.

I still think there's an area here where we could provide a huge amount of value. The problem is how huge the work actually is. Work in this space has the chance to really disrupt our ecosystem, with tons of people pouring effort into adopting what we produce. We don't want to create something here and then find out in a couple of years that people have shifted millions of lines of code to it, and now find ourselves in the position of having tons of customers in pain because nothing scales well enough, and any change we want to make would be massively breaking for all of them :-/

jahmai-ca commented 7 years ago

Maybe to bring the scope in, the target audience should be tooling developers rather than application developers? Evolve the solution based on the problems surfaced by a much smaller group of consumers, instead of every developer with VS installed?

MgSam commented 7 years ago

@CyrusNajmabadi I get there are engineering challenges. I get they are really big and hard. What I do not get is why this feature went from "planned in C# 7.0" to "postponed indefinitely". The other C# 7.0 drop-outs all seem to be on the table for near-term implementation. And all of them have a far smaller potential impact than this feature.

Did consumer demand drop? (No) Did some amazing new feature get thought up that wasn't on Mads' huge list of possible C# 7.0 features? (No, with possible exception of https://github.com/dotnet/csharplang/issues/164)

I don't know about Microsoft but in my company when I'm working on something hard that my customers want, I can't just decide I'll table it and maybe eventually possibly come back to it someday if I feel like it.

An "open source" language should not have 100% of feature prioritization happening behind closed doors with zero input from the community. "Trust us" is just not good enough. C# 7.0 should show why.

CyrusNajmabadi commented 7 years ago

Maybe to bring the scope in, the target audience should be tooling developers rather then application developers?

I'm not certain how that would work. We're an API. How would we distinguish 'tooling developers' vs application devs? Aren't all devs who use our APIs in essence a tooling dev? :)

CyrusNajmabadi commented 7 years ago

I can't just decide I'll table it

Literally everything we're doing is something that some customers want :)
One of the very core tasks we have is exactly to decide what things we will work on and what we won't. :)

We like to experiment. We started looking into an area because we felt we might be able to do something cool with it in a time frame that worked for us. We then let people know we were experimenting. However, as we looked deeper into it, we discovered it was larger and harder than we thought. We also had a lot of other things that the team felt were important. The combination of these factors meant that this didn't make the cut when we looked at what we would be committing to and what we would not be.

and maybe eventually possibly come back to it someday if I feel like it.

Basically, this prevents us from ever experimenting with things. If, the moment we experiment with something, we have to commit to it, then that will just mean we don't experiment :-/

jahmai-ca commented 7 years ago

I'm not certain how that would work. We're an API. How would we distinguish 'tooling developers' vs application devs? Aren't all devs who use our APIs in essence a tooling dev? :)

I'm referring to this statement:

Because the IDE experience won't meet the user experience bar we've set for language changes.

Plus, it will just be a terrible experience. Think about it this way: if there's no IDE support, and you have a code generator that generates symbols that you try to reference, then you'll get tons of error squiggles that never go away as you're trying to use the IDE.

If the target audience was different, then things like IDE support become less important. The idea being that the first step could be enabling tooling developers like PostSharp to hook into the compiler with source generation instead of IL weaving.

CyrusNajmabadi commented 7 years ago

Today that is already possible. In several of the discussions we've laid out how tools could use Roslyn and generate the right source that they want. One can already use similar systems that act just as a tooling step, using some system for analyzing and generating code. Those systems could switch to Roslyn today.

The primary problem is that such tooling experiences are invariably pretty poor. It does not feel like a cohesive experience and lots of rough edges are apparent. Either:

  1. You're ok with that. In which case you don't need a Roslyn solution in the first place. You can just use one of those solutions (which can be updated to use actual Roslyn APIs if desired over working with IL).
  2. You're not ok with that, and you want a good tooling experience. In which case, now we're back to where we are today with this discussion :)

jahmai-ca commented 7 years ago

The problem I am referring to is taking hand-crafted code and injecting it via AOP into other code - you're saying that's already possible with current tooling (albeit poorly)? I'd love some references :)

CyrusNajmabadi commented 7 years ago

Sure. Just write a tool that uses Roslyn to analyze your compilations, do whatever sort of transformation/injection you want, and then have it compile the result. You can also have it spit out the modified trees if you want.

Roslyn already gives you all the pieces for this. And I've used it myself on several occasions for this sort of task.
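To make it concrete, here's a minimal sketch of such a stand-alone tool. The trace-injection transformation and the file names are purely illustrative; any Roslyn-based rewrite slots in the same way:

using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Illustrative transformation: prepend a trace statement to every method body.
class TraceRewriter : CSharpSyntaxRewriter
{
    public override SyntaxNode VisitMethodDeclaration(MethodDeclarationSyntax node)
    {
        if (node.Body == null) return node;
        var trace = SyntaxFactory.ParseStatement(
            $"System.Console.WriteLine(\"Entering {node.Identifier.Text}\");");
        return node.WithBody(node.Body.WithStatements(
            node.Body.Statements.Insert(0, trace)));
    }
}

class Program
{
    static void Main()
    {
        var source = System.IO.File.ReadAllText("Example.cs"); // assumed input
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();
        var rewritten = (CSharpSyntaxNode)new TraceRewriter().Visit(root);

        var compilation = CSharpCompilation.Create("Example",
            new[] { CSharpSyntaxTree.Create(rewritten) },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        var result = compilation.Emit("Example.dll"); // or persist the rewritten tree
        foreach (var diagnostic in result.Diagnostics)
            System.Console.WriteLine(diagnostic);
    }
}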

Of course, it really is just a stand-alone tooling type experience. You get no positive experience in the IDE. Debugging is terrible (unless you spit out and persist the transformed/generated files). etc. etc.

markrendle commented 7 years ago

Responding to something from a month ago, sorry.

Regarding the "generate-on-save" experience:

jnm2 commented 7 years ago

For the use cases I have in mind, I'd like the option of not emitting generated files to the file system at all. The reason for this is source control noise. Rather, syntax trees would be created in memory as the compilation happens. Debugging/go to definition would create the text on demand, just like I'm used to ReSharper doing when debugging/F12ing 3rd party library code to decompile. How feasible is this?

amis92 commented 7 years ago

The current proposal suggests both variants, depending on the mode of operation. When persisted, the files would go in a separate folder, ignored by VCS the same way /obj is.

asdfgasdfsafgsdfa commented 7 years ago

@jnm2 It seems very plausible that this won't be much work, seeing as VS already has support in its UI system for displaying pseudo-files (that have no actual backing in the file system), just as you described. And don't forget that Roslyn can do code <-> syntax-tree round trips perfectly.

@markrendle There are so many suggestions here where different solutions are the obvious one. That sounds to me like every generator should be able to specify for itself when it wants to rebuild its output: every keystroke, on save, on compile, periodically, only on manual trigger... Maybe some even want to watch for their own events (but then they could just use a timer and check for those events themselves).
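A trivial sketch of that, just to make it concrete (the API is hypothetical):

// Hypothetical: each generator declares when its output should be rebuilt.
public enum RegenerationTrigger
{
    EveryKeystroke,
    OnSave,
    OnCompile,
    Periodic,
    ManualOnly,
}

public interface IGeneratorScheduling
{
    RegenerationTrigger Trigger { get; }
}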

jahmai-ca commented 7 years ago

I have just learned from PostSharp that they are dropping support for Xamarin platforms in their next release v5 (citing high cost to support vs few consumers). They will continue to support Xamarin in their current release v4, but that doesn't support netstandard/netcore. That means if you have both Xamarin and netstandard in your code base (we do) then PostSharp is no longer viable for cross-platform AOP.

Pzixel commented 7 years ago

I'd like to put my 0.02$ in this topic.

I think this feature is really one of the most wanted in the language, or even the most wanted one. We have tons of IL rewriters, but they work on binary files when we definitely want to do this at the AST level. Tools like Code Contracts may really benefit - I have read tons of articles on MSDN about "why our CCRewriter is not working as expected" / "CCRewriter breaks my code again" / etc., etc., etc.

So in my imagination, it should work the way Roslyn analyzers work: you create a NuGet package, and it gets injected into the compilation process. You cannot address generated members from the assembly where they are generated, but you are fully able to do so from other assemblies (because those members have been compiled and exist at the IL level). This restriction may be removed in the future, but it simplifies a lot of things. (A sketch of the hook I have in mind follows at the end of this comment.)

@CyrusNajmabadi So, answering questions above:

  1. IntelliSense. Clearly people will want to generate items that then show up in IntelliSense. How does that happen efficiently?
  2. Navigation. People will want to navigate to generated code. How do we do that, and what's the right experience overall?
  3. Debugging. How do you debug through the code that is generated.
  4. Project-System. How do you present generated code in the project system?
  5. Refactoring. How do refactorings work properly in a context where some of the code they're analyzing was generated by previous code, but the changes they make may cause more transformations to happen?

My thoughts:

  1. At the assembly level you don't have access to these methods, so IntelliSense remains the same. Other assemblies may use the assembly information of the compiled file (like ReSharper does with "go to decompiled sources").
  2. Navigation is forbidden, or performed as in p.1. As with Roslyn analyzers, you just can't navigate to the code that produces an error/warning. Same here. These code generators shouldn't be complex enough to require anything but unit tests in their own solution. Once compilation happens, you should never need to examine the resulting code - all the complex logic must already be tested in the code generator's solution.
  3. You don't have to debug it: the code generator shouldn't generate complex code, and the generator's own code should be tested, as I said in p.2. That's OK for an MVP; the restriction can be removed later.
  4. The project system knows nothing about it. That's fine, just as it's fine that it knows nothing about Roslyn analyzers - they just work in the background. Same here: these generators just run as the last step of compilation. Again, it's OK for an MVP to work this way, and it can also be changed in the future.
  5. I don't see any trouble here. If you are refactoring the assembly that uses these generators, the generated code just doesn't exist (per p.1). And if you are refactoring something outside it, it doesn't matter, because all the code has already been generated into the binary and cannot be modified.

I may be wrong, but my ideas could make you invent something unexpected that gives a green light to this feature.
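To make the model concrete, a sketch of the hook I mean (a hypothetical API, not anything Roslyn exposes today):

using System.Collections.Generic;
using Microsoft.CodeAnalysis;

// Hypothetical: packaged and discovered like an analyzer, but runs as the last
// step of compilation. The returned trees are compiled into the assembly, yet
// stay invisible to IntelliSense and navigation inside that same assembly.
public interface ICompileTimeGenerator
{
    IEnumerable<SyntaxTree> GenerateTrees(Compilation compilation);
}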

binki commented 7 years ago

@Pzixel I agree on points 1 and 5. All source editing, IntelliSense, and things like Quick Actions should only see read-only pre-alteration-phase code. Doing anything else would be complex and confusing. This will require some codebases to use two assemblies when they need to directly consume generated code, because directly calling generated code from non-generated code in the same assembly will be impossible (though people will likely use tricks such as virtual/abstract/interfaces to mitigate that in certain cases). And code generation that only modifies method or property bodies, a big use case (e.g., auto-property notify/preconditions), would not suffer.
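For instance, a sketch of that interface trick (all names are illustrative):

// Hand-written code in the same assembly never names the generated type; it
// goes through an interface that the generated code implements.
public interface INotifier
{
    void NotifyChanged(string propertyName);
}

static class Notifiers
{
    // The generated class (assumed here to be named "Generated.Notifier") is
    // resolved at runtime, so the pre-alteration code compiles without it.
    public static INotifier Create() =>
        (INotifier)System.Activator.CreateInstance(
            System.Type.GetType("Generated.Notifier", throwOnError: true));
}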

For points 2-4 (navigation, debugging, project system), I do not agree. Even the simplest automated AST modifications could produce really weird runtime errors. It'd be great if, in addition to the Source/Disassembly view, VS were to gain an AST visualization view similar to LINQPad's Tree tab. This would be beneficial even today, without codegen support from the compiler. If this could optionally be emitted by the compiler in serialized form, and possibly stored in the debugging (PDB?) files, then the debugger GUI could use it to let you debug with respect to the final state of the AST before the compiler emitted IL. The idea would be that if some IL isn't associated with source, it might be associated with a node in the AST, which VS could visualize in a treeview wherein you can set breakpoints, etc.

This thread is so long, I hope I didn’t repeat anything…

CyrusNajmabadi commented 7 years ago

All source editing, Intellisense, and things like Quick Actions should only see read-only pre-alteration-phase code. Doing anything else would be complex and confusing

If IntelliSense does not show you transformed code, your experience is going to be massively confusing. You'll be able to have code that calls other code, but that called code will never show up in IntelliSense. This means someone trying to work with the code can be confused, and it also means your typing will be interfered with. Imagine someone writes a log library with configurable groups, i.e. there's some text file that defines "Warning, Information, Debug, etc." and from that, functions like LogInformation are generated. Now if you actually try to call those, they won't appear in IntelliSense. At best this will just be confusing. At worst it means IntelliSense will select the wrong items from the completion list if there are other items that potentially match those sorts of things.
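Concretely, a sketch of what such a generator might emit from a groups file listing Warning, Information and Debug (all names illustrative) - exactly the members that pre-alteration IntelliSense would never show:

public static class Log
{
    public static void LogWarning(string message) => Write("Warning", message);
    public static void LogInformation(string message) => Write("Information", message);
    public static void LogDebug(string message) => Write("Debug", message);

    static void Write(string group, string message) =>
        System.Console.WriteLine($"[{group}] {message}");
}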

masaeedu commented 7 years ago

@CyrusNajmabadi Is it possible that prototyping the user-interaction aspects of this feature would be easier if you started from a less complex consumer of Roslyn information, for example OmniSharp? If you could build a prototype that works well in Atom/VS Code, that would help inform the direction for how it might interact with Visual Studio.

masaeedu commented 7 years ago

@CyrusNajmabadi It seems to me like there are two orthogonal features getting mixed up together here:

  1. Transforming the code that actually gets compiled (a build-time rewrite), and
  2. Making the rest of the tooling - IntelliSense, the symbolic model - aware of those transformations.

There are already several reasonable alternatives for the former that use IL rewriting, and it is not impossible to imagine an MSBuild configuration that lets you perform the rewrite even at the AST level, provided again that you don't care about accurate IntelliSense within the rewritten code. This excellent project, for example, already lets you integrate a Roslyn rewriter into a project so that all dependent projects see the rewritten version.

The latter is currently not supported at all by Roslyn, and probably necessitates solving a bunch of thorny issues surrounding IDE experience. Being able to manipulate the symbolic layer is really the valuable feature, and I think it doesn't necessarily have to be implemented as a language feature. I'll probably get murdered for this, given how much demand there is, but I don't particularly like original/replace. I really just want an alternative to Fody, PostSharp etc. that:

  1. Can be implemented against the AST instead of the IL, and
  2. Works well with Intellisense

As a developer, I want to reference some NuGet package that implements INotifyPropertyChanged for every class I decorate with [MagicLib.NCP], and ensure that the rest of my code is aware of the newly introduced interface. That last part is what's missing today. Whether the package author implemented MagicLib using a C# 8 feature or some newly introduced compiler API or IL rewriting isn't relevant.

I suspect that if this was implemented as a compiler feature without any changes to the language, it would be easier to build some kind of language feature on top of it at a later date.
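To illustrate the desired experience (everything here is hypothetical - MagicLib, the attribute, and the generated half), the point is that the IDE should see the second part as well:

namespace MagicLib { public sealed class NCPAttribute : System.Attribute { } }

// Hand-written:
[MagicLib.NCP]
public partial class Person
{
    public string Name { get; set; }
}

// Added by the hypothetical tool - the rest of the codebase should be able to
// treat Person as an INotifyPropertyChanged without error squiggles:
public partial class Person : System.ComponentModel.INotifyPropertyChanged
{
    public event System.ComponentModel.PropertyChangedEventHandler PropertyChanged;

    protected void OnPropertyChanged(string name) =>
        PropertyChanged?.Invoke(this,
            new System.ComponentModel.PropertyChangedEventArgs(name));
}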

CyrusNajmabadi commented 7 years ago

If you want something without IDE support, then it's really not that hard. Just add build steps that invoke some tool of yours to process your code (using the Roslyn APIs), and then just emit the in-memory trees that you've generated.

:)

--

And ensure that the rest of my code is aware of the newly introduced interface.

What does this mean? Do you mean: in the IDE all the rest of your code knows about this interface? i.e. you won't get errors and whatnot? Or do you mean simply that your code can reference the interface and it will build properly?

If the latter, you can do this today. If the former, then it's very hard (and is precisely the issue we're discussing).

CyrusNajmabadi commented 7 years ago

I really just want an alternative to Fody, PostSharp etc. that: ... Works well with Intellisense

This is really hard. That's the overall thrust of the points I'm making here.

masaeedu commented 7 years ago

@CyrusNajmabadi The former, but without the additional complication of introducing a language feature. I want the ability to hook into and modify things at the symbolic layer of the Roslyn pipeline, through whatever means make this easiest to implement.

Regarding:

If you want something without IDE support, then it's really not that hard. Just add build steps that invoke some tool of yours to process your code (using the Roslyn APIs), and then just emit the in-memory trees that you've generated.

Yes, this is the point I was making when I referred to the existing IL-rewrite solutions.

CyrusNajmabadi commented 7 years ago

I want the ability to hook into the symbolic layer of the Roslyn pipeline, through whatever means make this easiest to implement.

That would definitely be awesome. We just think that this is hard, and we'd need to spend a fair amount of time solving the problems. Nothing is insurmountable. But this will require a lot of manpower to get good.

--

To give some context, our general perf bar for something like IntelliSense is <15ms. We've tuned our compiler pipeline to be able to provide good results in those sorts of times. The moment you allow arbitrary code to execute in the middle of that (especially arbitrary code that might make any of our optimizations no longer valid), you've got a hard problem on your hands.

masaeedu commented 7 years ago

@CyrusNajmabadi Okay, but is it not possible to introduce this in some tightly constrained way that does not violate the assumptions you are making? You don't have to give the user the power to introduce entire syntactic trees, then reason from there. If you give them some limited ability to modify the symbolic model (and you can introduce this incrementally, e.g. starting with only allowing new type defs), it should be much easier to preserve your invariants.

masaeedu commented 7 years ago

@CyrusNajmabadi I would imagine the general symbolic model isn't changing with every keystroke, since most time is spent typing inside function implementations rather than modifying or introducing types. You could cache the output from these "symbolic hooks" with respect to the input very aggressively.

CyrusNajmabadi commented 7 years ago

I would imagine the general symbolic model isn't changing with every keystroke

That's how Roslyn works today: every keystroke produces a new symbolic model. So we'd have to really change things to make that not the case.

Okay, but is it not possible to introduce this in some tightly constrained way that does not violate the assumptions you are making?

That would be the goal. But that needs a design, and all the time to actually implement.

Again, I'm sure we could do something. It's just non-trivial.

masaeedu commented 7 years ago

@CyrusNajmabadi Interesting. I was under the impression there was some degree of memoization in the transformation from the syntactic to semantic models. Does this mean it is just as expensive to update the semantic model after typing a single character as after creating 10 new classes and 5000 lines of code? Would it be possible to efficiently diff semantic models?

CyrusNajmabadi commented 7 years ago

I was under the impression there was some degree of memoization in the transformation from the syntactic to semantic models.

Only for metadata references. Not for source. We generate all the symbols and all the binding information all over again. :)

Does this mean it is just as expensive to update the semantic model after typing a single character as after creating 10 new classes and 5000 lines of code?

Sorta. You'll always have to reanalyze 100% of the code. That cost will be higher overall if you have a one-million-line project and you add 1 line versus 1 million lines. After all, in the latter case you now have 2 million lines to analyze.

But it's better to say: If you added 100 lines and removed 100 lines, it would cost as much as adding 10000 lines and removing 10000 lines. (Note: this is just referring to symbols/semantic analysis. We do have a fairly robust incremental parser that will do a good job reusing syntactic data across edits).

CyrusNajmabadi commented 7 years ago

Would it be possible to efficiently diff semantic models?

Maybe :)

We don't know. Attempts in the past to do this have never worked out well. Indeed, the old VB compiler used to do this and it turned out to be more expensive than just recomputing all the information from scratch. :D

mattwar commented 7 years ago

The replace/original part of the source generation feature is not the complicated part (it has a few design issues, but they're relatively trivial compared to the typing-in-the-IDE problem).

Solving the typing problem is difficult but not impossible. The feature is not being worked on because of a lack of appetite to commit resources to solving the problem.

masaeedu commented 7 years ago

Hmm. It's hard to get a sense of scale for the problem without any concrete stats (perhaps I should go use Roslyn to analyze Roslyn 😄). I mean the symbolic model has an inherent hierarchy to it as well; you have files at the top, and classes in the file.

Only for metadata references

Is there a "semantic model" that exists as a distinct concept from symbols (i.e. after binding)? If this is what Intellisense works from, users could accomplish the same thing working with that, and the input size would be smaller still.

the old VB compiler used to do this and it turned out to be more expensive than just recomputing all the information from scratch

Was it doing this directly on the AST or did it have a concept of an intermediate "symbolic representation" to work from?

Solving the typing problem is difficult but not impossible. The feature is not being worked on because of a lack of appetite to commit resources to solving the problem.

I understand. I'm trying to figure out if there are any tractable subproblems. There are bits and bobs to this feature that would still be very useful on their own. E.g. providing a compiler API to modify only types that have been marked with a particular attribute. Only allowing registration of declarative transformations to the semantic model (instead of arbitrary C# delegates). There are a billion useful subfeatures you could extract from this, none of which suffer from the scaling issues being discussed here.

CyrusNajmabadi commented 7 years ago

Is there a "semantic model" that exists as a distinct concept from symbols (i.e. after binding)? If this is what Intellisense works from, users could accomplish the same thing working with that, and the input size would be smaller still.

No, there is not. IntelliSense works off of symbols and the results of the compiler binding code. It has to. If you have something like:

Some.Complex(expr => 
    With(lots => 
          new Of { 
               NestedCode = you => really(have - to, bind, all[this])).ToFigureOut(what, goes.here

CyrusNajmabadi commented 7 years ago

Was it doing this directly on the AST or did it have a concept of an intermediate "symbolic representation" to work from?

The latter.

It turns out symbolic diffing/tracking/incremental work is actually really hard :) That's because so much of the symbolic system is not codified in your data structures, but instead in the logic of how they're processed. So actually trying to do any sort of diff to figure things out becomes massively difficult. Take the simplest example: scopes. Scopes affect everything in the language. But they're all implicit. We don't have a scope data structure. So even figuring out whether any of your changes affected scope is difficult. And even if you did figure out which scopes were affected, figuring out what set of semantics would need reanalysis would be hard as well.
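A tiny example of why (a sketch): one added line silently rebinds a name elsewhere, and nothing at the use site records that dependency:

class C
{
    int x = 2;

    int M()
    {
        // Adding the following line rebinds the 'x' below from the field to a
        // local - even though the 'return' statement itself never changed:
        // int x = 1;
        return x;
    }
}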

CyrusNajmabadi commented 7 years ago

E.g. providing a compiler API to only modify types that have been marked with a particular attribute.

Sounds like a good idea. Though there's still the question of "what can this modification do?"

Only allowing registration of declarative transformations to the semantic model (instead of arbitrary C# delegates).

What is a declarative transformation?

There's a billion useful subfeatures you could extract from this, none of which suffer from the scaling issues being discussed here.

Let's speak concretely.

masaeedu commented 7 years ago

...scopes...

Fair point :(

What is a declarative transformation?

I pass you (the compiler API) an inert object representing the transformation I want to make. You never call my code again.

CyrusNajmabadi commented 7 years ago

Describe this inert object. How does it work?

masaeedu commented 7 years ago

Though there's still the question of "what can this modification do?"

List of transformations your users want minus things that are unsupportable from an implementation complexity or performance standpoint.

Describe this inert object. How does it work?

Okay, here's an example off the top of my head. I pass you a dictionary mapping an attribute type to an interface type. Every type marked with the attribute and declaring the specified interface is exempted from having to implement the interface's members. You (the C# team) can do whatever optimizations you like, because you are the only one running any code. Heck, you could even deserialize this (from assembly metadata? a config file?) instead of having it passed via the compiler API.
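Something like this, say (every name is hypothetical; the point is only that the compiler receives data, not callbacks):

using System;
using System.Collections.Generic;

class AutoNotifyAttribute : Attribute { }

static class Example
{
    // The "inert object": a declarative map the compiler host could consume.
    // Types marked [AutoNotify] that declare INotifyPropertyChanged would be
    // exempt from implementing its members; the compiler would fill them in.
    static readonly Dictionary<Type, Type> Exemptions = new Dictionary<Type, Type>
    {
        [typeof(AutoNotifyAttribute)] =
            typeof(System.ComponentModel.INotifyPropertyChanged),
    };

    // Hypothetical registration point - no user code runs during compilation:
    // compilerHost.RegisterInterfaceFillTransform(Exemptions);
}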