Feature Request: Expose ability to query Roslyn about current solution in Visual Studio

YairHalberstadt commented 5 years ago

This issue is based on a discussion on Gitter, starting at https://gitter.im/dotnet/csharplang?at=5c3fa0f820b78635b634d230

It is often the case that developers would like to gain insight into some specific metric about the code base they're working on.

Examples mentioned were:

Ratio of structs to classes
Ratio of static to instance methods

Other metrics might be

Number of classes/interfaces/structs etc. in a project
Usages of some specific set of types/methods etc.
What is the greatest number of values an Enum has in the solution?

I have often been interested in similar such metrics, but have not found it worth the effort to create a ConsoleApp to do.

The thing that underpins all of these is that they are relatively trivial to calculate given access to Roslyn, but rarely important enough to go to the effort of writing a ConsoleApp for. If they could be calculated directly from the solution in a couple of lines of code though, I think it likely that people would do so.
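To illustrate how little code such a metric needs once Roslyn is available, here is a minimal sketch of the struct-to-class count over a single source text. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package; the `TypeMetrics` class and its method name are hypothetical, and the count is purely syntactic (each partial declaration is counted separately, and records are ignored).

```csharp
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of Roslyn.
public static class TypeMetrics
{
    // Counts class and struct declarations (including nested ones) in one source text.
    public static (int Classes, int Structs) CountClassesAndStructs(string source)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();
        var classes = root.DescendantNodes().OfType<ClassDeclarationSyntax>().Count();
        var structs = root.DescendantNodes().OfType<StructDeclarationSyntax>().Count();
        return (classes, structs);
    }
}
```

Against a live solution, the same query would presumably run over the syntax root of every document rather than over a string.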

IntelliSense uses Roslyn behind the scenes, so much of this information should be available with reasonable performance in Visual Studio, if there were some way of accessing Roslyn from VS directly.

The C# Interactive window could be one means of exposing a subset of Roslyn APIs so that they could be queried from Visual Studio.

Access to SyntaxTree, SemanticModel, Compilation, Document, Project, and Solution objects would provide for a very reasonable Minimal Viable Product, where all would be contextually determined by the 'active window' or 'active project', or by the open solution.

It would also hopefully be possible to write extension packages on top of those APIs, which might make querying them more fluent via a LINQ-like set of APIs. This should currently be possible via NuGet, but it would help if the experience in the C# Interactive window surrounding NuGet could be improved a little, as I believe it's currently necessary to download the NuGet package and reference the dll directly.

Whilst initially this feature could be read-only, to gain real power it ought to be able to alter the documents in the solution. This would allow Roslyn to be used as a find and replace on steroids, and would hopefully lessen the use of horrible regexes when doing any reasonably complex Find and Replace.
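As a sketch of what syntax-aware find and replace can look like, the following rewriter renames an attribute wherever it is applied while leaving comments and string literals that merely contain the same text untouched. It assumes the Microsoft.CodeAnalysis.CSharp package; `AttributeRenamer` is a hypothetical name, and it only handles unqualified attribute names written without the `Attribute` suffix.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical rewriter: renames [from] to [to] on every attribute usage.
public sealed class AttributeRenamer : CSharpSyntaxRewriter
{
    private readonly string _from;
    private readonly string _to;

    public AttributeRenamer(string from, string to)
    {
        _from = from;
        _to = to;
    }

    public override SyntaxNode VisitAttribute(AttributeSyntax node)
    {
        // Only simple names are handled; qualified names such as NUnit.Framework.Test are skipped.
        if (node.Name is IdentifierNameSyntax id && id.Identifier.ValueText == _from)
        {
            // Keep the original whitespace and comments around the name.
            var newName = SyntaxFactory.IdentifierName(_to).WithTriviaFrom(id);
            return node.WithName(newName);
        }
        return base.VisitAttribute(node);
    }

    // Convenience wrapper: parse, rewrite, and return the new source text.
    public static string Rewrite(string source, string from, string to)
    {
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();
        return new AttributeRenamer(from, to).Visit(root).ToFullString();
    }
}
```

For example, `AttributeRenamer.Rewrite(source, "Fact", "Test")` changes `[Fact]` on a method but not the word "Fact" inside a comment, which is the kind of distinction a regex struggles to make.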

Consider for example this script I hacked together to convert Fixie tests to NUnit: https://github.com/YairHalberstadt/FixieToNunit/blob/master/Source/Program.cs.

Simple as it is, it took many iterations to get right, and each iteration involved passing in the path to a file, project or solution, running the Console App, checking the changes in the VS Diff window, resetting all changes, and iterating. Being able to do all this without leaving Visual Studio would make the whole process far smoother.

Does this sound like a reasonable set of ideas?

jnm2 commented 5 years ago

I keep a console app around for this but it's many additional tedious steps and nowhere near as nice as an integrated query window in the IDE.

The last query I manually implemented searched for all IMethodSymbols in the solution with a [Test] attribute and a Nullable<> parameter. Sadly I didn't save all the previous queries, but they were a mix of syntactic and symbolic approaches to answering various metric questions.
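A query of that shape might look roughly like the sketch below. It is an assumption about the approach rather than the code actually used: it matches the attribute by the simple class name `TestAttribute` (so any attribute with that name qualifies, not only NUnit's), and `TestMethodQuery` is a hypothetical helper. It assumes the Microsoft.CodeAnalysis.CSharp package.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper, not part of Roslyn.
public static class TestMethodQuery
{
    // Yields every declared method that has a [Test] attribute and at least one Nullable<T> parameter.
    public static IEnumerable<IMethodSymbol> FindTestMethodsWithNullableParameter(Compilation compilation)
    {
        foreach (var tree in compilation.SyntaxTrees)
        {
            var model = compilation.GetSemanticModel(tree);
            foreach (var declaration in tree.GetRoot().DescendantNodes().OfType<MethodDeclarationSyntax>())
            {
                if (model.GetDeclaredSymbol(declaration) is IMethodSymbol method
                    && method.GetAttributes().Any(a => a.AttributeClass?.Name == "TestAttribute")
                    && method.Parameters.Any(p =>
                        p.Type.OriginalDefinition.SpecialType == SpecialType.System_Nullable_T))
                {
                    yield return method;
                }
            }
        }
    }
}
```

Given a `Solution`, the same method could be called once per project with the result of `project.GetCompilationAsync()`.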

This is also not the first time I wanted something more powerful than regex for tedious tasks.

CyrusNajmabadi commented 5 years ago

Boiling it down, it seems like the primary goal would be to provide this:

The C# Interactive window could be one means of exposing a subset of Roslyn APIs so that they could be queried from Visual Studio.

Access to SyntaxTree, SemanticModel, Compilation, Document, Project, and Solution objects would provide for a very reasonable Minimal Viable Product, where all would be contextually determined by the 'active window' or 'active project', or by the open solution.

@jasonmalinowski @tmat How possible/difficult would it be to make it so that someone could access these VS in-memory objects from the C# Interactive window? To scope things down small, let's just start with:

Could we make it so that someone could access the VSWorkspace's .CurrentSolution from the interactive window easily?

This would open up a lot of "introspect your code on the fly" scenarios. And, if we can get .CurrentSolution, almost everything else can be brought along easily from there.

--

Note: i'm both a fan of this proposal, and of limiting exposure to just immutable objects for now. That limits the surface area of potential damage that someone can do :)

jasonmalinowski commented 5 years ago

I don't think it'd be too tricky; the only rub I see is the VisualStudioWorkspace lives in the devenv.exe process and the stuff you put in the interactive window lives in the InteractiveHost.exe process. Thankfully @heejaechang has already written the code to mirror a snapshot from one process to another, so I think it'd just be a matter of sticking all the pieces together. If I recall, there's already support in the interactive scripting host for introducing variables that the host can provide.

CyrusNajmabadi commented 5 years ago

@jasonmalinowski Nice!

I don't think OOP should be too much trouble either. I already created the code to help roundtrip a symbol to/from OOP. So as long as the result can be computed on the OOP side, we should be able to marshal it back to the interactive side in many cases :)

jasonmalinowski commented 5 years ago

You wouldn't want to be roundtripping symbols, since the moment you want to write something like document.GetSemanticModel().GetOperations() you're now trying to figure out how to roundtrip IOperations and the like.

CyrusNajmabadi commented 5 years ago

@jasonmalinowski Do you think this would be a good idea? I personally like it a lot. One concern i would have though would be the implications of something now holding onto a snapshot for very long periods of time. I suppose it's no worse though than having any sort of extension that ends up doing the same though...

CyrusNajmabadi commented 5 years ago

you're now trying to figure out how to roundtrip IOperations and the like.

I'm curious what you think we should be doing in that case then. Disallow it? Marshal over some sort of remoting type? Rehydrate on the VS side (how OOP operations with symbols work today)?

jasonmalinowski commented 5 years ago

I'd just mirror the entire Solution object with the code HeeJae already has; then it's local to the InteractiveHost process and you can use it directly. It would even mean things like Document.WithDocumentText(...., PreserveIdentity) would preserve identity if you were doing syntax annotations.

YairHalberstadt commented 5 years ago

Would it be possible to just send the text from the interactive window across to the Roslyn process, and compile and run the text there?

jasonmalinowski commented 5 years ago

Oh, to the Roslyn OOP process? I'm not sure, but that might work. @tmat?

CyrusNajmabadi commented 5 years ago

That would certainly be desirable to avoid needing another full copy of the Solution somewhere. They are often rather large at several hundreds of MB. Being able to just use the real hydrated OOP solution would def have pros.

tmat commented 5 years ago

This feature request is essentially asking for scripting of the Roslyn object model from the Interactive Window.

I agree that running the code in the process that has the data (Roslyn service process) would be the most efficient approach, however you might end up killing or corrupting the process by accident (e.g. accidental stack overflow). So I'd be hesitant to do that.

I'm not sure performance is super important in this scenario though. Why not create MSBuildWorkspace in Interactive Window as is? It can be done today, although it might be a bit complicated due to the lack of nuget support in #r and some other missing interactive features. But you can likely work around these and write yourself a csx file that does the necessary boilerplate and then #load it whenever you want to run some script against the solution.
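The boilerplate for the MSBuildWorkspace route might look something like the sketch below. It is written as plain C# rather than as a .csx file, since how the references would be supplied in the Interactive Window is exactly the open question here. It assumes the Microsoft.CodeAnalysis.Workspaces.MSBuild, Microsoft.CodeAnalysis.CSharp.Workspaces and Microsoft.Build.Locator NuGet packages; `SolutionLoader` and `DescribeProjects` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.MSBuild;

// Hypothetical helper, not part of Roslyn.
public static class SolutionLoader
{
    // Loads a solution from disk; this re-runs the design-time build,
    // which is the cost discussed in this thread.
    public static async Task<Solution> LoadAsync(string solutionPath)
    {
        // MSBuild must be located before the workspace is created.
        if (!MSBuildLocator.IsRegistered)
        {
            MSBuildLocator.RegisterDefaults();
        }

        var workspace = MSBuildWorkspace.Create();
        return await workspace.OpenSolutionAsync(solutionPath);
    }

    // Example query: one summary line per project.
    public static IEnumerable<string> DescribeProjects(Solution solution) =>
        solution.Projects.Select(p => $"{p.Name}: {p.DocumentIds.Count} documents");
}
```

A script would then call `await SolutionLoader.LoadAsync(path)` once and run any number of queries against the returned immutable `Solution`.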

CyrusNajmabadi commented 5 years ago

I'm not sure performance is super important in this scenario though. Why not create MSBuildWorkspace in Interactive Window as is?

I think the goal is to be able to run against the live object model. i.e. say you wanted to make a change, and then verify that. ideally without paying for several minutes of cost by instantiating a new MSBuildWorkspace that would need to perform all the same loading work over again. Note: for small cases, i can see this approach being ok. For larger cases though (i.e. introspecting Roslyn itself), it would likely be a lot of overhead that would make it undesirable.

tmat commented 5 years ago

@CyrusNajmabadi I don't see a complaint about perf in the initial proposal. Let's start with the simple approach that's possible today first and see how much performance is a problem. Then if it's important enough we can look into how to make it faster.

jnm2 commented 5 years ago

Running the code out of process also has the benefit that you can eat up as many system resources as you please without being limited by VS. I could envision wanting to query something and being frustrated at poor performance. ("I bought RAM so that VS would use it; why isn't it using it?" etc)

tmat commented 5 years ago

@jnm2 We would definitely not run this in devenv process. I would say we don't want to run interactive sessions in Roslyn services process, which is less limited on resources than devenv, either due to potential for corruption/killing of the process by accident.

CyrusNajmabadi commented 5 years ago

either due to potential for corruption/killing of the process by accident.

Note: if we were concerned about that, i would think we could likely effectively trap/report this sort of thing. i.e. if the service process goes down because of a crash, we could trap that and blame the calling interactive entrypoint. Similarly, if it was taking a long time, we could name and shame the calling entrypoint.

But i get the desire to avoid that whole can of worms as well :) It just seems a little unfortunate as large solutions can take minutes to hydrate up fully. So having to incur that cost that was already paid seems unfortunate.

@CyrusNajmabadi I don't see a complaint about perf in the initial proposal. Let's start with the simple approach that's possible today first and see how much performance is a problem. Then if it's important enough we can look into how to make it faster.

Sure. Although, it is worth pointing out that @YairHalberstadt is often delving into Roslyn.sln itself :) And, it would be great for the team to be able to dogfood this. If it's the case that even asking a question takes 3+ minutes for hydration to happen, that's less likely IMO.

Neme12 commented 5 years ago

I don't see a complaint about perf in the initial proposal.

Yeah, waiting 10 minutes for the solution to load isn't OK. How can you say "but nobody mentioned perf in the feature request"? The whole purpose of this feature request is to save time on creating a new project, and be able to run queries on the current open solution quickly.

CyrusNajmabadi commented 5 years ago

I personally agree. It's effectively implicit in any sort of request like this that the perf be at some sort of reasonable level.

I think people can understand needing the time necessary for the solution to hydrate, plus maybe a little extra cost on that. However, doubling that time just seems excessive.

tmat commented 5 years ago

I did not say waiting 10 minutes for solution load is ok. Not all solutions take 10 minutes to load though. You can write a prototype of the overall experience that works well on small solutions. Then we can look into hydrating the interactive process faster using data from a process that has the solution loaded already. To do so we need to see what's taking the most time. I suspect design time build does. So perhaps it would be enough to copy the results of design time build (projects, their options, source file and metadata reference paths) over to the other process and construct a workspace instance from this information. Maybe we will need to go further and copy the state of metadata reference binding, or declaration tables, etc. We would need to see what the profiler tells us.

jinujoseph commented 5 years ago

Design Meeting Notes

Anyone can write an extension that loads the workspace, gets the current solution, goes through all the project info, and dumps the results into a JSON file. Next time, you can just read the JSON file to save time, using the same solution ID and document IDs to recreate the solution from it. This might be a good first approach to this problem, and if we see the value proposition for having a built-in solution then we should invest in it.
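A rough sketch of that dump-and-reload idea follows. It is heavily simplified and makes several assumptions: the `ProjectSnapshot` shape and the `SolutionCache` helper are hypothetical, only source file and metadata reference paths are stored, project references and compilation/parse options are dropped, and fresh IDs are generated instead of reusing the original solution and document IDs mentioned in the notes. It assumes the Microsoft.CodeAnalysis.Workspaces.Common and Microsoft.CodeAnalysis.CSharp.Workspaces packages plus System.Text.Json.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;
using Microsoft.CodeAnalysis;

// Hypothetical DTO: the minimum needed to rebuild a project without a design-time build.
public sealed class ProjectSnapshot
{
    public string Name { get; set; } = "";
    public string Language { get; set; } = "";
    public List<string> SourceFiles { get; set; } = new List<string>();
    public List<string> MetadataReferences { get; set; } = new List<string>();
}

// Hypothetical helper, not part of Roslyn.
public static class SolutionCache
{
    // Dumps the paths that describe each project to JSON.
    public static string ToJson(Solution solution)
    {
        var snapshots = solution.Projects.Select(p => new ProjectSnapshot
        {
            Name = p.Name,
            Language = p.Language,
            SourceFiles = p.Documents.Select(d => d.FilePath).Where(f => f != null).ToList(),
            MetadataReferences = p.MetadataReferences
                .OfType<PortableExecutableReference>()
                .Select(r => r.FilePath)
                .Where(f => f != null)
                .ToList(),
        }).ToList();
        return JsonSerializer.Serialize(snapshots);
    }

    // Rebuilds an in-memory solution from the JSON; file contents are loaded lazily from disk.
    public static Solution FromJson(string json)
    {
        var snapshots = JsonSerializer.Deserialize<List<ProjectSnapshot>>(json) ?? new List<ProjectSnapshot>();
        var workspace = new AdhocWorkspace();
        foreach (var snapshot in snapshots)
        {
            var projectId = ProjectId.CreateNewId(snapshot.Name);
            var documents = snapshot.SourceFiles.Select(path => DocumentInfo.Create(
                DocumentId.CreateNewId(projectId, path),
                name: Path.GetFileName(path),
                loader: new FileTextLoader(path, defaultEncoding: null),
                filePath: path));
            var info = ProjectInfo.Create(
                projectId, VersionStamp.Create(), snapshot.Name, snapshot.Name, snapshot.Language,
                documents: documents,
                metadataReferences: snapshot.MetadataReferences.Select(r => MetadataReference.CreateFromFile(r)));
            workspace.AddProject(info);
        }
        return workspace.CurrentSolution;
    }
}
```

Whether this is fast enough in practice would depend on how much of the load time is really spent in the design-time build, which is the profiling question raised earlier in the thread.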

dotnet / roslyn

Feature Request: Expose ability to query Roslyn about current solution in Visual Studio #32556