dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.05k stars 4.04k forks source link

VB rename refactoring should affect class member and interface member independently. #21339

Open ericmutta opened 7 years ago

ericmutta commented 7 years ago

Version Used:

Visual Studio 2017 Community (v15.2)

Steps to Reproduce:

Type in the following code in a Console project (though any kind of project will do):

Module Module1
  Public Interface ITest
    Property Foo As Integer
  End Interface

  Public Class CTest
    Implements ITest

    Private Property Foo As Integer Implements ITest.Foo
  End Class

  Sub Main()
  End Sub
End Module

In VB the property CTest.Foo is allowed to have a different name yet still implement ITest.Foo as follows:

    Private Property mFoo As Integer Implements ITest.Foo

This is very handy because you can use coding conventions such as prefixing private members with "m" yet still implement interfaces. The rename refactoring logic doesn't seem to understand the fact that the class member CTest.Foo can be renamed to CTest.mFoo and still implement ITest.Foo, so if you right-click one member and rename it, it renames the other too.

Expected Behavior:

One should be able to invoke rename on ITest.Foo or CTest.Foo and have the operation complete without changing the other name as well. I suggest the following rules:

  1. When renaming a class member that implements an interface member, you should NEVER rename the interface member as well (remember interfaces can be used by external code and renaming members will break that code).
  2. When renaming an interface member Foo, you should rename class members that implement that member only if the class members use the name Foo as well (in this case you are assuming the user wants the names to be in sync for clarity and simplicity, which is a reasonable assumption). If the class member is called something different to the interface member (e.g mFoo) then it should NOT be renamed.

Rule number 1 is unavoidable and should be implemented to prevent bugs. Rule number 2 could be made a setting like "Sync class/interface members during rename" so users can choose the current behaviour by turning on the setting, or the behaviour descibed in rule 2 by turning off the setting.

Actual Behavior:

When you right click CTest.Foo and choose rename, then change it to CTest.mFoo you find that ITest.Foo is also renamed to ITest.mFoo. Having noticed the error, if you rename ITest.mFoo back to ITest.Foo you find that CTest.mFoo also goes back to being called CTest.Foo. The only way to break the coupling is to temporarily comment out the Implements clause, rename the class member, then uncomment the Implements clause.

Pilchie commented 7 years ago

@jasonmalinowski @DustinCampbell do you remember the thinking here? Any thoughts on the proposed changes here?

@kuhlenh FYI.

DustinCampbell commented 7 years ago

The thinking at the time was, yes, we should do that. :smile:

ericmutta commented 7 years ago

Hi @DustinCampbell kindly clarify...the thinking at the time was that you should do something like the suggested rules...? If so, what snags did you hit that made you leave it the way it behaves now?

CyrusNajmabadi commented 7 years ago

I'm amenable to changes here. However, i honestly don't know if i agree with:

When renaming a class member that implements an interface member, you should NEVER rename the interface member as well (remember interfaces can be used by external code and renaming members will break that code).

First, I rename class members all the time that implement interface members and i like and want that the interface member is renamed.

Second, if we make these changes we now have a situation where rename is asymettric. i.e. the location that you invoke rename now matters, and users need to know that that's the case. Currently, that's now how rename works. You can rename from anywhere and all the cascading means that you should be getting the same set renamed no matter what.

Third, because this is a change in behavior, i think it would have to come with an option to allow users to control this. Otherwise we'll just get people reporting bugs saying "In VS2015 i was able to rename methods from the implementation point, and it all worked properly. Now i rename the method and my names get out of sync. "

Given the lack of any feedback here until now I would keep the behavior as it is now, but potentially include an option for someone to say that they don't want interface members to be renamed if renaming an implementation of that member.

DustinCampbell commented 7 years ago

@ericmutta: Sorry, for any confusion. I think I misread the issue here. I was thinking that we weren't already renaming the interface method and was advocating that we should. :smile:

I actually disagree with the proposal to stop renaming the interface method. I'm in agreement with all of @CyrusNajmabadi's points.

ericmutta commented 7 years ago

@CyrusNajmabadi: Given the lack of any feedback here until now I would keep the behavior as it is now, but potentially include an option for someone to say that they don't want interface members to be renamed if renaming an implementation of that member.

That part at the end about having an option is what I am really hoping for! Here is an example (perhaps an extreme one but still very real) that shows a scenario where you absolutely do NOT want the current "symmetric renaming" behaviour:

  1. Load that giant solution that contains all the .NET framework base class libaries. Outside Microsoft we get it on the reference source page, though unfortunately it doesn't build because there are some dependencies missing, so I couldn't actually try this on my side.

  2. Add a new VB class library project (since the symmetric renaming behaviour happens in VB) and paste in the following code:

Module WorldEnder
  Private Class Armageddon
    Implements IEnumerable

    Private Function GetEnumerator() As IEnumerator Implements IEnumerable.GetEnumerator
    End Function
  End Class
End Module
  1. Build the solution and everything should be fine, you've added a class that doesn't do anything and is not even visible outside its assembly.

  2. Pray for forgiveness, kiss loved ones goodbye, then proceed to the next step.

  3. Invoke the rename refactoring on Private Function GetEnumerator() so it ends up reading Private Function MyGetEnumerator() (i.e we've added the letters MY at the beginning of the name).

  4. Since we have symmetric renaming in VB and you renamed a method implementing IEnumerable.GetEnumerator and the rename refactoring works across projects in different languages, the renamer will go ahead and find the relevant C# file then change IEnumerable.GetEnumerator to IEnumerable.MyGetEnumerator to keep it symmetric.

  5. You have now broken every single .NET assembly compiled in the last 17 years because IEnumerable is such a core part of .NET (all For Each loops use it) and the VB renamer decided to keep its name in sync when you renamed a PRIVATE method in a PRIVATE class that doesn't even do anything.

Granted the above horror scenario could never really happen for the .NET base class libraries since you would know immediately that something is wrong when you go to check in the source files you just added only to find there's hundreds, possibly thousands, of other files that need to be checked in because a member of the pervasive IEnumerable interface was renamed.

...but it can happen in pretty much any project because while humans know that renaming interface members is a breaking change and we avoid it, the VB renamer doesn't currently consider that its symmetric behaviour can cause breaking changes.

I don't know about you guys, but for me the idea of breaking other people's code by renaming PRIVATE methods in my code doesn't feel right at all.

At the very least, I suggest we have a prompt. When you rename a member that implements an interface, it just asks if you want to rename the member on the interface as well, with the accompanying warning that you may break code.

Having a prompt instead of an option solves the problem while making implementation easier: you are essentially adding an if-statement somewhere and showing a Yes/No message box. You don't need UI to allow the option's value to be viewed, edited and saved.

Apologies for the lengthy response, but I hope I have given you food for thought! :)

CyrusNajmabadi commented 7 years ago

At the very least, I suggest we have a prompt. When you rename a member that implements an interface, it just asks if you want to rename the member on the interface as well

I'm amenable to that. Feel free to contribute this as well if you'd like!

CyrusNajmabadi commented 7 years ago

with the accompanying warning that you may break code.

All non-private renames are breaking changes. I don't think we need to state that as it's pretty clear :)

ericmutta commented 7 years ago

All non-private renames are breaking changes. I don't think we need to state that as it's pretty clear :)

Indeed...except here we have a PRIVATE rename indirectly causing a non-private rename. Feels like changing the channel on your TV and causing your neighbour's car to blow up...a nasty surprise to say the least 😄

ericmutta commented 7 years ago

I'm amenable to that. Feel free to contribute this as well if you'd like!

I certainly would! @pilchie once told me the same thing on another issue and what happened is, I went googling around the code trying to figure out where the change needs to be made. Since the codebase is foreign, this took long, someone interrupted me at the office and before you know it, you've wandered off 😞

I love that this thing is open source, however those of us outside Microsoft are at a distinct disadvantage when it comes to making code contributions, since we can't do what an intern would do there: walk up to a colleague and get a guided tour of the codebase on day one, then go off and figure out the rest ourselves.

It's also a bit challenging for us to figure out who we are talking to (and hence what questions we can ask) because many of your GitHub profiles simply list a Microsoft email address and nothing more!

Those challenges aside, I am going to pester you @CyrusNajmabadi until I can do my first pull request for the idea suggested above (having a prompt). So please help me get started by pointing me to the files where VB rename is implemented (doing a search for 'rename' on the Roslyn enhanced source view brings OVER SIX HUNDRED RESULTS and that "where do I begin" problem kicks in all over again 😞).

Pilchie commented 7 years ago

@jasonmalinowski and @dpoeschl can probably give some guidance about where to change things here. Unfortunately, both of them are on vacation today, but @jasonmalinowski should be back soon.

In the meantime, you might start by looking at the VisualBasicRenameRewriterLanguageService

ericmutta commented 7 years ago

Many thanks @Pilchie! I have signed the agreement, cloned the repo, built it and now running the tests as I write this. If all goes well, I will start experimenting with the renaming feature under the debugger to see what makes it really tick.

There were some minor issues (let me know if these should be reported seperately):

1) The DocuSign website does not appear to work with Chrome (I have the latest v60). I tried several times and the progress bar would load and get stuck. Doing it in Internet Explorer worked first time round.

2) The "Test.cmd" batch file for running tests seems to break if you cloned a repo to a local path containing spaces. I saved to D:\Open Source\rosyln and got errors about path 'D:\Open ' not being found. Changing this to D:\OpenSource\rosyln seemed to fix the problem.

Pilchie commented 7 years ago

The second at least is known at https://github.com/dotnet/roslyn/issues/20929. @jaredpar or @jasonmalinowski - do you know who to talk to about the CLA site?

ericmutta commented 7 years ago

@Pilchie Thanks for following up! Here are some more (will just collect them here as I find them and then later on could move them to dedicated issues if you advise me to do so).

1) With VS Community 2017 v15.2 (26430.16) all C# source files using the default keyword are reported as having a syntax error (e.g. many functions have the last parameter defined as CancellationToken cancellationToken = default and have red squiggles on the parameter name and the bracket just after default). The code still builds and runs, which suggests a mismatch between the compiler and IDE.

2) As per the instructions here I set my startup project to 'VisualStudioSetup.Next' then hit F5. In the isolated instance, any attempt to create a project causes Microsoft VisualStudio Composition CompositionFailedException and an error message: An exception was thrown while initializing part Microsoft VisualStudio LanguageServices SymbolSearch VisualStudioSymbolSearchProgressService. The debugger in the original instance stops on MefWorkspaceServices.GetService() with the expression service.Value being null.

Roslyn is a complex beast so I expect to struggle the first time round, but with your help, I will endure until my first PR :-)

Pilchie commented 7 years ago

We currently depend on Visual Studio 2017 version 15.3, which should address both of the issues above. Did you find documentation that 15.2 was necessary, or that's just what you had?

ericmutta commented 7 years ago

Did you find documentation that 15.2 was necessary...

Quite the opposite...there WASN'T any documentation saying Rosyln development requires v15.3 (which I only discovered exists when you mentioned it...upgrading right away!).

ericmutta commented 7 years ago

@Pilchie We currently depend on Visual Studio 2017 version 15.3, which should address both of the issues above.

Problem solved! I have been able to run the IDE under the debugger and figured out how the rename is triggered.

When renaming a method that implements an interface member, we end up in the InlineRenameSession constructor where can inspect the renameInfo parameter value using the Watch window. It is an instance of the SymbolRenameInfo class. That class has a RenameSymbol property of type ISymbol which tells us about the method being renamed.

The renameInfo parameter mentioned above, is retrieved from AbstractEditorInlineRenameService GetRenameInfo where I can see a number of validation checks, including some VB specific ones. These checks look at the symbol kind, language and other properties which sounds like what I need to do. Here is where @jasonmalinowski or @dpoeschl or anyone else could give me guidance:

1) Starting with an instance of ISymbol how do I navigate (via casting perhaps?) to an instance of SourceMethodSymbol so I check the syntax details to see if the method has an Implements clause?

2) Having determined that the symbol being renamed is a method that implements an interface member, how do I go about showing the actual warning prompt that the associated interface member will be renamed automatically which may cause breaking changes? There's a lot of async code in Roslyn and I don't know how modal dialogs fit into it.

Thanks for any help in advance!

As a side note, the AbstractEditorInlineRenameService is rather secretive. It uses private members that could be protected and virtual instead so language-specific implementations like VisualBasicEditorInlineRenameService can hook and customize its behaviour while reusing its logic. In fact there is a TODO comment in AbstractEditorInlineRenameService which says the VB specific changes should be in a subclass, but the way things are private right now, that would be hard to do without duplicating code.

CyrusNajmabadi commented 7 years ago

Starting with an instance of ISymbol how do I navigate (via casting perhaps?) to an instance of SourceMethodSymbol so I check the syntax details to see if the method has an Implements clause?

You can either use ISymbol.DeclaringSyntaxReferences or ISymbol.Locations.

It uses private members that could be protected and virtual instead so language-specific implementations like VisualBasicEditorInlineRenameService can hook and customize its behaviour while reusing its logic. In fact there is a TODO comment in AbstractEditorInlineRenameService which says the VB specific changes should be in a subclass, but the way things are private right now, that would be hard to do without duplicating code.

Feel free to make protected and virtual. There is no requirement that internal implementation details remain the way they are.

CyrusNajmabadi commented 7 years ago

how do I go about showing the actual warning prompt that the associated interface member will be renamed automatically which may cause breaking changes?

You can use INotificationService.ConfirmMessageBox

jaredpar commented 7 years ago

@ericmutta the limitation on not having spaces in the path is known and tracked here https://github.com/dotnet/roslyn/issues/20929

jasonmalinowski commented 7 years ago

You can use INotificationService.ConfirmMessageBox

Not sure we'd want a modal dialog there given our existing experience. In my mind I think this parallels our "rename overloads" flag or "rename in strings". It's a checkbox that sometimes applies, sometimes doesn't, and you probably want to persist the checkbox between invocations rather than prompting each time. I think just do what rename overloads does -- you can then just follow the code how that flag gets passed around and respected, and do the similar thing.

You can either use ISymbol.DeclaringSyntaxReferences or ISymbol.Locations.

You could also just use https://github.com/dotnet/roslyn/blob/aeeec402a2f223580f36c298bbd9d92ffc94b330/src/Compilers/Core/Portable/Symbols/IMethodSymbol.cs#L150-L157 and avoid having to tinker with source entirely.

ericmutta commented 7 years ago

@jaredpar: the limitation on not having spaces in the path is known and tracked here #20929

Awesome...perhaps a quick fix for now would be to add a note in the getting started guide to let people know...or better yet: a warning message in the PowerShell/batch script that fires off the tests?

@CyrusNajmabadi @jasonmalinowski many thanks for the pointers, I will experiment with different options and let you know. One other possibility for showing the warning is something like this: vs_modeless_prompt

I broke something in the IDE and that message showed just below the tool bar (highlighted in red in the picture above). It is non-modal and "out of the way", yet visible enough to be noticed and easily dismissed by clicking on the X to the right. What is the API that shows messages like that instead of via the traditional message box?

Ultimately Visual Studio, being as extensible as it is, could probably use the equivalent of the Windows system tray or Android's notification drawer/shade where plugins could post notifications that are visible yet unobstrusive and in an area with sufficient visibility that they will eventually be seen.

UPDATE: just discovered that the notification in the image above is called an InfoBar and there's documentation for various ways of showing notifications in Visual Studio.

ericmutta commented 7 years ago

Hi @Pilchie, just wanted to ask: what are the recommended hardware specs for doing Rosyln development? I have a laptop using an Intel Core i7 quad-core CPU and 16GB of RAM...and it's really struggling to handle this codebase (every action takes multiple seconds to complete, sometimes completely freezing the machine). Do I need more RAM? Or a faster disk?

Pilchie commented 7 years ago

@ericmutta That's about what most of us have, and it's not great for anyone. Unfortunately, there are some performance issues in the new project system and in msbuild that are exacerbated by the new project system that we're working through. @davkean can probably point you at some of the tracking issues.

What type of disk do you have? Is it an SSD?

ericmutta commented 7 years ago

@Pilchie What type of disk do you have? Is it an SSD?

I have two physical drives, my boot drive is an SSD and is where I put my own projects. Currently the rosyln folder sits on my D drive with is regular HDD. Windows Explorer shows the folder has about 7GB of files, but interestingly the ..\rosyln\src folder is only 200MB which could be cached entirely in RAM many times over. I was worried about stressing the SSD with the many intermediate artifacts generated by building such a large project, but will move the folder over to see if it helps (incidentally, I tried the new lightweight solution load, it didn't make a noticable difference).

@Pilchie ...there are some performance issues in the new project system and in msbuild that are exacerbated by the new project system that we're working through...

Sounds good, I am sure it will get better. In fact when poking around the hidden .vs folder created in the same folder as one of my solutions, I noticed a couple of SQLITE databases in there. Perhaps a future direction should ask: if Visual Studio was SQL Server, how would it handle management of all the data required to do a build and generated after a build?

Using relational database technology may help with the perf issues and allow some very interesting scenarios (e.g the metadata for the .net framework class libraries can be put in a database so that searching in the object browser is essentially a SELECT * FROM Types WHERE Assembly = 'mscorlib' AND Category = 'Class' AND Name LIKE '*foo*'...which would be screaming fast compared to what we have now).

Food for thought!

ericmutta commented 7 years ago

@Pilchie What type of disk do you have? Is it an SSD?

So I did a little experiment to see if the perf issues were due to disk I/O speed...and they are not! Using the free and open source ImDisk Toolkit (which interestingly uses .NET Framework 4.0) I created a 10GB RAM disk and copied the entire rosyln folder into it:

vs_ram_usage

In the image above you can see ImDisk using 7GB of RAM as expected because that is how large the rosyln folder is. Then you have Visual Studio, MSBuild and ServiceHub using the most RAM compared to other running programs. CPU is at 69% percent after just opening the solution.

Loading and unloading the solution is just marginally faster when using a RAM disk (the fastest possible disk you can get), but general usage is still slow (e.g right clicking in the editor takes a few seconds to show the context menu), and it appears building/editing massive projects is CPU bound.

Since 99% of the 12,000+ source files in the rosyln folder are rarely modified during a fix, it would probably help over the long-term if rosyln was able to persist compiler parse trees and symbol tables to an RDBMS so it can be reused between solution rebuilds/reloads while opening up entirely new opportunities (e.g. you would be able to build a utility to check naming rule violations in 30 seconds by writing a select query like SELECT * FROM InterfaceTypes WHERE Name NOT LIKE 'I*').

CyrusNajmabadi commented 7 years ago

@ericmutta The primary problems now are not how roslyn handles symbols or tree, but how it interacts with the project system and how UI thread bound so many project system interactions are. Roslyn is actually highly efficient with storing data. for example, all our indices are already associated only with files, and only need to be updated when files themselves change.

Unfortunately, you're working with Roslyn in a state where it has switched over toa new project system and is completely stressing the bounds of what the current project system code is capable of. A lot of work is happening to make this project system behave a lot better and be better at scaling to a project the size of roslyn.

once this happens, you should find that performance returns back to a good state.

CyrusNajmabadi commented 7 years ago

Since 99% of the 12,000+ source files in the rosyln folder are rarely modified during a fix, it would probably help over the long-term if rosyln was able to persist compiler parse trees and symbol tables to an RDBMS so it can be reused between solution rebuilds/reloads while opening up entirely new opportunities (e.g. you would be able to build a utility to check naming rule violations in 30 seconds by writing a select query like SELECT FROM InterfaceTypes WHERE Name NOT LIKE 'I').

Roslyn data is already persisted to per-file DB info. I would like it to be that our work against those indices could happen inside the DB itself. However, the total size in memory for the indices is not that much (a few tens of MB) and processing that information inside the process itself is extremely fast. For example, on my machine, searching for all types that match a search string can happen in around a second.

One issue with the DB approach is doing things like pattern matching. We want to be able to let the user search for something like VSW.OpF and have it find and return VisualStudioWorkspace.OpenFile as a result. Today that requires custom pattern matching.

ericmutta commented 7 years ago

@CyrusNajmabadi: once this happens, you should find that performance returns back to a good state.

Much appreciate you taking the time to clarify and I look forward to developments in that area (just learned that the project system is on GitHub too when Pilchie mentioned it in #22014).

@CyrusNajmabadi: Roslyn data is already persisted to per-file DB info.

That's really neat! Where are the databases stored and is it possible to write SQL queries against them (even unofficially)?

@CyrusNajmabadi: One issue with the DB approach is doing things like pattern matching.

Good point. I remember in a past project I used SQL Server CLR integration to write a function that would call into .NET's Regex class so I could do pattern matching in queries and stored procedures. Worked very well performance wise and with SQL Server 2017 going cross-platform I imagine it could be used to resolve that issue (if and when it becomes as light as SQLITE).

All in all, this is proving to be an excellent way to learn about Rosyln and I very much appreciate all your collective effort and time in helping us external observers/contributors understand it 👍 👍

CyrusNajmabadi commented 7 years ago

That's really neat! Where are the databases stored

The db is generally stored under the project directory in ".vs\Roslyn\v15\sqlite3\storage.ide"

and is it possible to write SQL queries against them (even unofficially)?

Not really. The data is stored in a form that is designed to be pulled into memory in a highly condensed form. It's almost all compressed binary data optimized for precisely the types of operations we're performing. The DB usage is to give us ACID, not really to provide a generalized query store.

and with SQL Server 2017 going cross-platform I imagine it could be used to resolve that issue (if and when it becomes as light as SQLITE).

Perhaps. But, in general, to make such sort of large changes, the goal would be to have substantial end user value provided by the change.

e.g. you would be able to build a utility to check naming rule violations in 30 seconds by writing a select query like SELECT FROM InterfaceTypes WHERE Name NOT LIKE 'I'

This is definitely interesting. However, it would also mean that our DB format becomes another API surface area for roslyn that we woudl then have to maintain. It might also substantially degrade both the performance of existing features, as well as consuming more memory.

ericmutta commented 7 years ago

@CyrusNajmabadi This is definitely interesting. However, it would also mean that our DB format becomes another API surface area for roslyn that we woudl then have to maintain.

I understand totally. Prior to Rosyln anybody who wanted such a thing would either be limited to using reflection on compiled binaries or have to write a parser themselves. Now with Rosyln it is possible for pretty much anyone to create and maintain a seperate "Rosyln SQL API". Maybe some day I will 😄