dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.92k stars 4.02k forks source link

Why is CodeFixer run thrice to apply the actual fix instead of being cached? #32460

Closed ghost closed 1 year ago

ghost commented 5 years ago

Version Used: 2.6.3 Steps to Reproduce:

class StaticInheritanceAnalyzer 
{ 
    internal static ConcurrentDictionary<Location, DiagnosticData> DiagnosticDataDict =
            new ConcurrentDictionary<Location, DiagnosticData>();
} // gets poplated by all the necesary data to ensure static inheritance.
// also: can't use context.RegisterCompilationStartAction to clear DiagnosticDataDict. because it wipes data arbitrarily and unreliably.`

CodeFixer:

        private ConcurrentDictionary<Location, int> cacheReleaseCounter = new ConcurrentDictionary<Location, int>();

        private async Task<Document> FixInheritanceAsync(Document document, Location location,
            CancellationToken cancellationToken)
        {
            DiagnosticData diagnosticData;
            int counter = cacheReleaseCounter.GetOrAdd(location, 0);
            switch (counter)
            {
                case 0:
                case 1:
                    StaticInheritanceAnalyzer.DiagnosticDataDict.TryGetValue(location, out diagnosticData);
                    break;
                case 2:
                    StaticInheritanceAnalyzer.DiagnosticDataDict.TryRemove(location, out diagnosticData);
                    break;
                default:
                    throw new Exception(counter.ToString());
            }
            cacheReleaseCounter.TryUpdate(location, counter == 2 ? 0 : counter + 1, counter);
            // counter == 2 => fix gets applied, analyzer will finally produce a new Solution
            [... actual code ...]
        }
  1. Cache some DiagnosticData
  2. Try to release Data on first CodeFix run
  3. Run into the Issue that first Run is only used for the Preview
  4. Wonder what the next two runs do
  5. Be happy that it at least needs 3 runs consistently ( for what I've encountered so far)

Expected Behavior:

cached Data can be released by first call / CodeFixer needs to be run only once

Actual Behavior:

Codefixer has to be run thrice before Fix is applied

mavasani commented 5 years ago

Couple of things:

  1. Storing any data statically on an analyzer or code fix is not recommended. That is almost certain to cause memory leaks.
  2. Tagging @heejaechang who might know why we execute the code fix multiple times for preview - note that this is purely a performance question. Any assumption or functional dependency on code fixer being invoked just once on a given snapshot is incorrect and should never be made.
jmarolf commented 5 years ago

Codefixes need to be deterministic (same input result in the same outputs) in order for previews to work. What information do you need to cache? You can pass a dictionary of strings via the diagnostic to your codefix.

ghost commented 5 years ago

Well Hmm, I need some Values:

So is there any way I could convert Locations to string and back?

Concerning the memory leak: Instead of calling DiagnosticDataDict.Clear when context.RegisterCompilationStartAction is called. I call Clear when the Fix is applied the third time.. So it is just delayed.

"Any assumption or functional dependency on code fixer being invoked just once on a given snapshot is incorrect and should never be made." - ha, that made me laugh 😆 (There's probably some magic happening, but I think some fixes can also be scoped fixes (Node), so I don't have to return an entire Document or Solution, but simply a Node... but yeah, I'm eager to get to know more complex Design Patterns. that would be amazing 💪 🎉)

CyrusNajmabadi commented 5 years ago

So is there any way I could convert Locations to string and back?

You can just use '.AdditionalLocations' on the diagnostic. Lots of analyzers/fixers work this way.

jmarolf commented 5 years ago

I'm eager to get to know more complex Design Patterns. that would be amazing 💪 🎉

In general we hope what most analyzers/codefixes can be stateless and deterministic. This allows us to use parallelism better as well as moving your stuff out of the Visual Studio process so less memory is consumed. Calling EnableConcurrentExecution essentially tells us that you meet these requirements.

As you've seen we offer some stateful APIs for analyzers so you can stash information such as symbol definitions. Attempting to store state in a different way is generally a way to get hard-to-diagnose bugs.

Because your analyzer may live in a different process then your codefix there are only a few ways the two can communicate. Anything that you place on the Diagnostic type we will serialize and move across processes for you. In-general if more information than this needs to be passed we typically recommend recomputing it on the codefix side.

ghost commented 5 years ago

You can just use '.AdditionalLocations' on the diagnostic. Lots of analyzers/fixers work this way.

😞 really? damn I feel retarded^^ I know that I've seen that more than a half dozen times but aways thought, meh, not what I need... Bummer... AHH now I get it. I thought that was used internally for mulitple symbolLocations within a block or sth... Alright that's why I didn't use it. Thanks ;-)

In general we hope what most analyzers/codefixes can be stateless and deterministic. This allows us to use parallelism better as well as moving your stuff out of the Visual Studio process so less memory is consumed. Calling EnableConcurrentExecution essentially tells us that you meet these requirements.

As you've seen we offer some stateful APIs for analyzers so you can stash information such as symbol definitions. Attempting to store state in a different way is generally a way to get hard-to-diagnose bugs.

Because your analyzer may live in a different process then your codefix there are only a few ways the two can communicate. Anything that you place on the Diagnostic type we will serialize and move across processes for you. In-general if more information than this needs to be passed we typically recommend recomputing it on the codefix side.

Ah, thanks for the Insight. Sounds amazing what you're doing ;-) I was just about tinker a custom CodeFixAllProvider^^ Good you're here ;-) Now I'm even more interested how Roslyn works under the Hood. Is there any "How we did it: Roslyn's Software Architecture Special" Youtube Video out there? would be awesome. 😄 Again thanks for your help guys. I'm doing some quick fixes now ;-)

jmarolf commented 5 years ago

"How we did it: Roslyn's Software Architecture Special" Youtube Video out there?

Josh Varity did a set of youtube videos a while back that is mostly about getting started. Unfortunately I can't point you at something more in depth than that at the moment.

ghost commented 5 years ago

Josh Varity did a set of youtube videos a while back that is mostly about getting started. Unfortunately I can't point you at something more in depth than that at the moment.

Yeah, that and Roslynator got me started. And yeah, Bummer there isn't an InDepth look into the architectural Structure of Roslyns Integration. But maybe I find something when binging it properly ;-)

ghost commented 5 years ago

You know what the biggest bummer of all is: I figured static Inheritance is utter crap for using it dynamically, which was my damn f-ing stupid intention 😅. I really didn't figure that I need an object to cast up the inheritance chain. I'm so freaking, fucking retarded. I wasted the last 2 months on that crap and I simply missed the obvious. And simply because I couldn't accept that it was forbidden and bad practice, and didn't know why. (here goes my Career at Microsoft) So geesh, here you got it. I'm fucking retarded. But hey, maybe I can let people know that (Static Inheritance isn't that usable^^).

Geez, that would be the most embarassing Blog and Stackoverflow Entry ever. HA! Let's do it 😄

Edit: I could cache all subclasses though? Then the question would be how do I invoke up the inheritance tree efficiently, without having to use strings... hmm. Aw crap you cannot cast Delegate to methodgroup, right? damn! (stumbled upon: How neat would it be if I could simply call a method extensible? and have an easy function injection going on when calling Injector. InjectMethods. That would be nice)

You don't happen to have something like Roslyn, but for CodeCompletion? So instead of having to use a Delegate I could map qualified functions as parameters. Like those of an inheritance Tree or methods that are marked by an Attribute to only map to certain calls like I mentioned with the extensible InjectMethod? Guess that would be like a Roslyn for Intellisense? Ah no, that would go too far into compilertech... which is too far optimized to throw in customizable bits... right?

Alright, I figured it wouldn't be a big deal If I could work on an internal implementation that hides the upcast derivedTargetType for a custom created method of the baseMethod/Property. Still everything is whirring around, being unclear, a brainstorm if you say so, but one that acts like a sandstorm hitting you with the lightning of a new Idea and Vision just to lose it again in the gray shroud of unknown. But you're trying to follow it and on the way there are other flashes hitting you, dazzling you with their intriguing ingenuity and you feel like you have to tear yourself in two just to follow both traces. So It really looks like a have to find out how you made System work declaring the keawords. maybe we can find something more dynamic and use an '@' to define custom keywords... ( and that would lead to an unbearable complexity... crap)

CyrusNajmabadi commented 1 year ago

Closing out as answered.