dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
18.65k stars 3.97k forks source link

IDE0028/IDE0090/IDE00300/IDE0305 do not catch some expressions #72699

Open bzd3y opened 2 months ago

bzd3y commented 2 months ago

Analyzer

Diagnostic ID: IDE0028: Use collection initializers or expressions

IDE0090: Simplify new expression

IDE0300: Use collection expression for array

IDE0305: Use collection expression for fluent

Analyzer source

SDK: Built-in CA analyzers in .NET 8 SDK or later

Version: SDK 8.0.200

OR

NuGet Package: Microsoft.CodeAnalysis.NetAnalyzers

Version: 8.0.0

Describe the bug

I suppose this might not be considered a bug and could be more of a suggestion. Code like the following does not seem to trigger any of the collection expression/simplification rules.

public void CollectionAnalyzers()
{
    int[] array = new[] { 1, 2, 3 }; // IDE300
    List<int> list = new List<int>() { 1, 2, 3 }; // IDE0028, IDE0090

    List<int> list2 = new[] { 1, 2, 3 }.ToList(); // Should trigger IDE0028 and IDE300.
    List<int> list3 = new List<int> { 1, 2, 3 }.ToList(); // Should trigger IDE0028, IDE0090. Removing `ToList()` causes it to trigger both.
    int[] array2 = new List<int> { 1, 2, 3 }.ToArray(); // Should trigger IDE0300, IDE0090.

    // All 3 of the above lines should/could instead/in addition trigger a new rule; something like "Unnecessary collection copy".
    // The 1st and 3rd line obviously change the type, but there is no reason to be using the intermediate type in the first place, at least
    // in these cases. I suppose some types could have desired semantics, but I question how good that code would be anyway, considering that
    // the reason for doing it would probably not be obvious in most cases other than maybe something like `HashSet<T>`. The rule could "warn"
    // that it "Could change semantics" like IDE0028 sometimes does.

    // If a new rule isn't appropriate then maybe that should be configurable on IDE0305 so that it only triggers for initializations.
    // I don't really prefer the idea of IDE0305, but I WOULD use it if I could configure it to find redundant copying like this.

    List<int> list4 = new List<int>(array); // Should trigger IDE0028. Not passing in `array` causes it to trigger IDE0028.

    List<int> list5;

    list5 = new List<int>(array); // Should trigger IDE0028 and IDE0090. Not passing in `array` causes it to trigger IDE0028, but not IDE0090.
    list5 = new List<int>(array).ToList(); // Should trigger IDE0028, IDE0090 and IDE0305 and/or probably the new rule above. Not passing in `array` causes it to trigger IDE0305.

    object obj = new List<int> { 1, 2, 3 }.ToList(); // Should trigger IDE0305 and/or probably the new rule from above.
}

Steps To Reproduce

A reproduction project can be found here: https://github.com/bzd3y/ReproductionProject . Look at the Analyzers project Error List. It may help to filter the Error List to "Current Project".

Expected behavior

The analyzers should be triggered for the above code. The expected behavior is documented there and in the reproduction project the code came from.

Actual behavior

Not all of the relevant analyzers seem like they are being triggered.

Additional context

One of the things going on seems to be that the ToList() and ToArray() for some reason cover up the other rules. But if IDE0305's severity is set to none or silent then nothing is triggered and these lines are "lost". This happens even when the expression type doesn't change between the initialization, the method call or the assignment. I'm guessing that the analyzer just doesn't go "deeper" into the RHS expression since it isn't a direct initialization in these cases and is "wrapped" in a method call?

But the first assignment to list5 is still a direct initialization, I think, even though it isn't at the point that list5 is declared. In the case of list5 I'm guessing the analyzer just doesn't consider the type of list5 because it isn't being declared there or something?   The code I am maintaining has areas that haven't been touched in 10 years and these rules helped me modernize it and simplify a lot of old code but there were a few instances like some of the above that got missed and I found one serendipitously and then had to find the rest (maybe?) with Find and a regular expression. And yes, the original author actually did things like new List<T>().ToList() for some reason, and as a static readonly that never gets mutated so it probably just should have been an Array to begin with.

In closing, this seems like a combination of a bug in IDE0028/IDE0090 as well as maybe a reason to have either a new "Unnecessary collection copy" rule or a way to configure IDE0305 to only apply to initializations.

bzd3y commented 2 months ago

@mavasani, did you move this because IDE rules belong here and CA rules belong in roslyn-analyzers?

sharwell commented 2 months ago

@bzd3y Yes, IDE rules belong in dotnet/roslyn.

sharwell commented 2 months ago

In regards to these items:

List<int> list2 = new[] { 1, 2, 3 }.ToList(); // Should trigger IDE0028 and IDE300.
List<int> list3 = new List<int> { 1, 2, 3 }.ToList(); // Should trigger IDE0028, IDE0090. Removing `ToList()` causes it to trigger both.
int[] array2 = new List<int> { 1, 2, 3 }.ToArray(); // Should trigger IDE0028, IDE0090.

These seem like a better fit for "Simplify collection initialization" than for "Use collection expression". Each of these can be simplified even without using collection expressions. The code fix for these could vary based on whether collection expressions are supported/preferred by the project.

bzd3y commented 2 months ago

@sharwell okay, thanks. I wasn't clear on that but now it makes sense.

As for "simplify collection initialization", I agree but I tried to stick to what gets triggered without something like a wrapping expression that seems to hide the current rules that are triggered without it.

Which rule specifically are you talking about? IDE0028? Several of them involve simplifying collection initializations.

CyrusNajmabadi commented 2 months ago

that seems to hide the current rules that are triggered without it.

There's no hiding of rules. It's just a question of how far we took the analysis and how many patterns we want to invest in detecting and suggesting updates for.

bzd3y commented 2 months ago

@CyrusNajmabadi I understand that and didn't mean deliberate hiding, but inadvertent hiding in terms of the final result. These are collection initializations that could be simplified, but that gets hidden in cases like them being passed into a method call or apparently if the initialization doesn't occur on the same line as the declaration. I believe this is all the same pattern here, or at least fits in with at least one of the patterns handled by the rules that I listed.

CyrusNajmabadi commented 2 months ago

but inadvertent hiding in terms of the final result.

So, again, the terminology isn't correct here. THere's no hiding (inadvertent or otherwise) :). These systems support the set of cases we've explicitly added support for. If these cases are desired, it's not a matter of unhiding anything. It's a matter of just adding support for the additional patterns suggested :)

I believe this is all the same pattern here

These are all distinct patterns. The analyzer would need to be updated to look for an support these.

CyrusNajmabadi commented 2 months ago

Markign up for grabs. We'd likely take reasonable enhancements here to the patterns detected.

bzd3y commented 2 months ago

@CyrusNajmabadi You are taking "hiding" too literally and misunderstanding my point, as if I am accusing you of something. I'm not sure of the use in arguing the semantics of something being hidden. But, fine, there is no hiding. Nothing gets hidden.

As for this being the same pattern, the pattern I am talking about is the exact same pattern as at least one of the patterns listed. Consider:

List<string> list = new List<string>();
List<string> list2 = new List<string>().ToList();

Those can have the exact same patterns applied to them, except for the second line, which also has a suggestion for rule IDE0305. But only the first line suggests anything other than IDE0305. It suggests IDE0028 and IDE0090. And both of those could be applied in exactly the same way, with exactly the same refactor, to the second line. And if these were arrays instead of List<T> then the first one would suggest IDE300, the array analog of IDE0028, and that could be applied to the second line (as an array) as well.

In short, both involve at least dotnet_style_prefer_collection_expression, but more generally, they all involve ways to simplify collection initializations and also objects, since collections are objects (though IDE0090 doesn't work for arrays). The pattern that I am suggesting is that same general pattern.

If you want to make it a new rule for some reason, then that is fine. The only reason I am "arguing" this is because you seem to be misunderstanding something, so I am trying to clear it up. It seems to me that the fix or enhancement here is that IDE0028, IDE0300 and IDE0090 could trigger off of an initialization that is being wrapped by another expression and beyond the first level of depth.

I haven't worked with analyzers that much, but isn't there a visitor pattern available that would be able to hit these types of expressions at any depth?

sharwell commented 2 months ago

You are taking "hiding" too literally and misunderstanding my point, as if I am accusing you of something

This has less to do with you and more to do with the current project. We use 'hiding' in this solution many places to have a very specific meaning, and users who work on this project are likely to interpret this term in the context of this project even if that is not the intended meaning. To avoid misinterpretation by other users who participate in this issue, Cyrus called this point out and explicitly clarified that this issue does not relate to 'hiding' in the sense that we normally use it.

CyrusNajmabadi commented 2 months ago

If you want to make it a new rule for some reason, then that is fine.

This is not what I'm suggesting.

What I said before is simply:

It's just a question of how far we took the analysis and how many patterns we want to invest in detecting and suggesting updates for.

And

Markign up for grabs. We'd likely take reasonable enhancements here to the patterns detected.

CyrusNajmabadi commented 2 months ago

I haven't worked with analyzers that much, but isn't there a visitor pattern available that would be able to hit these types of expressions at any depth?

Yes. That's exactly what I was saying with:

It's just a question of how far we took the analysis and how many patterns we want to invest in detecting and suggesting updates for

:-)

Much of this seems reasonable. As I mentioned, we would very likely take community contributions here to support more cases.

bzd3y commented 2 months ago

We use 'hiding' in this solution many places to have a very specific meaning, and users who work on this project are likely to interpret this term in the context of this project even if that is not the intended meaning.

@sharwell Okay, I didn't realize that, but it's good to know. Thank you for explaining it.

@CyrusNajmabadi Sorry for the confusion.