dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.11k stars 4.04k forks source link

prefer var not working as expected #29657

Open vsfeedback opened 6 years ago

vsfeedback commented 6 years ago

when setting "When variable type is apparent" to "Prefer var" then using the FormatDocument functionality, the .cs file with the following line:

List<string> listOfStrings = myString.Split(',').ToList() 

Is formatted/rewritten to:

var listOfStrings = myString.Split(',').ToList()

I would expect it to leave the type definition and not replace List<string> with var.

I have used stylecop and resharper on and off over the years, and the general rule has always been if the type is somewhere to the right of the equals, use var (meaning it is apparent). If the type is not listed, use the explicit type.

If you guys change the way apparent is defined it is going to screw up our whole codebase. I love that you are adding functionality in VisualStudio, but please do not change this definition. Is this a bug or did someone figure if Linq is used and even if the type isn't on the line it is still apparent? If the latter is the case, one could argue for everything being a var.

This issue has been moved from https://developercommunity.visualstudio.com/content/problem/322157/prefer-var-not-working-as-expected.html VSTS ticketId: 672386 These are the original issue comments: (no comments) These are the original issue solutions: (no solutions)

CyrusNajmabadi commented 5 years ago

Without getting into the individual rules, I can definitely see why some people disagree that unknownType.ToList() is the same as unknownType.ToList()

Yes. This is something that would now have a split between 'apparent' and 'explicit'.

NoneGiven commented 4 years ago

This feature really needs a revisit. You can see it's been requested numerous times, and if you want a big example of why the current options aren't good enough, the dotnet runtime repo (among others) enforces a code style policy that var is only allowed for new or explicit casting -- no ToList or similar -- which can't be enforced through VS.

You could enforce it through a custom analyzer, but it wouldn't participate in Ctrl+K, Ctrl+E or other code generating actions.

dionito commented 3 years ago

ToList is considered a 'conversion' method. Because it's "To" + Type-Name. So if you have methods like ToDateTime it's apparent that that's it's producing a 'DateTime'.

In this case there is 'ToList' which is producing a 'List'. So the type is apparent and 'var' is used.

--

This behavior was present when the feature was added here: https://github.com/dotnet/roslyn/blob/a534500c8f7d473fbfdb72e4ede133ae99c55af8/src/Features/CSharp/Portable/Diagnostics/Analyzers/CSharpTypingStyleDiagnosticAnalyzerBase.State.cs#L191

--

I have used stylecop and resharper on and off over the years, and the general rule has always been if the type is somewhere to the right of the equals, use var (meaning it is apparent). If the type is not listed, use the explicit type.

In this case, the type is considered to be on the right of the equals (because it's in 'ToList').

How is ToList() apparent?

var x = Foo-bar().ToList();

Is it a list of strings or integers? Not apparent, at all.

The rule, however, does not consider apparent something like: Document doc = CreateDocument<Document>(foo-bar);

where the return type and the generic are the same. IMO this is way more apparent that the ToList() case.

Cheers, Dioni

CyrusNajmabadi commented 3 years ago

@dionito See teh discussion that follows, the reasoning is provided. This was something we examined by actually bringing up these cases to people and asking if they thought the type was apparent from context. In the vast majority of cases, people felt it was apparent, so we wrote the rule accordingly.

idg10 commented 3 years ago

This:

the vast majority of cases, people felt it was apparent

is not as conclusive as you might think. You might have fallen into a statistical trap. Let me make up some data that fits your description, and yet which points in the exact opposite direction from your conclusion.

When you asked those questions, did you record which of the answers came from people who prefer var in the first place, and which came from people who think var obscures things? E.g. might it have looked like this?

preference Apparent Not apparent
var 1000 20
explicit type 1 100

In this (completely made up) poll, 1001 of the 1121 people asked said they thought the type was apparent—that's 89%, "the vast majority" as you put it.

However, of the people who said they thought that it was apparent, 99.9% of them have absolutely no interest in using this feature. It seems unhelpful to allow their opinion to sway the design of a feature they're not going to use.

If you just look at the people who would actually want to use this feature in this (completely made up) poll, you find that 99% of those people (the ones who will actually use this feature, and care about how it works) think it's not obvious.

So in summary both of these (seemingly contradicting) points are true:

Again, made up data. But not implausible. If you're in the category of people who prefer the nominal type to be explicit, then you obviously have a different point of view on how much type names tell you from someone who's perfectly happy using var everywhere, and so you are highly likely to have a different view on whether the type is apparent in these cases. My point is that your observation, that you examined this "by actually bringing up these cases to people and asking if they thought the type was apparent from context" could well have pointed you in entirely the wrong direction. If you don't have a breakdown of the kind above, then you can't really conclude anything about how the feature should work from the fact that most of the people you asked felt the type was apparent. You need to know what the majority of people who might use the feature think.

And the fact that people who like var everywhere substantially outnumber those who don't means that looking at the answer to this question in aggregate is going to drown out the views of the explicit-type-preferers with the views of the var-preferers, which in the case of a feature designed for var-preferers, is exactly the wrong thing to do.

(If you do have that breakdown and can demonstrate that even amongst just those of us who prefer the full type name to be visible somewhere, the "vast majority" think the type is apparent, then I'll be utterly amazed but will have to submit to the data, although I'll find it hard to shake off the suspicion that you might not actually have shown anyone an example that really illustrates the problem when asking the question.)

BTW, the code that drove me here today was this:

var successResponses = operation
    .Responses
    .Where(r => r.Key == "default" || (r.Key.Length == 3 && r.Key[0] == '2'))
    .ToList();

While it may be apparent that successResponses is a List<...something...>, I don't believe you will be able to tell me what that ...something... is from looking at this code alone. I don't believe it is apparent. (And even if you scan up the code to discover that operation is an OpenApiOperation, and you can thank me for making that possible by not using var where that variable was declared, I submit that unless you happen to be extremely familiar with the library I happen to be using, it's still not remotely apparent what the item type is in this list.)

I fully understand that you can't really change the existing behaviour. But please please please I would love some sort of way to switch on a feature that prevents var from being recommended in cases where there are non-apparent generic type arguments.

CyrusNajmabadi commented 3 years ago

which in the case of a feature designed for var-preferers,

Var was not preferred when we were investigating this. It was the opposite. There was a lot of anti-var sentiment, and this investigation was into the rules that the anti-var people felt would make it somewhat acceptable to allow for some subset of var within a code base.

CyrusNajmabadi commented 3 years ago

@idg10 see: https://github.com/dotnet/roslyn/issues/29657#issuecomment-447266179

Eirenarch commented 3 years ago

I guess I'll simply have to adjust my style to Type identifier = new() and enforce a ban on var. Too bad this means migrating codebases. This issue really frustrates me, it seems like such a minor addition to provide this option. It is not like we are asking for a language change

CyrusNajmabadi commented 3 years ago

@Eirenarch see https://github.com/dotnet/roslyn/issues/29657#issuecomment-447266179

it seems like such a minor addition to provide this option

Please see the contents of the thread. It's not minor, and there are disagreements even among participants here about how some of these cases should behave. Indeed, looking through it, your personal views on how this feature should work were in conflict with how another poster here thought some examples should work. You even described yourself here as "Yeah I know people disagree but I am an extremist". I'm very wary about updating for views that are described as 'extreme' and which are in conflict with other views from the community on the same topic.

I hope this helps show how this is not a 'minor addition' here.

That said, as i mentioned in the linked comment, I'd be happy to work with people to get some changes in here. This is currently on our backlog, so it would likely need community contribution to move forward.

toddlucas commented 3 years ago

There is a difference between these three cases:

var foo1 = bar.ToString();
var foo2 = bar.ToList();
var foo3 = bar.ToNullable();

In the first case, the type is apparent in the conversion method name. In the latter two cases, the conversion method only identifies the container type. It's interesting to know what type the container is, but I would argue that it's of secondary importance. The greater half of the type information is hidden. The fact that the hidden type happens to be a generic type parameter is an implementation detail. If a dedicated method like ToStringList where used, everyone would be happy. This whole argument seems to revolve around the concept that the container type is the only relevant type information, as if the distinction between generic types and non-generic types is irrelevant. This distinction seems to be fundamental in the way types are treated in much of the rest of the compiler and tooling, why not here? It's clearly a problem for a number of people. A feature flag (e.g., in .editorconfig or as an analyzer rule) seems like an obvious solution.

CyrusNajmabadi commented 3 years ago

@toddlucas please see https://github.com/dotnet/roslyn/issues/29657#issuecomment-447266179 and https://github.com/dotnet/roslyn/issues/29657#issuecomment-841840758

CyrusNajmabadi commented 3 years ago

why not here?

Please see a lot of the thread. Even in this very thread with people askign for different behavior, there were disagreements on this from the community.

A feature flag (e.g., in .editorconfig or as an analyzer rule) seems like an obvious solution.

Sure. But there are open questions about how that rule should work. As mentioned, there have been examples provided where different community members had differing opinions on how the rule should work. As such, it's likely that just adding such an option with bifurcate things further, or leave some group feeling like their individual style is not being honored.

toddlucas commented 3 years ago

Yes, that makes sense. Thanks for the links. If this were to be community driven, what is the best approach given that it's on the backlog? The Needs Design Review label suggests that someone should make a proposal? Or could that be done as part of a proof of concept PR?

mtrtm commented 3 years ago

Looking at the comment outlining functionality, it seems everyone in the thread agrees with 1-4 as the actual type is on the line and I don't see anyone disagreeing with that.

I tried to read through everything here, and feel like everyone would be happy with:

"Use var only when type is present on line" "Use var only when type is present on line or in method name", provided a generic cellection would be addressed (e.g. the option would NOT change to var when the line List<int> myvar = col.ToList())

Am I missing something? Does anyone disagree, or think there needs to be more options?

xecrets commented 3 years ago

The rule #6, ....ToFoo(...). i.e. Convert.ToDateTime, blah.ToList(), seems to me to be in direct violation of https://docs.microsoft.com/en-us/dotnet/csharp/fundamentals/coding-style/coding-conventions "Don't assume the type is clear from a method name.".

It seems strange to me to have a style analyzer with a default, non-configurable rule that is in direct violation of the documented conventions.

Also, regardless of the previous user poll - it just doesn't make sense with generics to assume that the type is clear from

var foo = MyMethodOrWhatEverExpression.ToList(); // We're assuming the type is clear from the method name and that it's non-generic.

In actual fact, foo is of type List<Tuple<string, int>> for example. How is that clear!?

In view of this, either update your documented conventions to state "DO assume the type is clear from a method name" AND limit it to non-generic types, or scratch rule #6.

CyrusNajmabadi commented 3 years ago

In view of this, either update your documented conventions to state "DO assume the type is clear from a method name" AND limit it to non-generic types, or scratch rule #6.

Hi @xecrets , the documentation is owned by another team. You can file a suggestion on that by scrolling to the bottom of the limited page and using the feedback links. Thanks!

xecrets commented 3 years ago

In view of this, either update your documented conventions to state "DO assume the type is clear from a method name" AND limit it to non-generic types, or scratch rule #6.

Hi @xecrets , the documentation is owned by another team. You can file a suggestion on that by scrolling to the bottom of the limited page and using the feedback links. Thanks!

Hi @CyrusNajmabadi , thanks for the quick response.

But... well... I was more trying to say that as one company (Microsoft) you should be consistent. I do not agree with rule #6. I agree with "Don't assume the type is clear from a method name.". But if rule #6 is to be kept, I think that all concerned teams within Microsoft should agree with it, especially the ones formulating the official best practices!

However, I did file a suggestion to the documentation team, but it was to contact the analyzer team ( i.e. you I guess ;-) ) and ensure that you are in sync.

CyrusNajmabadi commented 3 years ago

However, I did file a suggestion to the documentation team, but it was to contact the analyzer team ( i.e. you I guess ;-) ) and ensure that you are in sync.

Sounds good. Thanks!

CyrusNajmabadi commented 3 years ago

Imo, I think it is safe to assume. For example ImmutableArray.Creat<int>() does create an ImmutableArray<int>. If your api implies it creates one thing, but actually creates another, then that's going to be confusing no matter what.

The 'apparent' rule is intended to deal with the majority of the ecosystem and the bcl (both of which seem to be well behaved in this respect).

xecrets commented 3 years ago

I'm fine with the idea of the apparent rule, the problem is twofold here. One is that code names frequently lies, often as a result of refactoring. It's just human to miss such details. Two is that it's just not apparent by reading the text what the type of 'coolStuffs' is here, if in fact it's not a List, but a List.:

var coolStuffs = _myService.GetAllStuff().Where(s => s.IsCool).ToList();

I would be ok with rule #6 if it suggested to use var on:

var coolStufs = _myService.GetAllStuff().Where(s => s.IsCool).ToList<Stuff>();

The problem is that it collides with the simplify name rule, which will convert this to the above - and the clarity and apparentness goes away :-( .

It is a bit of a mess unfortunately, and I understand that it's not easy to please all.

But saying that the type is apparent from the text ".ToList()" when the return value of the method actually is a List<T> and not a List is just not true. It's not apparent. You can't tell me or even guess what the type is from just seeing ".ToList()". This is in stark contrast to for example:

var stuff = new Stuff();

Here you can definitively say that the type is apparent from just reading "new Stuff()". It's a Stuff. Nothing else, by definition.

xecrets commented 3 years ago

Imo, I think it is safe to assume. For example ImmutableArray.Creat<int>() does create an ImmutableArray<int>.

The problem with this example is that that's not what the reader of the code will see. They will see: ImmutableArray.Create() - and now it's no longer apparent that it creates an ImmutableArray<int>. This exactly analogous to the ToList<Stuff>() example. If it was actually seen as that, that'd be fine. But following the other suggestions of Visual Studio and the analyzers it will be simplified to ToList(). And there's the problem.

CyrusNajmabadi commented 3 years ago

One is that code names frequently lies

Note that this is not a concern here. Our analyzer verifies the code is not lying. So if you have ToDateTime, but you return a TimeSpan, it won't trigger.

CyrusNajmabadi commented 3 years ago

The problem with this example is that that's not what the Reader of the code will see.

Yes it is :-). My example was intentional.

They will see: ImmutableArray.Create()

This would not compile. The language itself doesn't know what that means :-)

xecrets commented 3 years ago

One is that code names frequently lies

Note that this is not a concern here. Our analyzer verifies the code is not lying. So if you have To datetime, but you return a TimeSpan, it won't trigger.

Good point, but if the faulty refactoring occurs after the change to:

var date = Something.ToDateTime() // Oops - someone changed ToDateTime to return a TimeSpan later

will the opposite rule kick in (if configured) and suggest to use an explicit type?

xecrets commented 3 years ago

They will see: ImmutableArray.Create()

This would not compile. The language itself doesn't know what that means :-)

My bad, you're of course right. This relates to rule #5, Foo.StaticMethodThatReturnsAFoo(...). i.e. XElement.Parse . Here the generic argument is required, and cant' be simplified away. Still don't really like it, since the rule will fire on for example:

MyClass<int> x = MyClass<int>.GetSomeStrings();

suggesting it's better with:

var x = MyClass<int>.GetSomeStrings(); // Here as a human I'd probably think I got some strings

The example is a bit contrived, agreed, but it still illustrates the risk of code or humans assuming things from method names. The method name 'GetSomeStrings' is really confusing here, and the problem is that the rule Foo.StaticMethodThatReturnsAFoo() doesn't really know what the method name implies for a human. If the method is a classic factory method, i.e. Create(), yes it does work - but it still is about trusting method names to imply the right thing for humans reading them.

This rule, #5, I can live with. The situations where it goes wrong, the code is pretty much messed up anyway, so fine, sort of.

But as for rule #6, with generics, I just don't see that the generic argument T to a List<T> is considered irrelevant. It's after all the major part of the type - it's not that it's a List, because it's not. It's a List<T>, and the type of T is not apparent from the code text reading .ToList(); without the type argument.

CyrusNajmabadi commented 3 years ago

will the opposite rule kick in (if configured) and suggest to use an explicit type?

Yes. :-)

CyrusNajmabadi commented 3 years ago

The example is a bit contrived, agreed

That's the core issue. And why, when queried, so many felt that things like XElement.Parse should be apparent. In practice they wanted the feature to align with the ecosystem, which appears to be well behaved.

And, importantly, we've shipped this way a looooong time. If XElement.Parse now required users to supply an explicit type, it was be a significant problem to many users.

Eirenarch commented 3 years ago

Honestly if you just fix ToList, ToDictionary and ToHashSet I will shut up about this forever :)

CyrusNajmabadi commented 3 years ago

Since the proposal for csharp_style_var_when_type_is_explicit was rejected, i'd like to take another option to teh IDE design team on how we can address this space. Specifically, we will continue to have csharp_style_var_when_type_is_apparent, however, we will augment that option with sub-options like so:

csharp_style_var_when_type_is_apparent_for_default_expression

In other words, we would have csharp_style_var_when_type_is_apparent_for_XXX where XXX would be a knob for each sub option that could be controlled. We could then doc these (and their logic) and allow users to opt out of cases they do not like. This would also give us a mechanism to add in new apparent cases in the future without it being very problematic for users. For example, with the new lambda work we're doing for C# 10, we could say that lambdas with explicit parameter and return types have an apparent type. e.g. string (int) => ... has the apparent type Func<int, string> since the int and string are explicitly spelled out. However, if someone doesn't like that (because Func was not explicitly stated), then they could disable this.

This would also address the problem we have when we've been trying to come up with a set of rules that people can agree on as even in this thread (and other linked threads) there have been disagreements from the community on what tehy think is apparent.

If we went this route, the set of options would be like so (remember that this is the XXX in csharp_style_var_when_type_is_apparent_for_XXX):

name example notes
default_expression default(string)
literal_expression 0 We could also have an option for each literal type (bool/numeric/string/etc.)
object_creation_expression new List<int> not for implicit creation new()
array_creation_expression new int[0] not sure about implicit creation new[] {...}. If all elements were apparent would the array type be apparent?
cast_expression (int)...
as_expression ... as int
is_expression ... is int always apparent this is bool
tuple_expression (new List<int> ..., (int)...) only apparent if all element types are apparent
static_factory_method XElement.Parse Only allowed for static method that returns exact same type as containing type, and containing type is specified
conversion_method .ToDateTime() name_only or name_and_type_arguments. name_only means just the name of the type being returned needs to match the name after To. name_and_type_arguments means you'd need something like .ToList<int> to match a return type of List<int>.

As above, with this approach, we could also add more items in the future depending on how we and the community feel. For example, we've had requests to match things like var v = componentModel.GetService<IFileWatcher>(). Here we could add a new option like instance_factory_method (or whatever) and then add opt-out/in support for these cases.

For a lot of the complaints we have we could then either have those users switch the setting to false. Or, for cases like conversion_method they could specify name_and_type_arguments to deal with the generics case.

I'm hoping this is a best of all worlds approach. It does not introduce a totally new option with a disparate overlap with the existing option. It doesn't change the existing option for codebases that use it. It provides flexibility for codebases out there that don't like our defaults. And it gives us an upgrade path to allow us to add stuff in the future safely with an out for customers who may not like those decisions.

xecrets commented 3 years ago

Very nice! Normally I'd be a bit wary of approving of such fine-grained control, since it introduces a level of complexity.

In this case, as the use of var approaches the level of curly brace-placement as far as variations and strength of opinion, I think it may indeed be the way to go.

Notably I think the use of var is one the more engaging issues in the style department at this time and this suggestion could actually make (almost) everyone happy!

Eirenarch commented 3 years ago

I love this.

fepettersen commented 3 years ago

Probably very late to the ball here, but I still think that any feedback will be necessary on this topic. My entire solution lit up like a Christmas tree as a result of this! I tend to never use explicit types, so my opinions might be on the extreme side, but the line chosen here seems WAY to strict.

Eirenarch commented 3 years ago

@fepettersen how does your solution lit up because of this? As far as I know this is not implemented yet or even decided to be implemented.

fepettersen commented 3 years ago

@Eirenarch After I upgraded to VS 16.11.2 I was suddenly seeing suggestions for using explicit types on basically all my variable declarations. If I'm commenting on the wrong issue I'm sorry, but something has definitely changed

CyrusNajmabadi commented 3 years ago

Nothing changed here.

Eirenarch commented 3 years ago

@fepettersen your settings are messed up, nothing to do with this issue. Also consider setting these things in editor config for your project