Closed gafter closed 6 years ago
I use .Select(_ => _.Foo)
everywhere, and I want to continue doing so.
Relevant previous discussions/arguments/rants:
https://github.com/dotnet/roslyn/issues/14794 https://github.com/dotnet/roslyn/issues/14862
There might be a few more.
I think it's worth the discussion although it would probably be difficult to resolve without breaking a lot of existing code, particular like @jnm's example.
π Perhaps the rules for identifiers could allow underbar for a lambda parameter in expression lambdas but not elsewhere? That would be weird.
At the very least perhaps consider adding a warning wave to declaring a field or local using the underbar identifier. You could differentiate the discard/local case by whether or not the local is ever used.
If there was an auto-upgrade conversion of existing usages of _
as an identifier to @_
, it might not break so much existing code.
The point of _ => _.Foo
is that _
is the least possible visual clutter. I'm not a fan of @_ => @_.Foo
.
@jnm2 Agreed, it would just keep from breaking existing code.
@jnm2 Well, you can always use something like x
, however, I understand why you would dislike it.
p.s. iirc @HaloFour made a proposal that addresses exactly this or something similar where we can do something like @.Foo
or .Foo
(can't remember) instead of _ => _.Foo
.
I'd stop using _ => _.Foo
in a heartbeat if I could use @.Foo
.
@jnm2 Indeed, it would be awesome if they would at one hand get rid of _
as a valid identifier and on another hand would allow people to do something like @.Foo
.
I'd support any kind of deprecation at this point. This seemed impossible to me but now I can see the light. Please deprecate more.
Is this seriously being considered or is this just a wish? Has the LDC suddenly had a change of heart where they're willing to take breaking changes?
What does "deprecate" mean in this context? I assume it wouldn't be a full removal from the language?
@yaakov-h I assume that in any place where today you can use a single _
as a valid identifier it wouldn't be possible with this change.
At the very least perhaps consider adding a warning wave to declaring a field or local using the underbar identifier.
I recall that something like this was already planned to be the case. And the whole process would likely be spread out over a few major language versions, regardless. (So for example, 8.0: Info, 9.0: Warning (maybe with exclusion from /warningsAsErrors
), 10.0: Error. But that's just spit-balling.)
Stumbled on this by accident and just had to add my 2c. I'd be very happy if _
was treated as discard consistently in C#. We're in a weird place right now where it's sometimes discard, sometimes not. I think that was a mistake, but one better fixed late than never.
As someone who makes heavy use of Rx, being able to discard with _
in lambdas would be π―. Random example of one place this would be useful:
this.WhenAnyValue(
x => x.DepartureLocation,
x => x.ArrivalLocation,
x => x.JourneyMode,
x => x.JourneyTime,
x => x.ShowAll,
(_, __, ___, ____, _____) => Unit.Default)
Having said that, I understand concern from @jnm2 et al regarding breaking changes. Personally, I think _
(and x
, and @
) are all terrible names for lambda parameters when a more descriptive name adds value (not the case above because the x
there is just a stand-in for this
). For example, this is much better:
items.Select(item => ...)
But of course, now we're getting into style arguments...
There's no reason we couldn't have this, and there's even a proposal on it:
this.WhenAnyValue(
_ => _.DepartureLocation,
_ => _.ArrivalLocation,
_ => _.JourneyMode,
_ => _.JourneyTime,
_ => _.ShowAll,
(_, _, _, _, _) => Unit.Default)
items.Select(item => item.Foo)
does not seem more descriptive than items.Select(_ => _.Foo)
.
There's no reason we couldn't have this, and there's even a proposal on it
Er, yes. It's what I'm asking for. But it would require that this proposal be accepted so that _
is not an identifier. Or maybe I'm missing something...?
items.Select(item => item.Foo) does not seem more descriptive than items.Select( => .Foo)
It was a very simple example based on yours. Add a SelectMany
and a Where
, all the while calling all variables (of differing) types _
, then see how easy it is to read :)
There's no reason we couldn't have this
Er, yes. It's what I'm asking for.
The code sample uses _
for both parameter names and discardβ it didn't sound like you were asking for that.
Add a SelectMany and a Where, all the while calling all variables (of differing) types _, then see how easy it is to read :)
Though in fact it's most predominantly used only once in a chain, I do this on a regular basis and like it:
foos.SelectMany(_ => _.Bars).Any(_ => _.Start != null || _.End != null)
The project I have open: 16.9% of the files use _.
, and in those files it occurs 6.59 times per file.
The code sample uses _ for both parameter names and discard
Ah, I overlooked your use of _
for parameter names. That is certainly not what I'm asking for. Thanks for clarifying.
@eyalsk I'd assumed a compiler warning for a while, up until it is outright disallowed on the next major language update (or whenever).
Attention please
Why not make it an option to enable/disable usage of discards all over the place?
like how unsafe is optional. that way it wont be breaking change.
Whilst this is a breaking change, it is only breaking if one uses "treat warnings as errors" (which hopefully we all do; oh to have that as the default in new projects!) and it need only be breaking for as long as it takes to select a "ignore this warning in this project" code fix.
@DarthVella Yeah, probably. :)
@DarthVella, @eyalsk,
Warning in v8, error in v9 seems a nice timeline for phasing out the use of _
as an identifier. I'd certainly not like to see either introduced in a minor release.
@DavidArno Yeah this is what I wrote in a comment to @MkazemAkhgary in his post. ;)
for a very rare case, __
(two _
) wouldn't be a warning. unless they decide to deprecate underbar-only identifiers.
(_, __) // former becomes discard, latter remain variable.
@MkazemAkhgary,
I'd personally like to see all underscore-only identifiers deprecated, even though I know this would break some of my own code (and thus the code of anyone who uses my library). The benefits of clarity would outweigh the small pain of fixing that broken code in my view.
@gafter My previous understanding was it's possible to automatically turn all _
identifiers in a scope into discards when there are two or more of them (a situation which previously resulted in a compiler error). This is a both a non-breaking change and delivers the "discards everywhere" feature. To me, the approach seems like an obvious choice; is there a reason why it didn't work in practice?
:memo: With this behavior in place, an analyzer could be used to deprecate _
as an identifier in code bases - specifically to catch cases where a single _
is not being used as a discard.
:memo: I have some code bases which this change would break. It wasn't commonly used, but I'm loathe to change the affected cases because they are in some of the more stable parts of code I've worked on in the past (approaching zero churn).
If we get .Select(@.Foo)
first, then this is okay. It's kind of harsh to deprecate a feature that is only used because we don't have anything better.
@sharwell's suggestion makes the most sense of any in this thread. Frankly I'm a little surprised at some of the supporters hereβ normally you'd be the first to say to allegations that a language feature is dirty, "If you don't like it, write an analyzer." π
Totally agree with @jnm2 and @sharwell on the matter.
You either don't need a complex identifier, as in items.Where
or you do need it and in that case x
won't be a good name.
I do like the way Scala solves it. Lambdas can be written as items.Where(_.Property == property)
and lambdas with _ => _
aren't used since _
is a wildcard. So, IMHO, having a special syntax for short lambdas would be extremely nice and will play with this proposal very well.
Otherwise, it will break the codebase in both pet and work projects I participate in.
@DavidArno
The benefits of clarity would outweigh the small pain of fixing that broken code in my view.
There is absolutely no benefit of clarity if we replace _
with x
, y
, z
and so on. I've seen such lambdas and they're awful regardless of the symbol used. On the other hand, for simple cases _
isn't worse than x
.
@Vlad-Stryapko @jnm2
Theoretically if C# takes the hardline approach and actually makes _
an illegal identifier they could reuse _
for lambda shorthand in a similar manner to Scala:
var foos = stuff.Select(_.Foo)
.Where(_.Property == property);
@HaloFour Yes, that's something I'd really love to have in C#.
As far as I know, current proposal for shorthand in this repo is a bit different from the way Scala works with them, but I personally find Scala's choice not ideal so we definitely might have something even better in C#.
@HaloFour Do you know of any reason that couldn't be done today, allowed everywhere it is unambiguous?
So even if this proposal results in just a warning wave for now, we could still benefit from lambda shorthand using _
at the same time.
@Vlad-Stryapko
I'm less familiar with how _
works in Scala, but after a brief read I can immediately see some behaviors that would be scary. For example, what do you think the following would do?
int[] numbers = GetSomeInt32Array();
int[] doubled = numbers.Select(_+_).ToArray();
The proposal on this repo is a bit limited in scope, somewhat intentionally. I modeled it more as a shorthand to instance method references from Java, where the type is inferred and it's only really the member that matters. I can see there possibly being some value in supporting more complex expressions but, in my opinion, the goal is primarily to shine focus on the member that is relevant to the transform, not simply provide another lambda syntax that avoids naming things. I don't care for the positional shorthand argument syntax in Swift, for example. It just seems sloppy to me. You're only going to have to write that code once but you're going to have to read and reason about it many times so it seems reasonable to require that the developer devise names to describe the relationship of those arguments.
@bondsbw
Do you know of any reason that couldn't be done today, allowed everywhere it is unambiguous?
Even if it could (and I imagine that it is probably possible), I would dislike the overloading of _
with yet another special meaning where the syntax has overlap with legal syntax today. If _
remained a legal identifier my examples above are immediately ambiguous to the developer who then has to determine if there is a variable named _
in scope.
@HaloFour
For example, what do you think the following would do?
If we had it in C#, I would expect that to mean Select(number => number + number)
. So basically we can refer to the function's argument as many times as we want but we're limited to functions with only one parameter.
With Scala's semantics, that would choose an overload of Select which takes Func<TSource, int, TResult>
and treat the first _
as an element of the sequence and the second _
as its position. I don't think it's obvious for the majority of C# developers so I'd prefer the first version with number => number + number
.
Presumably [using an identifier named
_
] is rare.
I thought it would be good to validate that assumption and since I performed a similar survey before (https://github.com/dotnet/roslyn/issues/14794#issuecomment-258961093), I dusted off my old code and ran the analysis. I looked at 875 recently updated C# GitHub repos with less than 100 stars (to exclude repos like Roslyn) to see how they use underscore as an identifier. The results:
_
as parameter name, and 308 used _
as parameter name, but didn't reference it inside the body of the lambda (so considering it to be a discard would be valid). That leaves 198 lambdas which would broken by this proposal._
as a discard._
.For the last point, a "use" means a use of a distinct symbol in a file. The 28 other uses were:
DEFRA/prsd-iws uses them to ignore arguments in FakeItEasy asserts. 7 uses, e.g. src/EA.Iws.Domain.Tests.Unit/NotificationAssessment/NotificationAssessmentStatusTests.cs, line 92:
A.CallTo(() => progressService.IsComplete(A<Guid>._)).Returns(false);
This is the most concerning "other" use I found, because it's a publicly exposed and documented member in a popular library (more than 1 million downloads on NuGet). For fun, here is its source code:
/// <summary>
/// Gets a constraint that considers any value of an argument as valid.
/// </summary>
/// <remarks>This is a shortcut for the "Ignored"-property.</remarks>
[SuppressMessage("Microsoft.Design", "CA1000:DoNotDeclareStaticMembersOnGenericTypes", Justification = "This is a special case where the type parameter acts as an entry point into the fluent api.")]
[SuppressMessage("Microsoft.Naming", "CA1704:IdentifiersShouldBeSpelledCorrectly", MessageId = "_", Justification = "Some trickery is allowed, isn't it?")]
[SuppressMessage("Microsoft.Naming", "CA1707:IdentifiersShouldNotContainUnderscores", Justification = "But it's kinda cool right?")]
[EditorBrowsable(EditorBrowsableState.Never)]
[CLSCompliant(false)]
#pragma warning disable SA1300 // Element must begin with upper-case letter
public static T _ => Ignored;
#pragma warning restore SA1300 // Element must begin with upper-case letter
RustServers-IO/oxideplugins seems to use it for various string manipulation helper functions. 14 uses, e.g. rust/HuntPlugin.cs, line 802:
sb.Append(_(HMK.Usage, player, _(skill.Usage, player)));
CatLib/Core uses them in tests as for
-loop iteration variables. 4 uses, e.g. src/CatLib.Core.Tests/Support/SortSet/SortSetTests.cs, line 97:
for (var _ = 0; _ < 100; _++)
Erroman/SF seems to use it as the the None
value of an enum
. 1 use in G.cs, line 123:
case ParameterUnit._:
iiwaasnet/kino uses it for a value that's usually discarded, but not always. 1 use in src/kino.Security/SecurityProvider.cs, line 96:
DomainPrivateKey _;
if (!mappings.TryGetValue(messageIdentifier, out _))
{
mappings.Add(messageIdentifier, key);
}
else
{
throw new Exception($"Message {message.Identity} is already mapped to Domain {_.Domain}!");
}
todor-dk/IronSmalltalk has it in its copy of the DLR, effectively as a discard. 1 use in DLR/Microsoft.Dynamic/Generation/CompilerHelpers.cs, line 470:
TypeConverter _;
return TryGetTypeConverter(fromType, toType, out _);
Not to mention that when I'm in pre-C# 7 codebases, I've been using var _ =
to discard values π³
@svick I love your approach. If I may ask, how long would it take to scale that up to cover another order of magnitude or more repos?
@Vlad-Stryapko,
The benefits of clarity would outweigh the small pain of fixing that broken code in my view.
There is absolutely no benefit of clarity if we replace
_
withx
,y
,z
and so on.
Whether there is benefit to replacing with _
with x
etc is orthogonal to what I said. To clarify, if _
were to cease to be a valid identifier, then I feel it sensible to do the same with __
. The two are already easily confused. Having one be a valid identifier and the other not, would just make that confusion worse.
@DavidArno I've understood your initial statement as if making all underscores invalid would benefit the clarity of the code since people would have to replace their broken code with something more meaningful. That's the point I certainly disagree with.
if _ were to cease to be a valid identifier, then I feel it sensible to do the same with __
If we had short syntax for lambdas and _
would act as a discard, I'd personally be fine with whatever happens to __
and other similar identifiers. I think both 'single underscore has a special meaning, multiple underscores are identifiers' and 'single underscore has a special meaning, multiple underscores are not allowed' are fine.
@Vlad-Stryapko yet you provide no convincing reason why multiple underscores are meaningful.
Ask yourself when you choose multiple underscores for variable names
Unless you are drunk you pretty much use it after you use _
first. It makes perfecr sense to deprecate all underscore-only identifiers. That is if c# team already decided to make big change, moving a little higher isnt big deal.
@MkazemAkhgary
you provide no convincing reason why multiple underscores are meaningful
They're already legal and that's why there must be a reason to make them illegal, not vice versa. The reason to make a single one illegal is to reserve it for the special purpose. I can't say I'm 100% sure multiple underscores must be disallowed so I refer to the principle 'don't create more restrictions and change anything unless it's really needed'.
@Vlad-Stryapko by all means, you wouldn't use __
normally unless you have _
somewhere declared. I'm pretty sure _
fans wouldn't go for extra underscore after this limitation, they probably choose x
anyway.
I just want to clarify how bad this feels for me, c# team is already ruining c# language, one that I used to love. now is getting dilapidated with these weird stuff. π
@MkazemAkhgary
c# team is already ruining c# language
Please... let's have some respect and try to keep it professional? just because you disagree with their decisions doesn't mean that they are ruining the language.
one that I used to love.
I have plenty to say about these kind of statements but I'll be gentle and say it wouldn't help your cause.
now is getting dilapidated with these weird stuff.
Sometimes we need to try and see the world through a different lens and then maybe it wouldn't be so weird.
ok, my apologize to who are offended, I try to keep it to my self. you are right, maybe its just matter of time. @eyalsk
@eyalsk I most definitely agree with you, and I personally never write code using _
, so this won't affect me, but I think everyone saw this coming and that it is an unnecessary breakage. This could have been avoided and it's not adding a lot of value vs. using a different character.
@aluanhaddad Yeah but the candidates were few and iirc required more changes to the language so they were pragmatic about it and given the amount of people that are using it, the amount of efforts required to make it work as an ignore character just make sense and let's be honest it looks good. π
Sorry @jnm2. π β€οΈ
As a related aside, I've thrown together a Roslyn analyzer to detect the use of identifiers with discard names. I may have missed some cases. Feedback welcome.
FWIW, StyleCop.Analyzers already disallows _ as a variable name - I had to run through my codebases and remove them when switching from StyleCop to StyleCop Analyzers.
@yaakov-h I believe StyleCop Analyzers makes an exception for _
as a parameter name.
@sharwell Ah, you're right - it was the multiple-underscore ones it didn't like, and I changed single-underscore ones at the same time. https://github.com/DotNetAnalyzers/StyleCopAnalyzers/issues/1606
It is unfortunate that the reader cannot easily distinguish at the use site between a discard and a use of an identifier named
_
. Presumably the latter is rare. Programs would be easier to read if one could reliably depend on_
being a discard.