dotnet / csharplang

The official repo for the design of the C# programming language

Champion "Lambda discard parameters" (VS 16.8, .NET 5) #111

Open gafter opened 7 years ago

gafter commented 7 years ago

Examples: (_, _) => 1, (int _, string _) => 1, void local(int _, int _) ...
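A minimal sketch of the lambda forms listed above, assuming a C# 9 or later compiler. It covers only lambdas and anonymous methods, the forms the C# 9 documentation describes. The class and member names (`LambdaDiscardDemo`, `Implicit`, `Explicit`, `Anonymous`) are made up for illustration.

```csharp
using System;

static class LambdaDiscardDemo
{
    // Two parameters named _ are both discards; the body cannot read them.
    public static readonly Func<int, string, int> Implicit = (_, _) => 1;

    // The same lambda with explicit parameter types.
    public static readonly Func<int, string, int> Explicit = (int _, string _) => 1;

    // Anonymous methods accept discard parameters as well.
    public static readonly Func<int, int, int> Anonymous = delegate (int _, int _) { return 1; };

    static void Main()
    {
        Console.WriteLine(Implicit(1, "a"));   // prints 1
        Console.WriteLine(Explicit(2, "b"));   // prints 1
        Console.WriteLine(Anonymous(3, 4));    // prints 1
    }
}
```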

See

DavidArno commented 7 years ago

Please, please, please could this apply to methods too? Sometimes, the parameters must be there, but just aren't used. By way of example, this Unit type has to have both R# and CA1801 suppressions for the operator overloads:

public static bool operator ==(Unit u1, Unit u2) => true;
public static bool operator !=(Unit u1, Unit u2) => false;

Being able to write those with discards to reassure those checks that the parameters must be there, but aren't used, would clean up such code nicely:

public static bool operator ==(Unit _, Unit _) => true;
public static bool operator !=(Unit _, Unit _) => false;

jnm2 commented 7 years ago

@DavidArno I believe the .NET Framework design guidelines have always been to use the names left and right specifically. https://msdn.microsoft.com/en-us/library/ms229004.aspx

DavidArno commented 7 years ago

@jnm2,

Oh, I didn't know that. I'll give that a try to see if it makes the compiler happier. Would still like to be able to use discards though.

acple commented 7 years ago

Would this proposal also permit discards in nested lambda expressions, e.g. _ => _ => true?

Enumerable.Range(0, 3)
    .SelectMany(_ => Enumerable.Range(0, 3)
        .SelectMany(_ => Enumerable.Range(0, 3))); // I don't need the parameters in the nested lambdas

gafter commented 7 years ago

@acple Possibly not, unless we also do something like https://github.com/dotnet/roslyn/issues/15793 at the same time.

acple commented 7 years ago

@gafter Thank you. In my opinion, lambda parameter shadowing is a bit dangerous, but discarded parameters are safe to shadow, so shadowing should be supported only for discarded parameters. Is this a shadowing issue or a discarding issue?

gafter commented 7 years ago

@acple This is a shadowing issue. When there is only one lambda parameter named _, it is a normal identifier that declares a parameter. It must be so for backward compatibility.
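To illustrate the distinction, here is a small sketch assuming the behaviour that shipped in C# 9: a lone `_` parameter is an ordinary identifier that the body can read, while two or more `_` parameters are discards. The names `SingleUnderscoreDemo`, `Doubler` and `Constant` are hypothetical.

```csharp
using System;

static class SingleUnderscoreDemo
{
    // One parameter named _ is a normal parameter, so the body may use it.
    public static readonly Func<int, int> Doubler = _ => _ * 2;

    // Two parameters named _ are discards; neither can be referenced in the body.
    public static readonly Func<int, int, int> Constant = (_, _) => 42;

    static void Main()
    {
        Console.WriteLine(Doubler(21));      // prints 42
        Console.WriteLine(Constant(1, 2));   // prints 42
    }
}
```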

acple commented 7 years ago

@gafter Okay, I understand. I hope the identifier definition rules will be partially relaxed... Thanks for the information! :smile:

MovGP0 commented 7 years ago

there should also be a compiler warning/error when using an _ named variable:

items.ForEach(_ => Console.WriteLine(_));

DavidArno commented 7 years ago

@MovGP0,

Unfortunately, that would be a breaking change. An analyzer could be added to this effect though.

jnm2 commented 7 years ago

I think use of a single _-named parameter should be encouraged! I don't intend to use it any less.

DavidArno commented 7 years ago

@jnm2,

I disagree. If I was going to use the parameter in the lambda body, then I'd use e.g. x. I think _ should only be used when the parameter(s) aren't used in the body, i.e. as discards.

I appreciate that's personal opinion though.

jnm2 commented 7 years ago

@DavidArno Exactly, it is personal opinion. My rationale is that I'd like to not have to specify a parameter at all, so until I can do .Where(@.Foo != null) or similar, _ => _.Foo seems to be the nearest I can get to not specifying a parameter name, readability-wise.

MovGP0 commented 7 years ago

I disagree that it is a personal opinion, since it is a common convention in functional languages that _ is a 'throwaway' value that is not to be used. It confuses developers who are used to the convention.

Thankfully it is quite simple to give it a proper name.

gafter commented 7 years ago

I disagree that it is a personal opinion...

Oh, the irony.

firelizzard18 commented 6 years ago

Nested discards (_ => _ => <expr>) should be supported on at least an opt-in basis.

OJacot-Descombes commented 6 years ago

Since an underscore is a valid identifier, shouldn't we use another character as the discard character? I suggest the minus sign.

(-,-) => or - =>

wanton7 commented 6 years ago

If I remember correctly, nameof was also a valid identifier before it was added. For consistency, _ should be used everywhere for discards. When this is added, I would expect a compile error if _ is accessed when the C# target language version is the same as or greater than the version where this was added.

HaloFour commented 6 years ago

@wanton7

If I remember correctly, nameof was also a valid identifier before it was added.

It still is. The nameof identifier only resolves to the keyword when there is no other identifier by that name in use:

static class Program {
    static string nameof(object o) => "tricksy!";

    static void Main() {
        string s = "s";
        Console.WriteLine(nameof(s)); // prints "tricksy!", not "s"
    }
}

Similarly, the C# compiler treats _ as a valid identifier when such an identifier is in scope, but as a "discard" when it's not, except with new syntax that could not be broken by always treating the identifier as a discard:

string s = "123";
{
    int _;
    int.TryParse(s, out _); // _ is a variable
}
{
    int.TryParse(s, out _); // _ is a discard
}
{
    int _;
    int.TryParse(s, out var _); // _ is always a discard
}

This is quite unlike Java where they did fully deprecate _ as a valid identifier in version 9, and they did fully deprecate var as a valid class name in version 10.

The C# team is pretty obsessed with backward compatibility, which is usually a great thing, except (IMHO) here where _ can mean any number of things and an intent to discard might overwrite a variable or field that happens to have an unfortunate name.
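A sketch of the trap described above, under the assumption that a class happens to declare a field named `_`. The `Counter` type and its members are hypothetical and exist only to show the binding rules.

```csharp
using System;

class Counter
{
    // Hypothetical field with an unfortunate name.
    private int _ = 100;

    public int Value => _;

    public void ParseAndIgnore(string text)
    {
        // Intended as a discard, but a field named _ is in scope,
        // so the parsed value is written into that field.
        int.TryParse(text, out _);
    }

    public void ParseAndReallyDiscard(string text)
    {
        // "out var _" is always a discard, so the field is left alone.
        int.TryParse(text, out var _);
    }
}

static class FieldOverwriteDemo
{
    static void Main()
    {
        var counter = new Counter();
        counter.ParseAndReallyDiscard("7");
        Console.WriteLine(counter.Value);   // prints 100
        counter.ParseAndIgnore("7");
        Console.WriteLine(counter.Value);   // prints 7
    }
}
```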

firelizzard18 commented 6 years ago

@OJacot-Descombes I would prefer ~ as a new discard keyword: ~ => { }, (~, ~) => {}

wanton7 commented 6 years ago

@firelizzard18 _ is already a discard keyword in C#. Adding another keyword that does exactly the same but is only used with lambdas will make C# harder to read.

firelizzard18 commented 6 years ago

@wanton7 I was responding to a previous comment by @OJacot-Descombes:

Since an underscore is a valid identifier, shouldn't we use another character as the discard character? I suggest the minus sign.

I was adding my two cents to that, saying that if the language designers agree with that comment, I think ~ would be better than - (as a universal discard keyword, not as something special for lambdas).

tpetrina commented 6 years ago

But still, (,) => ... is easier. You know what is even easier? Ensuring () => ... also works when you don't care about arguments...

theunrepentantgeek commented 6 years ago

@firelizzard18 Given that _ is already in use as the discard symbol in C# (since v7.0), the chances of a completely different symbol being introduced for discards in a different context are essentially nil.

theunrepentantgeek commented 6 years ago

Ensuring () => ... also works when you don't care about arguments...

This would make the language ambiguous.

void Volley(int i) { ... }
void Fling(Action action) { ... }
void Fling(Action<int> action) { ... }

void Demo()
{
    Fling(() => Volley(4));
}

If () => included the concept of discards, which overload of Fling() would be called?

Also, given that this is currently unambiguous (calling Fling(Action)), widening the definition of ()=> to include implicit discards would break existing code.
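The current, unambiguous binding can be checked with a runnable variant of the snippet above. This is only a sketch: the overloads return a string naming themselves so the chosen one is visible, and `OverloadDemo` is a made-up container class.

```csharp
using System;

static class OverloadDemo
{
    // Each overload reports which one was selected.
    public static string Fling(Action action) => "Fling(Action)";
    public static string Fling(Action<int> action) => "Fling(Action<int>)";

    static void Volley(int i) { }

    static void Main()
    {
        // A parameterless lambda can only convert to Action.
        Console.WriteLine(Fling(() => Volley(4)));   // prints Fling(Action)

        // A one-parameter lambda can only convert to Action<int>.
        Console.WriteLine(Fling(_ => Volley(4)));    // prints Fling(Action<int>)
    }
}
```

If () => were allowed to stand in for any parameter list, the first call would match both overloads.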

tpetrina commented 6 years ago

In that case, fail with an ambiguity error. It's not like we don't have errors in case of ambiguity.

HaloFour commented 6 years ago

@tpetrina

In that case, fail with an ambiguity error. It's not like we don't have errors in case of ambiguity.

But not in that case. () already legally means "no parameters". Changing that would break existing code which would make it a non-starter.

tpetrina commented 6 years ago

Then you can use (,,,) => {} syntax. I actually like using _ as a name sometimes.

DavidArno commented 6 years ago

@tpetrina, using _ as a name, rather than a discard, is just plain weird in my view.

I still find it disappointing that the LDM team won't follow Java's lead here and properly deprecate _ as a valid identifier. Sure, it'll break some folks' code, but a simple code fix that replaced them with @_ would quickly patch such code until folk had time to properly fix those names.

tpetrina commented 6 years ago

@DavidArno Lots of things are weird in this world, like date formatting in the US, and people still do it. For example, I would prefer:

public static bool operator ==(Unit, Unit) => true;
public static bool operator !=(Unit, Unit) => false;

jnm2 commented 6 years ago

@tpetrina When would those operators ever be useful? They only come into play when comparing two instances which are statically known to be Unit, so the code that would have used == or != should just remove the check. (Assuming you define Unit as an empty struct so that it doesn't have equality operators at all.)

tpetrina commented 6 years ago

That is not my code, see second post in the thread.

DavidArno commented 6 years ago

@jnm2, see my comment above. It's genuine code. Of course, whether they are needed as you say, is a whole different question. With v8 and nullable ref types, null will be able to replace unit in so many cases that the type itself might become unnecessary.

jnm2 commented 6 years ago

@tpetrina My bad! @DavidArno the fact that it's returning a constant means that the programmer should know that anyway; since operators are not virtual, that means the programmer already knows at the point of usage that both operands are Units. IEquatable<Unit> is still super important though.

OJacot-Descombes commented 6 years ago

It escaped me that _ was already a discard in some cases (C# 7.0). In this case VS should provide a code fix when _ is used as an identifier, maybe even a compiler warning, but in any case apply the keyword syntax coloring for discards.

DavidArno commented 6 years ago

@OJacot-Descombes, the language team are really reluctant to even make it a warning. Good practice says "treat warnings as errors". So when they introduce a new warning, that becomes a breaking change for some. The team really hate breaking changes. They set the justification bar very high for letting such changes through.

I suspect at some stage they may accrue enough evidence that _ as a discard and a legal identifier is causing sufficient confusion to justify it (as they did with changing closure behaviour with foreach loops), but we aren't there yet, I don't think.

CyrusNajmabadi commented 6 years ago

(as they did with changing closure behaviour with foreach loops),

I think those are very different scenarios. The loop case is one where the language was not just behaving in a counterintuitive fashion, it was actively leading people to write code that looked correct but could be very wrong depending on the circumstance.

The same doesn't really apply for _. It's normally a discard, except in the case where you don't want it to be a discard because you're using it as a variable. This duality doesn't really cause a problem with code. For example, if the code treats it as a discard, that doesn't really break your code (since you weren't using it as a variable). And if you were using it as a variable, preserving that logic doesn't break people who want it to be a discard, etc.

'foreach' was a very special case because of how actively broken reasonable-looking code was. Discard-confusion (if that really even amounts to anything) is practically nothing like that.
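For readers who have not seen the foreach case, here is a short sketch of the closure behaviour in question. Since C# 5, each foreach iteration gets a fresh loop variable, whereas a for loop still shares one variable across iterations; before C# 5 the foreach version behaved like the for version. `ClosureDemo` and its method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ClosureDemo
{
    public static List<int> CaptureInForeach()
    {
        var actions = new List<Func<int>>();
        foreach (var i in new[] { 1, 2, 3 })
            actions.Add(() => i);   // each iteration captures its own i (C# 5+)
        return actions.Select(f => f()).ToList();
    }

    public static List<int> CaptureInFor()
    {
        var actions = new List<Func<int>>();
        for (var i = 1; i <= 3; i++)
            actions.Add(() => i);   // every lambda captures the same i
        return actions.Select(f => f()).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", CaptureInForeach()));   // prints 1, 2, 3
        Console.WriteLine(string.Join(", ", CaptureInFor()));       // prints 4, 4, 4
    }
}
```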

CyrusNajmabadi commented 6 years ago

I still find it disappointing that the LDM team won't follow Java's lead here and properly deprecate _ as a valid identifier. Sure, it'll break some folks' code,

You've now broken people's code. For what benefit? What is the actual end value that is sufficiently positive to have people go through the pain of the break? Why not have both worlds? Provide the benefit, and not break people?

DavidArno commented 6 years ago

You've now broken people's code. For what benefit?

Clarity. And clarity wins every time in my book.

HaloFour commented 6 years ago

@CyrusNajmabadi

You've now broken people's code. For what benefit? What is the actual end value that is sufficiently positive to have people go through the pain of the break? Why not have both worlds? Provide the benefit, and not break people?

To resolve ambiguity. Even if it's uncommon for _ to be a field somewhere, it's perfectly legal, and now the language has a built-in trap for accidentally overwriting said field when someone wants to opt in to using discards, made worse by the fact that the compiler must favor the previous behavior in order to preserve compatibility. The compiler may know for a fact that there is no _ in scope, but the developer doesn't without going out of their way, and the tooling doesn't even help as Visual Studio displays both in black by default.

I'm sorry, Java's approach was significantly better here. Deprecate the identifier over two versions and finally remove it entirely. That's so much better than overloading two diametrically opposed concepts into a single character and trying to force that into the language.

CyrusNajmabadi commented 6 years ago

Clearly that's not the case for many customers. Why should those views win out over their preference to not be broken unnecessarily? Note: this is not hypothetical. Breaks happen all the time, and we do hear about it directly from people that are not at all happy that their code which has been working fine is now considered broken.

CyrusNajmabadi commented 6 years ago

and the tooling doesn't even help as Visual Studio displays both in black by default.

Tooling is fixable. There's no need to break people if we can improve things on the tooling front.

HaloFour commented 6 years ago

@CyrusNajmabadi

Clearly that's not the case for many customers. Why should those views win out over their preference to not be broken unnecessarily?

Break the few to benefit the many. Java's marketshare didn't take a dive when they did this. C# would've been just fine. The handful of projects using this would be updated and everyone would've moved on.

CyrusNajmabadi commented 6 years ago

I'm sorry, Java's approach was significantly better here. Deprecate the identifier over two versions and finally remove it entirely. That's so much better than overloading two diametrically opposed concepts into a single character and trying to force that into the language.

I disagree. But a bunch of that comes from using systems that have broken me over and over again, and just getting fatigued from it all. It's enough pain just dealing with APIs getting obsoleted. Upgrading something and having to go fix up a ton of code just sucks. Note: this also applies when there are 'fixes', as they don't always work properly and you have to go and examine and handle so many cases to deal with the fallout.

HaloFour commented 6 years ago

@CyrusNajmabadi

I disagree. But a bunch of that comes from using systems that have broken me over and over again, and just getting fatigued from it all.

This is why I'm comparing it to Java, not Swift. 😉

Java 10 just made it illegal to use var as a class name. Nobody is complaining.

jcouv commented 6 years ago

Filed https://github.com/dotnet/roslyn/issues/26594 for an analyzer that warns if you use _ as an identifier.

CyrusNajmabadi commented 6 years ago

Java's marketshare didn't take a dive when they did this.

There are far too many factors to be able to judge any sort of causal relationship here. For all we know they might have grown more and this was a limiting factor.

I'm not saying that this could not be done. I'm saying that the benefits seem absolutely minuscule and will just end up being a pain for people.

I mean, it's not like we don't have ample historical precedent here. We added 'var' and 'nameof' and didn't deprecate, and the world has gone on just fine, despite the fact that some people do use 'var' as an actual type name. The need to deprecate has not actually arisen, and it seems to only be language enthusiasts who seem to actually have a problem here. In practice the issues appear non-existent. Whereas, when breaks happen, there is enormous pain felt and we get that feedback straight on from large sections of customers hit by it.

CyrusNajmabadi commented 6 years ago

This is why I'm comparing it to Java, not Swift.

TypeScript is another example. I've been broken on nearly every update of that language. (Haven't tried 2.8 yet, but I expect the same.) IMO, that's more OK for that language given their target audience, and the acceptance of breaks/churns/new-shiny every month or so. C# and .NET target very different communities, including large organizations that are slower to adopt and are not as willing to take this sort of pain upon adoption.

HaloFour commented 6 years ago

@jcouv

Might I suggest also adding it to warning waves, if that's ever to happen?

@CyrusNajmabadi

I mean, it's not like we don't have ample historical precedent here.

Definitely, and don't get me wrong, I generally applaud the backcompat efforts by the team. In the majority of the cases if the compiler confuses an identifier with a contextual keyword the result is broken compilation. I'm sure that someone could come up with pathological examples where this isn't the case, but they're probably exceptionally rare. But in this very specific case mixing the two results in accidentally overwriting a variable or field unintentionally. This is why I lobbied against using _ as a discard/wildcard to begin with.

CyrusNajmabadi commented 6 years ago

or field unintentionally

I would definitely love a real world example of someone actually having a mutable field called '_', and then intending to have a discard, but having that discard write into the field...

--

It seems like that case would be much less common or likely than just the case of people using _ today and having that code now break.