dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.02k stars 4.03k forks source link

Proposal: Allow using discard for throwaway variables #8074

Closed dsaf closed 7 years ago

dsaf commented 8 years ago

Related to https://github.com/dotnet/roslyn/issues/20#issuecomment-70366000 and #8049.

Practical use cases:

http://nsubstitute.github.io/help/received-calls (Checking calls to properties)

https://confluence.jetbrains.com/display/ReSharper/Return+value+of+pure+method+is+not+used

Before:

var mode = calculator.Mode;
calculator.Mode = "TEST";

//Check received call to property getter. We need to assign the result to a variable to keep the compiler happy: "Only assignment, call, increment, decrement, and new object expressions can be used as a statement".
var temp1 = calculator.Received().Mode;
var temp2 = calculator.Received().Mode;
var temp3 = calculator.Received().Mode;

After:

var mode = calculator.Mode;
calculator.Mode = "TEST";

//Check received call to property getter
var * = calculator.Received().Mode;
var * = calculator.Received().Mode;
var * = calculator.Received().Mode;
HaloFour commented 8 years ago

I like the idea of being able to use wildcards to discard identifier names in various circumstances but this particular use case seems fairly contrived and extremely narrow. Building a language feature around the design of a specific unit test library does not sound like a good idea, and frankly you have 53 single-character temporary variable names at your quick disposal.

alrz commented 8 years ago

6400: let * = calculator.Received().Mode; though, currently according to the spec draft this is not possible because wildcard is not a complex-pattern.

dsaf commented 8 years ago

@HaloFour

Building a language feature around the design of a specific unit test library does not sound like a good idea

What about my second example - indicating that a pure function result is not used?

...and frankly you have 53 single-character temporary variable names at your quick disposal.

Those 53 single-character names can also be used for "omitting" lambda parameters. After all if a lambda parameter is often being omitted, surely it indicates a problem with specific API?

HaloFour commented 8 years ago

@dsaf

What about my second example - indicating that a pure function result is not used?

I'm not sure that the language should specifically cater to overzealous analyzers either. Between Resharper and FxCop I'd probably still be trying to figure out how to compile HelloWorld.cs.

Those 53 single-character names can also be used for "omitting" lambda parameters.

Fair point, and running through the alphabet isn't all that uncommon today.

After all if a lambda parameter is often being omitted, surely it indicates a problem with specific API?

I'd say that it's not that uncommon for a delegate to accept more parameters than is always required. But invoking a property getter and immediately discarding the result? In my opinion there is no valid situation in where that makes sense in an API.

dsaf commented 8 years ago

Fair enough.

DavidArno commented 8 years ago

I think this request has been prematurely closed. F# has this feature (it uses _) for many good reasons, eg for unused parameters in patterns. @dsaf, could you re-open this please (assuming it can be).

alrz commented 8 years ago

for many good reasons, eg for unused parameters in patterns.

I believe wildcards will be supported in patterns.

DavidArno commented 8 years ago

As well as being of use in patterns, the other main use for this feature in F# is for handling functions that return a tuple, triple etc, when the calling code is not interested in all values.

Using the tuple syntax described in Early View of C# 7 with Mads Torgersen, we may have a method declared as:

public (int x, int y) Compute(int z) => // transform z into x and y

Then, if I only care about x, currently I'd have to do:

var (x, junk) = Compute(someValue);

Instead, using a throw-away variable, we get a much cleaner piece of code:

var (x, _) = Compute(someValue);

or

var (x, *) = Compute(someValue);

Anywhere you'd use dummy, junk, notNeeded etc in any code would benefit from a throw-away variable as it makes it a lot clearer that that variable really isn't used.

alrz commented 8 years ago

that's exactly what #6400 is about.

HaloFour commented 8 years ago

And #20, for lambdas.

button.Click += (*, *) => { Console.WriteLine("Clicked!"); };

I think there is one for out parameters as well.

bool valid = int.TryParse(s, out *);

To note I like the idea of wildcards in many scenarios, I'm just not terribly keen on them for throwaway locals which is what this proposal is specifically about:

var * = SomeFunction();
DavidArno commented 8 years ago

@alrz

6400 is a hugely mixed bag and it's hard to see what it's actually supposed to be about. The outcome (as documented in the pattern spec) seems to the ideas of let being a read-only version of var, something to do with patterns and a keyword in front of (x, y) = ... tuple notation.

It's very unclear therefore (at least to me!) how this covers the idea of throw-away variables.

DavidArno commented 8 years ago

@HaloFour,

Allowing throw-away variables in some situations, but not others seems a recipe for a hugely confusing language extension, especially when the let stuff is taken into account. We could even end up with

let * = SomeFunction();

being valid and

var * =  SomeFunction();

being invalid. Seems an odd decision to me.

HaloFour commented 8 years ago

@DavidArno

Per #6400 that is already the case, let will support pattern deconstruction (including tuple decomposition) but var will not be changed from it's current state.

Also per @alrz 's comment let would not support just a wildcard pattern, so let * = expression; would not be valid. But using a wildcard within a complex pattern would be:

let Point(var x, *) = point;
Console.WriteLine($"The X coordinate is {x}.");
alrz commented 8 years ago

@DavidArno thrown-away variables are not about let or any other constructs, it's a pattern (#206). Unlike F#, C# doesn't need let * = F() either, because it allows you to just ignore the return value; simply call the function F();. As @HaloFour pointed out, ignoring property's return value is a specific scenario in this proposal and doesn't make sense overall.

DavidArno commented 8 years ago

@alrz

Whilst C# does indeed allow the following:

F();

where F is non-void, writing the above code leads to problems. Did I write that because I'm not interested in the return value, or because I didn't realise there was one? In the latter case, the fact that I don't handle it could be a sign of a bug. Good practice suggests therefore that I should make my intentions explicit (as, unsurprisingly, F# enforces) by being able to write:

var _ = F();

Now it's obvious that I'm just disinterested in the return. Due to the beauty of the Roslyn analyzers, I can also enforce this rule, without requiring a breaking change to the compiler.

A further use of this syntax can be seen in my own Succinc<T> project (which provides a means of implementing pattern matching, options and other other union types and the like in the existing C# language). I make extensive use of _ as a throw-away variable. In some cases, this is through necessity, such as when testing properties. Taking an example from UnionT1T2Tests.cs:

[Test]
public void AccessingCase2ForUnionWithT1_GivesMeaningfulException()
{
    var union = new Union<int, string>(2);
    try
    {
        var _ = union.Case2;
        Fail("Expected exception to be thrown");
    }
    catch (InvalidCaseException e)
    {
        AreEqual($"Cannot access union case {Variant.Case2} when case {Variant.Case1} is selected one.",
            e.Message);
    }
}

Here the line var _ = union.Case2; has to contain a throw-away variable as I'm assessing a property and the C# compiler mandates that a read of a property must be assigned to a variable. I then suffer the problem that eg ReSharper, or potentially other analyzers, complain that I'm not using _. If C# 7 supported * as a genuine "I don't care about this value, but I must assign a result somewhere" variable, then this would both enable us that want to explicitly indicate in code we don't care about the return and analyzers to be happy.

So I'd contend you are wrong: var * = ... makes complete sense.

DavidArno commented 8 years ago

@HaloFour ,

Do you have a link to a detailed explanation of let Point(var x, *) = point;, as I can't get my head around what that syntax means?

Thanks.

HaloFour commented 8 years ago

@DavidArno As I mentioned earlier I don't think language features should be designed around how testing libraries happen to function. You already have a perfectly valid form of syntax for accepting the result, and you can continue to use the legal identifier _ to do so if you choose. Beyond that it's a question of preference for style. The c-family of languages has never considered it to be a bad practice to discard the return value of a function and in many cases it is preferable to do so. My code would probably light up like a Christmas tree full of warnings given such an analyzer and I would simply not care to use one.

The proposal for let is covered at the following link, although it doesn't really go into a lot of detail. The #6400 proposal hashes through more syntax but be wary that some of it may have changed: https://github.com/dotnet/roslyn/blob/future/docs/features/patterns.md

In short, you're applying the pattern on the left-hand side of the assignment to the expression on the right. In this case it is taking the variable point, which is of record type Point and it is deconstructing it, assigning the first property to a new variable x and throwing away the second property.

Not a terribly useful example but you can effectively use any complex pattern defined in the spec. To put it into context with the example that you provided:

let (var x, *) = Compute(someValue);

Which takes the tuple returned from Compute, decomposes it extracting the first element into x and discarding the second element.

DavidArno commented 8 years ago

@HaloFour,

I'm really not following the thinking behind your position here.

What I'm arguing for is consistency across the language for any new feature. So, for example, you suggest that * will be able to be used in lambdas:

button.Click += (*, *) => { Console.WriteLine("Clicked!"); };

That begs the question, will I be able to declare an event handler like this:

private void ListChanged(object *, EventArgs *) 
{
    ...
}

when I don't use either parameter? I suspect the answer is "no", which will mean yet another inconsistency.

Currently, my use of _ is far from idiomatic and most certainly isn't perfect. It's actually a fragile feature, for the moment I need to use _ more than once in a single scope, I either need to declare it with object or dynamic, rather than var and then re-use it, or I can ditch _ and switch back to junk1, junk2 etc. By making * consistent across the whole language, this problem would be addressed.

alrz commented 8 years ago

@DavidArno I've been proposed something like ListChanged(object, EventArgs) for methods, delegates and records in #6115 which is exactly what you have in C/C++, I don't know how much it was appreciated (apparently it wasn't).

C-style languages traditionally allow the return value to be implicitly ignored, there's no reason to change that.

The reason is that it'll be a breaking change?

my use of _ is far from idiomatic and most certainly isn't perfect.

Spec for patterns is not at its final stage and might be changed. Similarly, I argue that constant-pattern also should be a part of complex-pattern so that you could write obj is 123 or 456 with OR patterns (#6235) instead of mixing multiple boolean and equality operators. And also is var doesn't feel right in property-pattern. I think it should be changed too. But I'm waiting to see what it will become. :shrugs:

HaloFour commented 8 years ago

@DavidArno

There is something to be said for consistency, sure. However, sometimes it doesn't make sense to apply things in absolutes. For example, you can't use var for parameter types or field types.

Should you be able to use * to ignore officially declared parameters in a method body as you might to ignore lambda arguments? In my opinion, no. It doesn't even matter that the two effectively result in the same thing, it doesn't make much sense to go to the effort of officially declaring a method if you don't actually want to declare all of it. With the lambda you're eschewing declaration of any particular method and the actual method name, metadata and implementation are left up to the compiler. With a formal declaration you are providing more or less the exact metadata to be embedded into the assembly, and parameter names are a part of that.

Also, pertaining to your test case use case, Microsoft's Framework Design Guidelines specifically state to avoid throwing exceptions from property getters.

DavidArno commented 8 years ago

@alrz,

How would allowing the return value of a method be assigned to the throw-away * be a breaking change?

@HaloFour,

"it doesn't make much sense to go to the effort of officially declaring a method if you don't actually want to declare all of it".

If the method is an event handler (or any other delegate-fulfilling method), then it must declare a set of parameters, regardless of whether it uses them. Using * to denote a parameter that exists simply to fulfil a signature contract seems a neat way to communicate this fact.

You make a good point re var though. Your code example:

let (var x, *) = Compute(someValue);

shows yet another worrying inconsistency creeping into the language. What's that var doing there? If I declare a lambda, I either have to specify the types for the parameters, or I just use the parameter name. I wouldn't do (var x, *) => ..., it would be (x, *) => .... The let syntax should therefore follow this convention and so the valid version of your example should be:

let (x, *) = Compute(someValue);
HaloFour commented 8 years ago

@DavidArno

The fact that a method is used to subscribe to an event doesn't change anything else about that method. It is still a formally declared method and parameter names are a part of that. If you want shorthand, potentially with argument omitting, you have lamdba syntax, which works just fine with event handlers.

The let syntax has nothing to do with the appearance of var there, let just decomposes a pattern. The pattern matching spec has included the var keyword to specify the inferred variable pattern. This is most likely to avoid any number of ambiguities, particularly in intent. Afterall, you don't want a typo leading to a variable declaration. The same pattern used in a switch statement:

switch (Compute(someValue()) {
    case (var x, *):
        Console.WriteLine($"X is {x}");
        break;
    default:
        Console.WriteLine("Oops, return value wasn't the tuple we expected.");
        break;
}
DavidArno commented 8 years ago

@HaloFour,

OK, leaving aside * for method parameters, as we will not agree on that one, the mystery var in patterns gets curiouser and curiouser. Your example seems wrong on a number of levels:

  1. I'd expect the syntax to be case (x, *): as, again, the var serves no purpose.
  2. The default here should be reported as unreachable by the compiler as if Compute returns a tuple, then (x, *) matches every possible return value.

I'd expect the following example to be valid, rather than your example, so there seems something wrong with the pattern matching spec if it's not:

switch (Compute(someValue()) 
{
    case (1, y):
        Console.WriteLine($"X was 1, Y was {y}");
        break; 
    case (x, *):
        Console.WriteLine("X wasn't 1");
        break; 
}
HaloFour commented 8 years ago

@DavidArno

I explained the purpose of var there, to declare the intent of declaring a new variable to receive the value. Leaving that as just the identifier would very likely lead to ambiguity. Note that the explicitly typed version would be int x or similar, so in that sense it is consistent with variable declarations in general. However you're free to argue that point in the pattern matching proposal.

You're very likely correct that my example would result in an "unreachable code" warning, I was keeping the sample simple. Your example would probably be a better demonstration of pattern matching in switch, that is if you included the var keyword so that it would compile. :smile:

jveselka commented 8 years ago

@DavidArno Considering your example - imagine you introduce variable "x" in enclosing scope. That would change meaning of the code drastically. I agree that omitting "var" would be nicer, but it is impossible with all the things pattern matching is trying to achieve.

DavidArno commented 8 years ago

@jveselka,

Could you show an example of what you mean please?

jveselka commented 8 years ago

@DavidArno I expected this should work:

int x = 5;
switch (Compute(someValue()) 
{
    case (x, *):
        Console.WriteLine("X is 5");
        break; 
}

But after checking pattern matching spec once more I was surprised it is not allowed. Still my case stands for const fields. If I wrap Your example in a class with const int x, it gets completely different meaning:

class Test
{
    const int x = 5;
    void Method()
    {
        switch (Compute(someValue())
        {
            case (1, y):
            Console.WriteLine($"X was 1, Y was {y}");
                break;
            case (x, *): // matches constant x = 5
            Console.WriteLine("X wasn't 1");
                break;
        }
    }
}
DavidArno commented 8 years ago

I decided to go away and experiment with the features branch, which is the version used in the Outline of C# 7 demo at MVP Summit 2015-11-02 write-up. I created the following code, which contains some long-winded code to replicate an Option<T> type in lieu of discriminated unions/ADTs being added to the experimental compiler, and also contains pattern matching as detailed in that write-up. The code I created, that compiled was:

using System;

namespace CSharp7Tests
{
    public class Optional
    {
        public sealed class NoValue : Optional
        {
            public static readonly NoValue NoValueInstance = new NoValue();
            private NoValue() { }
        }

        public class Option<T> : Optional
        {
            public Option(T value) { Value = value; }

            public T Value { get; }
        }

        public static readonly Optional None = NoValue.NoValueInstance;
        public static Optional Some<T>(T value) => new Option<T>(value);
    }

    class Program
    {
        static void Main(string[] args)
        {
            var x = Optional.Some(1);
            var result = x match(
                case Optional.NoValue _: "None"
                case Optional.Option<int> _x: $"int: {_x.Value}"
            );
            Console.WriteLine(result);
        }
    }
}

The important thing to note here is the line case Optional.NoValue _: "None". I had to express it that way as case Optional.NoValue *: "None" doesn't compile.

In other words, even pattern matching doesn't support the use of "throw away" variables in the manner that _ does in F#. Clearly this is a problem. Pattern matching in C# 7 clearly isn't a good implementation if I must name variables in a pattern, even if I don't use them.

I'd suggest that * in pattern matching is ill-thought-out and default would be a better choice of term for the wildcard pattern. * should be used as a "throw away" variable throughout the whole of the C# 7 syntax.

HaloFour commented 8 years ago

@DavidArno

For starters, the functionality of the compiler for the MVP summit was far from complete.

Second, you were using type patterns, not record patterns. As such the proper syntax for omitting the variable is simply omitting the identifier, e.g. case Optional.NoValue: The method you used to implement Option<T> and implement it is very likely not the way you'd go about normally writing such code. Testing an Option<T> would more likely resemble: var result = x switch ( case Some(var value): $"int: {value}", case None: "None" );

The switch statement will likely retain default, but the switch expression won't since default has some very specific semantics which will not be retained in the expression form. Using default everywhere in place of * would be verbose and pretty ugly.

DavidArno commented 8 years ago

@HaloFour,

Thanks for your reply. However, the comment "For starters, the functionality of the compiler for the MVP summit was far from complete." really highlights the problem with trying to work out what on earth is going on with C# 7. That demo's write-up is the most up-to-date single demo of the language I can find. It was demonstrated at a significant event. Why wasn't the most up-to-date thinking shown there and where do I find what is the latest version? There's dozens of random issues, each with conflicting information with no one appearing to be responsible for maintaining a true record of what is going on. How do you expect those not in the "inner circle" to offer opinion in a meaningful way if every observation is met with a "your are using/reading the wrong version of the spec"?

If the "switch expression" offers a different syntax to the switch statement, why was match abandoned for the former?

Why if, for a "type", can if use case SomeType x but for a "record" (which presumably is still a type) must I use the more verbose case SomeType(var x)?

Where are these - apparently random, and contradictory - decisions documented?

HaloFour commented 8 years ago

@DavidArno

I understand your frustration, it's not always clear what the current thinking or design is for a proposal and the roadmap issues haven't been updated in forever. The MVP summit three months ago was probably the last time Microsoft did a big demonstration for an audience so they put extra effort into demonstrating the features and we haven't seen anything similar since. For the more active proposals, like pattern matching, the best glimpse any of us really get is the documents that they've been updating that are generally referenced in the specs.

As for why match was changed to switch, the best comment I have on that subject is https://github.com/dotnet/roslyn/issues/5154#issuecomment-153499813 which doesn't explain any of the reasoning. I assume it was to avoid introducing a new keyword.

The type pattern and record pattern solve very different problems. The type pattern just does a type check optionally assigning the cast type to the identifier so that you don't have to cast:

object obj = new Person("Gates", "Bill");

if (obj is Person person) { }
// roughly equivalent to
if (obj is Person) {
    Person person = (Person)person;
}

Whereas the record pattern deconstructs the properties of record types and permits pattern matching against them:

object obj = new Person("Gates", "Bill");

if (obj is Person("Gates", var firstName)) {
    Console.WriteLine($"obj is a Person with the last name of Gates and the first name {name}");
}
DavidArno commented 8 years ago

@HaloFour,

Bill Wagner talk on C# 7 at NDC London is now available at https://vimeo.com/154708153.

It backs up what you say regarding switch vs match: the syntax is likely to change from the code I got working. Apparently the code on Github hasn't been updated to reflect the spec change yet though.

Thanks for the clarification re type patterns and record patterns.

DavidArno commented 8 years ago

I feel I've significantly hijacked this issue away from its original purpose (to provide * as a universal throw-away variable declaration) and leading on from that to improve the strange, seemingly random replacement of default with * in the pattern implementation of switch.

I will therefore create a new issue to propose the latter change as a step to achieving the former.

gulshan commented 7 years ago

Latest developments on this issue- #14794

jcouv commented 7 years ago

@dsaf, RC.2 includes a first crack at the discard feature. They are allowed in out vars, deconstruction, patterns and also in assignment.

The latter is _ = expression; (assuming that no local called underscore was declared). (see test) For backwards compatibility reasons, if a local named underscore was already declared, then _ = expression; means assigning to that local.

I'll go ahead and close this issue. Thanks for the proposal.

DavidArno commented 7 years ago

@jcouv,

Is RC2 released yet? I thought RC2 was the update to vs15 released a few days back, but _ = expression gives me a Cannot resolve symbol '_' error with that update.

jcouv commented 7 years ago

@DavidArno Yes. I think RC.2 was released a week back. Can you share the specific version/build version of VS2017 you're using? Or even better, the git SHA of the csc it uses (from memory: look under program files folder, then Microsoft Visual Studio, then 2017, then Tools or MSBuild, then the properties on csc.exe).

I checked the dev15-rc2 branch and it does have the test too.

DavidArno commented 7 years ago

@jcouv,

The Properties -> Details view shows the product version as "2.0.0-rc2-61205-04. Commit Hash: 743c...". Not sure how to get at the rest of the number though, sorry.

Does that look like the right version?

I'm just trying to add a test line of _ = 1; to an existing method and it tells me it can't resolve the symbol. I may of course be misunderstanding this new feature and it doesn't work like that.

jcouv commented 7 years ago

@DavidArno I have the same version as you do (csc.exe shows 2.0.0-rc2-61205-04. Commit Hash: 743ce3b..." and VS shows "Version 15.0.26006.2 D15REL") and _ = 3; has no problem. What is the error message you see?

dsaf commented 7 years ago

@jcouv Thank you!

DavidArno commented 7 years ago

@jcouv,

My apologies: the problem existed between the chair and keyboard! It compiles fine; the error message I was seeing is understandably coming from R#, which hasn't caught up with this new feature yet. Sorry for wasting your time 😞 (and thanks for implementing this feature; really happy with it now I've worked out what I'm doing! 😁)