dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Proposal: Language support for Tuples #347

Closed MadsTorgersen closed 7 years ago

MadsTorgersen commented 9 years ago

There are many scenarios where you'd like to group a set of typed values temporarily, without the grouping itself warranting a "concept" or type name of its own.

Other languages use variations on the notion of tuples for this. Maybe C# should too.

This proposal follows up on #98 and addresses #102 and #307.

Background

The most common situation where values need to be temporarily grouped, a list of arguments to (e.g.) a method, has syntactic support in C#. However, probably the second most common, a list of results, does not.

While there are many situations where tuple support could be useful, the most prevalent by far is the ability to return multiple values from an operation.

Your options today include:

Out parameters:


public void Tally(IEnumerable<int> values, out int sum, out int count) { ... }

int s, c;
Tally(myValues, out s, out c);
Console.WriteLine($"Sum: {s}, count: {c}");  

This approach cannot be used for async methods, and it is also rather painful to consume, requiring variables to be first declared (and var is not an option), then passed as out parameters in a separate statement, then consumed.

On the bright side, because the results are out parameters, they have names, which help indicate which is which.

System.Tuple:

public Tuple<int, int> Tally(IEnumerable<int> values) { ... }

var t = Tally(myValues);
Console.WriteLine($"Sum: {t.Item1}, count: {t.Item2}");  

This works for async methods (you could return Task<Tuple<int, int>>), and you only need two statements to consume it. On the downside, the consuming code is perfectly obscure - there is nothing to indicate that you are talking about a sum and a count. Finally, there's a cost to allocating the Tuple object.

Declared transport type:

public struct TallyResult { public int Sum; public int Count; }
public TallyResult Tally(IEnumerable<int> values) { ... }

var t = Tally(myValues);
Console.WriteLine($"Sum: {t.Sum}, count: {t.Count}");  

This has by far the best consumption experience. It works for async methods, the resulting struct has meaningful field names, and being a struct, it doesn't require heap allocation - it is essentially passed on the stack in the same way as the argument list to a method.

The downside of course is the need to declare the transport type. The declaration is meaningless overhead in itself, and since it doesn't represent a clear concept, it is hard to give it a meaningful name. You can name it after the operation that returns it (like I did above), but then you cannot reuse it for other operations.

Tuple syntax

If the most common use case is multiple results, it seems reasonable to strive for symmetry with parameter lists and argument lists. If you can squint and see "things going in" and "things coming out" as two sides of the same coin, then that seems to be a good sign that the feature is well integrated into the existing language, and may in fact improve the symmetry instead of (or at least in addition to) adding conceptual weight.

Tuple types

Tuple types would be introduced with syntax very similar to a parameter list:

public (int sum, int count) Tally(IEnumerable<int> values) { ... }

var t = Tally(myValues);
Console.WriteLine($"Sum: {t.sum}, count: {t.count}");  

The syntax (int sum, int count) indicates an anonymous struct type with public fields of the given names and types.

Note that this is different from some notions of tuple, where the members are not given names but only positions. Position-only members are a common complaint, though, since they essentially degrade the consumption scenario to that of System.Tuple above. For full usefulness, tuple members need to have names.

This is fully compatible with async:

public async Task<(int sum, int count)> TallyAsync(IEnumerable<int> values) { ... }

var t = await TallyAsync(myValues);
Console.WriteLine($"Sum: {t.sum}, count: {t.count}");  

Tuple literals

With no further syntax additions to C#, tuple values could be created as

var t = new (int sum, int count) { sum = 0, count = 0 };

Of course that's not very convenient. We should have a syntax for tuple literals, and given the principle above it should closely mirror that of argument lists.

Creating a tuple value of a known target type should enable leaving out the member names:

public (int sum, int count) Tally(IEnumerable<int> values) 
{
    var s = 0; var c = 0;
    foreach (var value in values) { s += value; c++; }
    return (s, c); // target typed to (int sum, int count)
}

Using named arguments as a syntax analogy, it may also be possible to give the names of the tuple fields directly in the literal:

public (int sum, int count) Tally(IEnumerable<int> values) 
{
    var res = (sum: 0, count: 0); // infer tuple type from names and values
    foreach (var value in values) { res.sum += value; res.count++; }
    return res;
}

Which syntax you use would depend on whether the context provides a target type.

Tuple deconstruction

Since the grouping represented by tuples is most often "accidental", the consumer of a tuple is likely not to want to even think of the tuple as a "thing". Instead they want to immediately get at the components of it. Just like you don't first bundle up the arguments to a method into an object and then send the bundle off, you wouldn't want to first receive a bundle of values back from a call and then pick out the pieces.

Languages with tuple features typically use a deconstruction syntax to receive and "split out" a tuple in one fell swoop:

(var sum, var count) = Tally(myValues); // deconstruct result
Console.WriteLine($"Sum: {sum}, count: {count}");  

This way there's no evidence in the code that a tuple ever existed.
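To see the three pieces (tuple type, tuple literal, deconstruction) side by side, here is a minimal sketch. It assumes the System.ValueTuple-based syntax that later shipped in C# 7.0, which is not necessarily what this proposal would have settled on, and the class name TallyDemo is made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class TallyDemo
{
    // Tuple return type with named members, mirroring a parameter list.
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        var s = 0; var c = 0;
        foreach (var value in values) { s += value; c++; }
        return (s, c); // literal is target-typed to (int sum, int count)
    }

    public static void Main()
    {
        var myValues = new[] { 1, 2, 3 };

        // Consume the result as one value with named members...
        var t = Tally(myValues);
        Console.WriteLine($"Sum: {t.sum}, count: {t.count}");

        // ...or deconstruct it immediately so the tuple is never named.
        (var sum, var count) = Tally(myValues);
        Console.WriteLine($"Sum: {sum}, count: {count}");
    }
}
```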

Details

That's the general gist of the proposal. Here are a ton of details to think through in the design process.

Struct or class

As mentioned, I propose to make tuple types structs rather than classes, so that no allocation penalty is associated with them. They should be as lightweight as possible.

Arguably, structs can end up being more costly, because assignment copies a bigger value. So if they are assigned a lot more than they are created, then structs would be a bad choice.

In their very motivation, though, tuples are ephemeral. You would use them when the parts are more important than the whole. So the common pattern would be to construct, return and immediately deconstruct them. In this situation structs are clearly preferable.

Structs also have a number of other benefits, which will become obvious in the following.

Mutability

Should tuples be mutable or immutable? The nice thing about them being structs is that the user can choose. If a reference to the tuple is readonly then the tuple is readonly.

Now a local variable cannot be readonly, unless we adopt #115 (which is likely), but that isn't too big of a deal, because locals are only used locally, and so it is easier to stick to an immutable discipline if you so choose.

If tuples are used as fields, then those fields can be readonly if desired.

Value semantics

Structs have built-in value semantics: Equals and GetHashCode are automatically implemented in terms of the struct's fields. This isn't always very efficiently implemented, so we should make sure that the compiler-generated struct does this efficiently where the runtime doesn't.

Tuples as fields

While multiple results may be the most common usage, you can certainly imagine tuples showing up as part of the state of objects. A particularly common case might be where generics are involved, and you want to pass a compound of values for one of the type parameters. Think dictionaries with multiple keys and/or multiple values, etc.

Care needs to be taken with mutable structs in the heap: if multiple threads can mutate, tearing can happen.
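As an illustration of value semantics and of tuples as a generic type argument, the sketch below keys a dictionary on two values at once. It assumes the C# 7.0 System.ValueTuple implementation; the type TupleKeyDemo, the method BuildRainfall and the sample numbers are all invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class TupleKeyDemo
{
    // Hypothetical lookup keyed on a (city, year) pair; the data is made up.
    public static Dictionary<(string city, int year), double> BuildRainfall()
    {
        return new Dictionary<(string city, int year), double>
        {
            [("Oslo", 2014)] = 1.0,
            [("Oslo", 2015)] = 2.0,
        };
    }

    public static void Main()
    {
        // Tuples with equal members compare equal and hash alike,
        // which is what makes them usable as dictionary keys.
        var a = ("Oslo", 2014);
        var b = ("Oslo", 2014);
        Console.WriteLine(a.Equals(b));
        Console.WriteLine(a.GetHashCode() == b.GetHashCode());

        var rainfall = BuildRainfall();
        Console.WriteLine(rainfall[("Oslo", 2015)]);
    }
}
```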

Conversions

On top of the member-wise conversions implied by target typing, we can certainly allow implicit conversions between tuple types themselves.

Specifically, covariance seems straightforward, because the tuples are value types: As long as each member of the assigned tuple is assignable to the type of the corresponding member of the receiving tuple, things should be good.

You could imagine going a step further, and allowing pointwise conversions between tuples regardless of the member names, as long as the arity and types line up. If you want to "reinterpret" a tuple, why shouldn't you be allowed to? Essentially the view would be that assignment from tuple to tuple is just memberwise assignment by position.

(double sum, long count) weaken = Tally(...); // why not?
(int s, int c) rename = Tally(...); // why not?
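Both assignments can be tried out with the tuple support that later shipped in C# 7.0, where conversions between tuple types are applied member by member and by position. The sketch below assumes that implementation; ConversionDemo, Widen and Rename are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ConversionDemo
{
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        var s = 0; var c = 0;
        foreach (var value in values) { s += value; c++; }
        return (s, c);
    }

    public static (double sum, long count) Widen(IEnumerable<int> values)
    {
        // int -> double and int -> long are implicit, so the tuple converts too.
        (double sum, long count) weaken = Tally(values);
        return weaken;
    }

    public static (int s, int c) Rename(IEnumerable<int> values)
    {
        // Same arity and member types; only the names differ.
        (int s, int c) rename = Tally(values);
        return rename;
    }

    public static void Main()
    {
        var w = Widen(new[] { 1, 2, 3 });
        var r = Rename(new[] { 1, 2, 3 });
        Console.WriteLine($"{w.sum} {w.count} {r.s} {r.c}");
    }
}
```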

Unification across assemblies

One big question is whether tuple types should unify across assemblies. Currently, compiler generated types don't. As a matter of fact, anonymous types are deliberately kept assembly-local by limitations in the language, such as the fact that there's no type syntax for them!

It might seem obvious that there should be unification of tuple types across assemblies - i.e. that (int sum, int count) is the same type when it occurs in assembly A and assembly B. However, given that structs aren't expected to be passed around much, you can certainly imagine them still being useful without that.

Even so, it would probably come as a surprise to developers if there was no interoperability between tuples across assembly boundaries. This may range from having implicit conversions between them, supported by the compiler, to having a true unification supported by the runtime, or implemented with very clever tricks. Such tricks might lead to a less straightforward layout in metadata (such as carrying the tuple member names in separate attributes instead of as actual member names on the generated struct).

This needs further investigation. What would it take to implement tuple unification? Is it worth the price? Are tuples worth doing without it?

Deconstruction and declaration

There's a design issue around whether deconstruction syntax is only for declaring new variables for tuple components, or whether it can be used with existing variables:

(var sum, var count) = Tally(myValues); // deconstruct into fresh variables
(sum, count) = Tally(otherValues); // deconstruct into existing variables?

In other words is the form (_, _, _) = e; a declaration statement, an assignment expression, or something in between?
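For comparison, the design that later shipped in C# 7.0 accepts both forms: a deconstruction can declare fresh variables or assign to existing ones. A minimal sketch under that assumption, with DeconstructDemo and Run as made-up names:

```csharp
using System;
using System.Collections.Generic;

public static class DeconstructDemo
{
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        var s = 0; var c = 0;
        foreach (var value in values) { s += value; c++; }
        return (s, c);
    }

    public static int[] Run()
    {
        // Declaration form: introduces sum and count.
        (var sum, var count) = Tally(new[] { 1, 2, 3 });
        int firstSum = sum;

        // Assignment form: reuses the variables declared above.
        (sum, count) = Tally(new[] { 10, 20 });

        return new[] { firstSum, sum, count };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```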

This discussion intersects meaningfully with #254, declaration expressions.

Relationship with anonymous types

Since tuples would be compiler generated types just like anonymous types are today, it's useful to consider rationalizing the two with each other as much as possible. With tuples being structs and anonymous types being classes, they won't completely unify, but they could be very similar. Specifically, anonymous types could pick up these properties from tuples:

Once in the language, there are additional conveniences that you can imagine adding for tuples.

Tuple members in scope in method body

One (the only?) nice aspect of out parameters is that no returning is needed from the method body - they are just assigned to. For the case where a tuple type occurs as a return type of a method you could imagine a similar shortcut:

public (int sum, int count) Tally(IEnumerable<int> values) 
{
    sum = 0; count = 0;
    foreach (var value in values) { sum += value; count++; }
}

Just like parameters, the names of the tuple are in scope in the method body, and just like out parameters, the only requirement is that they be definitely assigned at the end of the method.

This is taking the parameter-result analogy one step further. However, it would special-case the tuples-for-multiple-returns scenario over other tuple scenarios, and it would also preclude seeing in one place what gets returned.

Splatting

If a method expects n arguments, we could allow a suitable n-tuple to be passed to it. Just like with params arrays, we would first check if there's a method that takes the tuple directly, and otherwise we would try again with the tuple's members as individual arguments:

public double Avg(int sum, int count) => count == 0 ? 0 : (double)sum / count; // cast avoids integer division

Console.WriteLine($"Avg: {Avg(Tally(myValues))}");

Here, Tally returns a tuple of type (int sum, int count) that gets splatted to the two arguments to Avg.

Conversely, if a method expects a tuple we could allow it to be called with individual arguments, having the compiler automatically assemble them to a tuple, provided that no overload was applicable to the individual arguments.

I doubt that a method would commonly be declared directly to just take a tuple. But it may be a method on a generic type that gets instantiated with a tuple type:

var list = new List<(string name, int age)>();
list.Add("John Doe", 66); // "unsplatting" to a tuple

There are probably a lot of details to figure out with the splatting and unsplatting rules.
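Without any splatting rules, the same calls have to be spelled out by hand. The sketch below shows what that looks like with C# 7.0 tuples, assuming no implicit splatting or unsplatting exists; the Apply helper is a hypothetical stand-in for an explicit splat and is not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class SplatDemo
{
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        var s = 0; var c = 0;
        foreach (var value in values) { s += value; c++; }
        return (s, c);
    }

    public static double Avg(int sum, int count) => count == 0 ? 0 : (double)sum / count;

    // Hypothetical helper: spreads a 2-tuple over a two-argument function.
    public static TResult Apply<T1, T2, TResult>(Func<T1, T2, TResult> f, (T1, T2) args)
        => f(args.Item1, args.Item2);

    public static void Main()
    {
        var myValues = new[] { 1, 2, 3, 4 };

        // Manual "splat": deconstruct, then pass the pieces.
        var (sum, count) = Tally(myValues);
        Console.WriteLine($"Avg: {Avg(sum, count)}");

        // The same thing through the helper.
        Console.WriteLine($"Avg: {Apply<int, int, double>(Avg, Tally(myValues))}");

        // Manual "unsplat": build the tuple explicitly with a tuple literal.
        var list = new List<(string name, int age)>();
        list.Add(("John Doe", 66));
        Console.WriteLine(list[0].name);
    }
}
```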

RichiCoder1 commented 9 years ago

So much :+1:. A lot of useful applications for this. I'd vote yes for unification across assemblies, as there could be legitimate cases where being able to return a Tuple would best match the intentions of a library's API (e.g. multiple returns). Deconstruction into existing variables might confuse developers, but I could see many cases where you might want it. (ex:)

int counter;
bool shouldDoThing;
try {
    (counter, shouldDoThing) = MyMethod(param);
} catch (Exception ex) {
    // Handle exception
}

Tuple members in scope sounds very useful. How would it handle cases like yield return though? Just wouldn't be allowed?

I think I'm against having a Tuple be able to be implicitly "splat". I'd be a much bigger fan of a JavaScript-esque spread operator so rather than

public double Avg(int sum, int count) => count == 0 ? 0 : (double)sum / count;

Console.WriteLine($"Avg: {Avg(Tally(myValues))}");

you'd do

public double Avg(int sum, int count) => count == 0 ? 0 : (double)sum / count;

Console.WriteLine($"Avg: {Avg(...Tally(myValues))}");

or something similar. In the grand scheme of things though, this may not be nearly as confusing to developers as I'm thinking.

axel-habermaier commented 9 years ago
  1. F#'s tuples are reference types, apparently a decision made after performance measurements of the F# compiler. I would agree, though, that value types are preferable. Even with all the syntactic support from the proposal, you probably won't use tuples as much in C# as in F#. Anyway, I'm just saying that you maybe should talk to the F# guys about this.
  2. Splatting would be very useful in my opinion as well as having the members of out-tuples in scope.
  3. F# supports an automatic "conversion" of methods with out parameters to methods with a tuple return type, so instead of let result = dictionary.TryGetValue(key, &value) you can just write let (result, value) = dictionary.TryGetValue(key). That might be worth considering so that old APIs can automatically take advantage of the new syntax. The order of the elements in the tuple should probably follow the order of appearance in the method signature; that is, the actual return value first, followed by all out parameters in sequence.
  4. I like where you go with the tuple and anonymous type syntax, i.e. (int p1, string p2) for tuples and {int P1, string P2} for anonymous types. It's unrelated to tuples, but I'd also like to see such syntactic sugar for delegates, so that I can write, as in C++, Func<int(int, int)> or maybe even with parameter names Func<int(int p1, int p2)>. Or with a tuple return type Func<(int r1, int r2)(int p1, int p2)>.
  5. The current tuple declaration syntax (int p1, string p2) suffers the same deficiency as the record proposal: Parameters start with lower case characters, whereas the resulting properties should be upper case, so that you can access the tuple with tuple.MyProperty instead of tuple.myProperty. Probably a unified solution should be considered for both. Though I don't like the solution of the record proposal (namely, to specify both names in the form of int x : X); I'd much rather prefer to declare the parameter lower case and have them converted to upper case automatically. It's true that you encode coding style into the compiler that way, but who isn't using the default .NET naming conventions anyway?
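The out-parameter conversion from point 3 can be approximated by hand today. The sketch below assumes C# 7.0 tuples and wraps TryGetValue in a hypothetical extension method TryGet (not a framework API) that returns the bool result first and the out value second, matching the ordering suggested above.

```csharp
using System;
using System.Collections.Generic;

public static class OutToTupleDemo
{
    // Hypothetical wrapper: the method's return value first, then its out parameter.
    public static (bool found, TValue value) TryGet<TKey, TValue>(
        this IDictionary<TKey, TValue> dictionary, TKey key)
    {
        bool found = dictionary.TryGetValue(key, out TValue value);
        return (found, value);
    }

    public static void Main()
    {
        var ages = new Dictionary<string, int> { ["John Doe"] = 66 };

        var (found, age) = ages.TryGet("John Doe");
        Console.WriteLine($"{found}: {age}");

        // A missing key yields false and default(TValue).
        var (missing, none) = ages.TryGet("Nobody");
        Console.WriteLine($"{missing}: {none}");
    }
}
```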

MgSam commented 9 years ago

I think the proposal is generally good; however, some issues that came to mind:

omariom commented 9 years ago

Actually simple value types may not incur the cost associated with copying as they are good candidates for inlining.

orthoxerox commented 9 years ago

I think simple cases like (var sum, var count) = Tally(myList) could very well be optimized into simply pushing two values onto the stack, and then popping them into the new variables, so there would be no tuple creation overhead.

ufcpp commented 9 years ago

Can we use "empty tuple"? If the empty tuple () is allowed, we can use it as Void or so-called Unit type which solves the issue #234 without CLR enhancement.

() M() => (); // instead of void M() {}
Func<()> f = () => (); // instead of Action

axel-habermaier commented 9 years ago

@ufcpp: Great idea!

erik-kallen commented 9 years ago

Just a thought, but is it possible to use the System.Tuple<> type(s) for this?

(int a, int b) M((int sum, int count) x)

could be transformed to

[return: TupleMemberNames("a", "b")] Tuple<int, int> M([TupleMemberNames("sum", "count")] Tuple<int, int> x)

Of course, this would not work straight with generics, but I imagine that problem is rather close to the object/dynamic differentiation that is already implemented.

This idea does mean that the return value is not a value type, but it does solve the unification between assemblies issue.

gafter commented 9 years ago

@erik-kallen The disadvantages of the existing tuple types are that

  1. they are reference types, and
  2. you cannot change the names of the members: they are Item1 and Item2, etc.

erik-kallen commented 9 years ago

@gafter I probably wasn't clear enough when I wrote what I did, but my intention was that when the compiler encounters these special attributes it could create a temporary type which has a runtime type of System.Tuple, but with aliases for the members. So if you have a parameter declared with [TupleMemberNames("sum", "count")] Tuple<int, int> x, then the access x.sum would be translated to x.Item1 by the compiler, and the source code need not care that it is actually a System.Tuple.

I acknowledge the reference type thing to be an issue in my idea, though.

gafter commented 9 years ago

@erik-kallen We're actively exploring the idea of "erasing" the member names using attributes as you suggest. However we're leaning toward using struct versions of tuples.
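For readers curious what name erasure looks like in practice, the sketch below inspects a tuple-returning method with reflection. It assumes the mechanism that later shipped in C# 7.0, where the runtime type is System.ValueTuple<int, int> and the names travel in a TupleElementNamesAttribute on the return parameter; this is not necessarily the scheme being explored at the time of this comment. ErasureDemo and ReturnNames are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class ErasureDemo
{
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        var s = 0; var c = 0;
        foreach (var value in values) { s += value; c++; }
        return (s, c);
    }

    // Reads the member names the compiler recorded for Tally's return value.
    public static string[] ReturnNames()
    {
        MethodInfo method = typeof(ErasureDemo).GetMethod(nameof(Tally));
        var attribute = method.ReturnParameter
            .GetCustomAttribute<TupleElementNamesAttribute>();
        return attribute.TransformNames.ToArray();
    }

    public static void Main()
    {
        MethodInfo method = typeof(ErasureDemo).GetMethod(nameof(Tally));

        // The names are not part of the runtime type itself.
        Console.WriteLine(method.ReturnType == typeof(ValueTuple<int, int>));
        Console.WriteLine(string.Join(", ", ReturnNames()));
    }
}
```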

ryanbnl commented 9 years ago

How would this be supported in lambdas? For example, I have this method:

public static void DoSomething(Func<string, (string x, string y)> arg)

With type-inference, the argument looks like this:

(a) => { return (x=a, y=a); }

Is the return type always inferred? If not, you get something really weird:

(string x, string y) (a) => { return (x=a, y=a); }

gafter commented 9 years ago

@ryanbnl They would be target-typed. In the absence of a target type

paulomorgado commented 9 years ago

Regardless of being a value or reference type, like @erik-kallen, I envision tuples to be like System.Tuple...

The difference between tuples and anonymous types is that anonymous types are types with properties with well-defined names and types that you can pass around to libraries like ASP.NET routing and Dapper, whereas tuples have well-defined properties (Item1, Item2, ...) that can be aliased at compile time.

However, in order to make them useful for function return values, the compiler could apply an attribute with the compile time names, like parameter names are part of the function signature.

MgSam commented 9 years ago

Another thought- while not directly related to tuples, the idea that the compiler can provide a strongly-named way of using tuples (avoiding Item1 and Item2) seems like it could be extended to making a more strongly named sort of dictionary (where the items have more meaningful names than .Key and .Value).

I often run into situations where I need to index some collection on more than one dimension and you're forced to either build custom types or use multiple dictionaries which might share the exact same type signature. If that happens then the only differentiation is in the variable name and possibly XML comments.

Example:

class Baz {
     public String LongName {get; private set;}
     public String ShortName {get; private set;}

     ...
}

void Foo(IEnumerable<Baz> enumerable) {
    var lookup = enumerable.ToDictionary(e => e.ShortName, e => e);
    ...
    Bar(lookup);
}

void Bar(IDictionary<String, Baz> lookup) {
    //Now what did lookup index by? LongName or ShortName? I need to check calling code or documentation to know. 
}

gafter commented 9 years ago

@MgSam This proposal explicitly describes support for tuples with named members.

coldacid commented 9 years ago

I feel it would be important for tuples to work across assemblies. I could think of a few places in most of the projects I've had at work that would be improved by this proposal, and in almost every case it's in an interface implemented in one assembly and consumed in another. Whether or not the existing Tuple classes are used, serious consideration should be made that this proposal allows for exposure of tuple returns across assemblies.

Maybe it'd be good to add struct analogues to the existing classes?

MgSam commented 9 years ago

@gafter I know, I just thought the problem was similar enough to warrant mentioning here. There really isn't much of a distinction in GitHub between proposal threads and discussion threads, and in any case I don't have a good enough solution in mind to make a separate proposal.

billchi-ms commented 9 years ago

Some of this has been said, so some of this is +1 to those comments :-). My thoughts (albeit colored by a decade plus of hacking Common Lisp) ...

Do not conflate immutability and tuples, let me decide independently how they are used because I know sometimes I want to mutate some state.

Do use structs because a primary use is Multiple Value Return (MVR), and that should be efficient. In the same way spurious boxing affected Roslyn perf, an inner loop using MVR could become a perf impact in a large but responsive application.

I can't believe I'm asking for more syntax, but please consider "return values(...)" akin to yield return, where I make it very clear to the reader I'm returning multiple values. Though I do admit the parentheses and no comma op are almost manifest enough, but I feel I want "return values" :-).

Tuples will definitely be used beyond MVR. We often have little carrier or terd types that suck to have to name and make more first class. Consider in one place in my program I need a list of Foo's, and it turns out I'd like to timestamp the items in the list. The timestamp is meta to the abstraction of being a Foo, and I really don't want to declare FooWithTimeStampForSinglePlaceUsage :-). Note too, in this scenario I want to mutate the timestamp with user activity.

Do support partially supplied tuple elements for productivity, and allow me to deconstruct to fewer vars than elements. Perhaps in the MVR case I can declare default values for elements if they are not supplied, or you just default to default(T). This works well too with your idea of using named-arg-like syntax for filling in SOME of the elements and defaulting others. OFTEN when using MVR you do not need all the values returned because the extra values are only sometimes helpful. Imagine Truncate returns the Floor of x/y and a remainder, but 90% of the time I just need the first value of the integer division. It may be too much for C#, but I'd also consider that if I have a deconstructing binding site with more variables declared than the expression returns, then you just fill in my extra values with default(T) ... I'll fill them in with actual values in the next few lines of code, but now I have the tuple I want already in hand without an intermediary needed.
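The "fewer vars than elements" case can be sketched with the discard syntax that later shipped in C# 7.0, where _ in a deconstruction ignores a member. This is an assumption about the eventual language rather than part of the proposal, it does not cover the default-filling idea, and Truncate here is a hypothetical helper modelled on the example above.

```csharp
using System;

public static class DiscardDemo
{
    // Hypothetical MVR method: integer quotient plus remainder.
    public static (int quotient, int remainder) Truncate(int x, int y)
        => (x / y, x % y);

    public static int QuotientOnly(int x, int y)
    {
        // The remainder is computed but deliberately ignored via the discard.
        var (quotient, _) = Truncate(x, y);
        return quotient;
    }

    public static void Main()
    {
        Console.WriteLine(QuotientOnly(7, 2));

        // When both values are wanted, deconstruct them all.
        var (q, r) = Truncate(7, 2);
        Console.WriteLine($"{q} remainder {r}");
    }
}
```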

I didn't think too deeply, but it seems some sort of unification across assemblies is needed for a smooth MVR experience (where these may be used the most).

I'd also unify with anon types at least to the extent of implicit conversions (note, this would be for productivity coding, but yes, too much convenience in the hands of the masses can lead to too much inefficiency in code :-)).

I really think what you call "tuple members in scope" is VERY NOT C#. It smacks of an expression based language (which C# is not) where falling off functions returns the last value of the last expression of whatever branch you were in. It is also very subtle for C#, and I think the MVR feature should be a bit more explicit, like 'ref', for ready readability.

I like adding splatting, but I think it should be explicit (a la funcall vs. apply in Common Lisp, or * in Python). I get that we already do some not readily readable resolution around params arrays, but I'd strongly consider breaking from the precedent here for manifest readability.

Thanks for listening! Bill

paulomorgado commented 9 years ago

@billchi-ms, some proposals for C# 7 make it look like we should never have to write types again, but we should stay away from that temptation.

I like the proposal and I think it should be optimized for MVR because it's what we don't have today. Many are using the current Tuple types with the readability penalties, and that should be improved.

I've recently started using Python to work on an existing code base and I find value in tuples as return types. Other than that, I'm a bit cautious.

armenmk commented 9 years ago

Regarding this:

public (int sum, int count) Tally(IEnumerable<int> values) { ... }

why not express it like this instead?

public {int sum, int count} Tally(IEnumerable<int> values) { ... }

Curly braces express the returned structure; parentheses are perceived rather as function arguments.

coldacid commented 9 years ago

@armenmk I think I'd prefer parens to curlies for the returns. Most languages (that I've dealt with, anyway) that allow for multiple returns use parens on the left side for placing the return values into variables, and curly braces always indicate blocks of code.

armenmk commented 9 years ago

Like this a lot.

public (int sum, int count) Tally(IEnumerable<int> values) 
{
    sum = 0; count = 0;
    foreach (var value in values) { sum += value; count++; }
}

Among other things, it decouples the flow definition from the returned value(s). Remember how often you create a local variable, assign/reassign it in the function and then return it. This is the same thing, but shorter and less verbose.

DavesApps commented 9 years ago

There has actually been a request for multiple return values for some time. I'm not a big fan of tuples but rather built in syntax for supporting multiple return values. Take a look at this conversation line for more ideas folks have had: http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2083753-return-multiple-values-from-functions-effortlessly

Some function definitions like: public {[default]bool success, MyClass2 Class2} MyFunction(); or public static TryParseResult {bool Success,int value, string Error} Int32.TryParse(string value); perhaps

This would allow default handling if desired to just use default values and support backward compatibility for methods that changed to support multiple types.

So something like:

if (Int32.TryParse(mystring)) //would still work

But also var ret=Int32.TryParse(...)

would offer the ability to access the error string if desired.

Providing functional refactoring capability as well as the ability to provide additional information as part of a return if needed.

iam3yal commented 9 years ago

Not sure whether people read the discussion over uservoice so I'll just post my suggestion here.

For a very long, long time I've been jealous of some languages that can just return multiple values out of a function, especially the way it's implemented in Ruby and Lua; it's just beyond amazing and simple.

Here is the way we can take advantage of this feature in Lua

function GetPoint2D(x, y)
    -- Do something with x and y
    return x, y
end

local x, y = GetPoint2D(1, 2)

I know we can use Tuples, Arrays, ref/out and whatnot to do the same thing but they all have too few pros if at all and many cons in the context of this problem and I'll elaborate.

Tuples save you from creating a new class to hold some values but then the properties are unnamed.

So we can already access var in the local scope of the function and get everything to work nicely, we just need a way to expose the anonymous type that the compiler creates and I thought that it makes sense to use the var keyword to do it.

It would be really nice to have something along these lines.

public var GetPoint2D(float x, float y)
{
    // Do something with x and y
    return new { X = x, Y = y };
}

And then the usage is quite simple and trivial.

var point = GetPoint2D(1f, 2f);

I'm not sure what are the challenges here and whether it's possible but it can be quite amazing to have a solution for this rather than all these hackish approaches that clutter the code and make maintainability and everything else quite hard.

fjovine commented 9 years ago

I have a couple of comments about this subject.

  1. Go explicitly supports multiple valued return as a tool to eliminate exceptions. See https://golang.org/doc/faq#exceptions so a possible long term effect could be that of reducing the exception usage if implemented and correctly included in the libraries.
  2. Polymorphism in the OO languages I know is only implemented for function parameters, not for return values, although this could be useful in some situations. Out parameters in C# give some flexibility over Java, for instance, but explicit usage of polymorphism on multiple return types could be helpful.

A previous example shows well what I mean.

This code increments count uselessly if not needed.

public (int sum, int count) Tally(IEnumerable<int> values) 
{
    sum = 0; count = 0;
    foreach (var value in values) { sum += value; count++; }
}

If polymorphism could take into account return types as well, one could write a second method

public int Tally(IEnumerable<int> values) 
{
    int sum = 0;
    foreach (var value in values) { sum += value; }
    return sum;
}

So from the user side one could write

(sum, count) = Tally(intEnum); 

and the compiler would select the first version

sum = Tally(intEnum);

and the compiler would select the second one.

bc3tech commented 9 years ago

Really like this idea. I did use DynamicObject to create Python's namedtuple construct using classes, but of course it lacks intellisense and isn't a language construct, just a class.

iraSenthil commented 9 years ago

Love it!

jacobcarpenter commented 9 years ago

Does the destructuring/deconstruction syntax require you to use the names of the fields of the tuple exactly?

That certainly doesn't feel symmetric to invoking a method. You don't have to name your local variables the same names as a method's parameters, if you want to pass them as arguments.

I really value that you want to name the tuple's values to give them meaning, but it's very easy to conceive of scenarios where I want to call a new-tuple returning method, but I already happen to have a local variable with a conflicting name.

It seems like deconstruction should allow callers to escape the names that the method author originally chose. Deconstruction seems like a very caller-focused feature already, anyway.
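In the deconstruction that later shipped in C# 7.0, matching is by position, so the caller's variable names are independent of the tuple's member names. A minimal sketch under that assumption, with NamingDemo and Total as hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public static class NamingDemo
{
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        var s = 0; var c = 0;
        foreach (var value in values) { s += value; c++; }
        return (s, c);
    }

    public static int Total()
    {
        // An existing local that would clash with the tuple member name.
        int sum = 100;

        // Positional deconstruction: the caller picks its own names.
        var (total, n) = Tally(new[] { 1, 2, 3 });

        return sum + total + n;
    }

    public static void Main()
    {
        Console.WriteLine(Total());
    }
}
```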

scott-fleischman commented 9 years ago

I propose these be called records rather than tuples. A tuple tends to have unnamed elements, whereas this proposal is about explicitly giving names to the elements.

AZBob commented 9 years ago

IMO, the comma should be an operator rather than a syntactic separator. The names of the values aren't really useful as much as their positions. "Declaring" them in the function signature has merit, IMO, but declaring their types seems like duplication. Example:

private (,,,) MyTupleReturningFunction(int p1, string p2) {

  long val1;
  int val2;
...
  return val1, val2, val1 + val2 / 42;
}

Less typing (both with the keyboard and declarations), and I think that's pretty obvious what's happening. I think this strikes a good balance between documenting the function's return and having to declare every little thing. Having to declare a named return "parameter" in the function signature and set that variable explicitly within the scope of the function seems a bit clunky and duplicative since the real power behind the Tuple is its position, not its name. The types of the return types don't need to be declared both in the function signature and whatever variable is being returned, IMO; that can be discovered from the return statement.

Also, it would be nice if the compiler allowed fewer variables to be assigned on function return than the function provides (using the prior function signature):

var ephemeral1, var ephemeral2 = MyTupleReturningFunction(val1, val2);

The opposite (allowing more returns than the function provides) could cause problems during refactorings later, as well as non-obvious null/default values during runtime, which could cause hard to find bugs if/when the developer misreads the function signature's return.

Also, it would be nice to allow short-cutting with var like that so that the types of all the return values don't need to be declared. The prior example would become:

var ephemeral1, ephemeral2 = MyTupleReturningFunction(val1, val2);

NightElfik commented 9 years ago

Great proposal! One thing I would like to see in addition to already proposed features is ability to omit certain returned values. Sometimes I am interested in only some values and language should not force me to name other variables that I don't care about:

(var sum, _) = Tally(myValues);  // I do not care about the second value.

hclarke commented 9 years ago

I like the idea of making them value types, to avoid allocations and such.

I'm not convinced that named tuple elements are the right choice, though. This could instead be implemented with a struct version of System.Tuple<...> (System.ValueTuple<...> ?), and some syntactic sugar.

names make them more self-documenting, but at the cost of verbosity, compiler complexity, and trouble crossing assembly boundaries.

System.ValueTuple<...>

- same as System.Tuple<...>, but a value type
- implement it for 0-8 generic parameters (or more)
- as a type, (T0,T1) becomes System.ValueTuple<T0,T1>
- as an l value, (a,b) = x; becomes var newSymbol = x; a = newSymbol.Item1; b = newSymbol.Item2;
- as an r value, (a,b) becomes ValueTuple.Create(a,b)
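
The translation rules above can be written out by hand as a rough sketch. It assumes the `System.ValueTuple<T1, T2>` struct and the `ValueTuple.Create` helper as they later shipped in .NET. The class `ValueTupleLowering` and its methods are hypothetical names for illustration only, and this is not a description of what the compiler actually emits.

```csharp
using System;

public static class ValueTupleLowering
{
    // As a type: (int, string) written out as ValueTuple<int, string>.
    public static ValueTuple<int, string> MakePair(int a, string b)
    {
        // As an r value: (a, b) written out as ValueTuple.Create(a, b).
        return ValueTuple.Create(a, b);
    }

    public static string Unpack(ValueTuple<int, string> x)
    {
        int a;
        string b;

        // As an l value: (a, b) = x; written out step by step.
        var newSymbol = x;
        a = newSymbol.Item1;
        b = newSymbol.Item2;

        return a + "/" + b;
    }
}
```

Because `ValueTuple<int, string>` is a struct, neither `MakePair` nor the copy into `newSymbol` allocates on the heap, which is the motivation given for making tuples value types.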

for splatting: it might make sense to have explicit syntax for splatting. so, var x = (a,b); foo(x) doesn't get any special treatment, but var x = (a,b); foo(@x) would get splatted.

aL3891 commented 9 years ago

I like this but i'd also like the syntax to be more similar to anon types, like using { string Name, int Age} as you suggest. Or maybe that's the distinction, {...} = class, (...) = struct/tuple

Also, how about having a keyword that lets the compiler infer the return type? That would allow you to return an anon type and that would then be a tuple (class/struct discussion aside)

Consider this

public magicKeyword GetFruit(){
   return new {Name = "Banana", Color = "Yellow"};
}

This would also fit in very well with LINQ; I've quite often found myself wanting to return something that selects an anon type, and to me this is very similar to the tuples we're talking about here, at least conceptually.

Maybe to avoid language collisions we could have

public ! GetFruit(){
   return new {Name = "Banana", Color = "Yellow"};
}

since it's an upside down "i" as in inferred :) or maybe

public ? GetFruit(){
   return new {Name = "Banana", Color = "Yellow"};
}

or

public <> GetFruit(){
   return new {Name = "Banana", Color = "Yellow"};
}

Maybe that's most in line with the rest of the language, like when you describe a generic type without a type argument, typeof(List<>). I'm sorry if this is going off topic for this proposal, but it feels relevant to me :)

UnstableMutex commented 9 years ago

Where can I vote something like "GOD no microsoft dont do that please"?

iam3yal commented 9 years ago

@UnstableMutex, when you would have a valid reason and a constructive comment then maybe someone will listen, besides, no one is forcing you to use it, heck, you can even remove it yourself! it's open source, you know...

So I'll go ahead and ask you why in the world you wouldn't want it?

kbirger commented 9 years ago

While I am happy to see a language I love growing, I am concerned that it seems to be growing in all different directions. Do we really need to add more syntax, making the language harder to read and optimize?

In my experience, if you group something once, you'll end up grouping it again. If you group it more than once, you should declare a structure or class. We are, after all, working in an object-oriented language. Following that pattern, you will be able to write good code that requires less maintenance later when you want to add more values, or to add a method. It adds more context and transportability.

Not to be melodramatic, but we are on our way to becoming PERL with proposals like this. PERL also has lots of special syntax for writing various minute things succinctly and "elegantly".

@eyalesh, while I agree with you that @UnstableMutex was not being constructive, "why not?" is also not a constructive comment, nor is it a justification for the need to add this.

C# is not Ruby. C# does not function the same way internally. If you prefer Ruby, that's fine! It's a fun language, and I'm sure you could switch over and work primarily in Ruby.

While nobody is forced to USE features, we are forced to deal with code written by others using these features for bad reasons.

iam3yal commented 9 years ago

@kbirger, You're focusing too much on the syntax (that you have to learn and read) and on the fact that people may abuse it, but what about the important issues like maintenance? productivity? or actually making the code more readable? are these not real concerns? for the two minor points you made I made three major ones.

I doubt people want to write objects that are not part of the actual domain of the software, not to mention maintaining such code! developers don't want to create temporary objects just to hold values or make unreadable arrays or tuples!

Just because you're using an object-oriented language doesn't mean you need to have primitive features in the language, by your own definition let's not use any of the features and keep it as primitive as possible!

Still, I'll correct you, C# is not just an object-oriented language but a multi-paradigm language, meaning, it supports more concepts of programming style.

You need to do the math but the more primitive you are, you need to work more, maintain more and read more code! just like in math x^2 = x * x, now raise the power and work harder writing these multiplications... I prefer using the short version and understand the long one.

I don't get this argument where developers are amazed that they actually need to learn new things and new concepts, so let me break it to you as a developer, your second job is to expand your horizons.

I don't program in Ruby and I have no intention to do so, among many programming languages that I use, there are 5 that I really love and C# is on the top of the list, so I want nothing but for the language to be better and improve.

Don't be so focused on your own needs, some people have other needs and they are real, just like yours.

UnstableMutex commented 9 years ago

Guys, sorry for my non-constructive comment. I have details to add to @kbirger's message; I'll write them up.

vas6ili commented 9 years ago

@kbirger Ruby is not the only language that supports tuples or named tuples. Quite a lot of newer languages have built-in tuple syntax and some form of pattern matching or destructuring. Besides, F# functions the same way as C# internally (Common Language Infrastructure) and nobody complains about tuple support in F#.

In my opinion, being able to group related data into structures without giving it a name is a big benefit. When writing complex enterprise business logic, I am always struggling to find appropriate names for simple tuple-like classes that are used internally only in a few places. The alternative of using Tuple's Item1... ItemN properties doesn't make my code more readable.

Additionally, since scripting is finally coming to C#, tuples will only improve the experience of it. The REPL becomes less useful if you have to write a lot of boilerplate declaration code.

In some situations this feature can even enable useful utility functions. Quite often I find myself writing code like below:

var fooTask = GetFooAsync();
var barTask = GetBarAsync();
await Task.WhenAll(fooTask, barTask);
// using fooTask.Result and barTask.Result multiple times creates more noise

in contrast to a C# version with tuple support:

// foo and bar are not Task<> instances
(var foo, var bar) = await TaskEx.WhenAll(GetFooAsync(), GetBarAsync());

// implementation of TaskEx.WhenAll
public static async Task<(T1, T2)> WhenAll<T1, T2>(Task<T1> task1, Task<T2> task2)
{
    await Task.WhenAll(task1, task2);
    return (task1.Result, task2.Result);
}

UnstableMutex commented 9 years ago

@vas6ili I think being able to group related data into structures without giving it a name is a huge mistake. What if you break the SRP? Are you sure that you need one object and not two or more? If I can't think of a name for a class, I consider that I need some refactoring... Yes, I don't use the Tuple class and I don't use anonymous types, because if I use Tuple<string, string>, for example, after a few months I don't want to have to remember which string is Item1 and which is Item2.

UnstableMutex commented 9 years ago

Another reason to avoid tuples is the lack of XML documentation comments on them.

UnstableMutex commented 9 years ago

I wrote how; I'll repeat: if you can't name a class, you probably need to divide it into a few smaller classes.

gafter commented 9 years ago

@eyalesh Please keep it civil.

gafter commented 9 years ago

@eyalesh We're here to do the right thing for users, no matter what the motives. Questioning motives is only one form of ad hominem argument you used; ad hominem attacks are unwelcome.

iam3yal commented 9 years ago

@gafter, It was never my intention but each person is entitled to his own opinions so you can think whatever you like, best regards.

p.s. I've deleted the irrelevant posts.

vas6ili commented 9 years ago

@UnstableMutex In some situations it is a bad design, in others not. I don't think library developers will start replacing classes with named tuples if this feature gets into c# vnext. Sometimes the name of the class is defined by its shape and structure and I'd rather have (string customerName, int nrOfVisits) than a CustomerNameAndNrOfVisits class. Especially in application code, I find myself creating a lot of these simple record-like classes and having built-in support helps a lot. The abundance of these kinds of classes only makes it more difficult to browse and understand a new project. Of course, like any feature, it will be abused, but this is not a sufficient reason for not having it.

"are you sure that you need one object not two or more?"

Adding a new item to the tuple will actually force me to change every calling site which might be beneficial. A new class member might be overlooked without some "Find All References" help in Visual Studio.

"I wrote how; I'll repeat: if you can't name a class, you probably need to divide it into a few smaller classes."

I don't see how splitting into smaller classes is a solution when I have to return 2 simple values

UnstableMutex commented 9 years ago

@vas6ili "In some situations it is a bad design, in others not." Agreed. "I don't see how splitting into smaller classes is a solution when I have to return 2 simple values." Agreed, if they are simple values. But if the class name follows the pattern somethingANDotherthing, I will always suspect that it breaks the SRP. In the customer example I think, in general: class Customer {string Name} and class VisitInfo {Customer, Visitnumber} (what about adding a surname to the customer? what about adding a patronymic, gender, or casing in the future?). These two classes would be better. But I may be mistaken there, since I don't know the BL. Again, if I'm about to name a class somethingANDotherthing, I think twice.

iam3yal commented 9 years ago

@UnstableMutex, The name of the class doesn't violate SRP per se, what violates it is when you design a class that has a well defined responsibility and after a while it no longer matches its original design and does more than it was intended for.

An SRP violation is the product of a design and implementation mismatch, but this discussion is about having a feature that will allow you to reduce the number of classes you implement, especially temporary ones and private classes that do not represent anything useful but carry data.

Now, imagine you have some tiny private classes with some variations that carry data and are used by the same class, naming them differently can be troublesome and annoying, it can also make some APIs quite ugly if you need to expose these classes.

UnstableMutex commented 9 years ago

@eyalesh Yes, the name of the class doesn't violate the SRP, but a name like that suggests the class may violate it. I don't know of any reason to reduce the number of classes in general. A class is a unit of OOP, like a variable, property, or method. Do you often reduce the number of variables? But I think we're going off topic. About the feature itself: if it is implemented, I want the ability to quickly refactor a tuple into a class... even better if it were behind an "experimental" checkbox and/or a red warning: !!!are you sure?!!!