dotnet / roslyn

Proposal: Language support for Tuples #347

Closed by MadsTorgersen 7 years ago

MadsTorgersen commented 9 years ago

There are many scenarios where you'd like to group a set of typed values temporarily, without the grouping itself warranting a "concept" or type name of its own.

Other languages use variations on the notion of tuples for this. Maybe C# should too.

This proposal follows up on #98 and addresses #102 and #307.

Background

The most common situation where values need to be temporarily grouped, a list of arguments to (e.g.) a method, has syntactic support in C#. However, what is probably the second-most common, a list of results, does not.

While there are many situations where tuple support could be useful, the most prevalent by far is the ability to return multiple values from an operation.

Your options today include:

Out parameters:


public void Tally(IEnumerable<int> values, out int sum, out int count) { ... }

int s, c;
Tally(myValues, out s, out c);
Console.WriteLine($"Sum: {s}, count: {c}");  

This approach cannot be used for async methods, and it is also rather painful to consume, requiring variables to be first declared (and var is not an option), then passed as out parameters in a separate statement, then consumed.

On the bright side, because the results are out parameters, they have names, which help indicate which is which.

System.Tuple:

public Tuple<int, int> Tally(IEnumerable<int> values) { ... }

var t = Tally(myValues);
Console.WriteLine($"Sum: {t.Item1}, count: {t.Item2}");  

This works for async methods (you could return Task<Tuple<int, int>>), and you only need two statements to consume it. On the downside, the consuming code is perfectly obscure - there is nothing to indicate that you are talking about a sum and a count. Finally, there's a cost to allocating the Tuple object.

Declared transport type

public struct TallyResult { public int Sum; public int Count; }
public TallyResult Tally(IEnumerable<int> values) { ... }

var t = Tally(myValues);
Console.WriteLine($"Sum: {t.Sum}, count: {t.Count}");  

This has by far the best consumption experience. It works for async methods, the resulting struct has meaningful field names, and being a struct, it doesn't require heap allocation - it is essentially passed on the stack in the same way as the argument list to a method.

The downside of course is the need to declare the transport type. The declaration is meaningless overhead in itself, and since it doesn't represent a clear concept, it is hard to give it a meaningful name. You can name it after the operation that returns it (like I did above), but then you cannot reuse it for other operations.

Tuple syntax

If the most common use case is multiple results, it seems reasonable to strive for symmetry with parameter lists and argument lists. If you can squint and see "things going in" and "things coming out" as two sides of the same coin, then that seems to be a good sign that the feature is well integrated into the existing language, and may in fact improve the symmetry instead of (or at least in addition to) adding conceptual weight.

Tuple types

Tuple types would be introduced with syntax very similar to a parameter list:

public (int sum, int count) Tally(IEnumerable<int> values) { ... }

var t = Tally(myValues);
Console.WriteLine($"Sum: {t.sum}, count: {t.count}");  

The syntax (int sum, int count) indicates an anonymous struct type with public fields of the given names and types.

Note that this is different from some notions of tuple, where the members are not given names but only positions. This is a common complaint, though, essentially degrading the consumption scenario to that of System.Tuple above. For full usefulness, tuple members need to have names.

This is fully compatible with async:

public async Task<(int sum, int count)> TallyAsync(IEnumerable<int> values) { ... }

var t = await TallyAsync(myValues);
Console.WriteLine($"Sum: {t.sum}, count: {t.count}");  

Tuple literals

With no further syntax additions to C#, tuple values could be created as

var t = new (int sum, int count) { sum = 0, count = 0 };

Of course that's not very convenient. We should have a syntax for tuple literals, and given the principle above it should closely mirror that of argument lists.

Creating a tuple value of a known target type should enable leaving out the member names:

public (int sum, int count) Tally(IEnumerable<int> values) 
{
    var s = 0; var c = 0;
    foreach (var value in values) { s += value; c++; }
    return (s, c); // target typed to (int sum, int count)
}

Using named arguments as a syntax analogy it may also be possible to give the names of the tuple fields directly in the literal:

public (int sum, int count) Tally(IEnumerable<int> values) 
{
    var res = (sum: 0, count: 0); // infer tuple type from names and values
    foreach (var value in values) { res.sum += value; res.count++; }
    return res;
}

Which syntax you use would depend on whether the context provides a target type.

Tuple deconstruction

Since the grouping represented by tuples is most often "accidental", the consumer of a tuple is likely not to want to even think of the tuple as a "thing". Instead they want to immediately get at the components of it. Just like you don't first bundle up the arguments to a method into an object and then send the bundle off, you wouldn't want to first receive a bundle of values back from a call and then pick out the pieces.

Languages with tuple features typically use a deconstruction syntax to receive and "split out" a tuple in one fell swoop:

(var sum, var count) = Tally(myValues); // deconstruct result
Console.WriteLine($"Sum: {sum}, count: {count}");  

This way there's no evidence in the code that a tuple ever existed.
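
The pieces described so far (a tuple return type, a tuple literal with names, and deconstruction) can be sketched together as follows. This is a minimal sketch that assumes a compiler accepting the proposed syntax, which C# 7.0 and later do (the async Main needs C# 7.1); the class name TallyDemo and the sample values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TallyDemo
{
    // Tuple return type with named members, as proposed.
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        var res = (sum: 0, count: 0);   // tuple literal; type inferred from names and values
        foreach (var value in values) { res.sum += value; res.count++; }
        return res;
    }

    // The same tuple type used as the result of a Task-returning method.
    public static Task<(int sum, int count)> TallyAsync(IEnumerable<int> values)
        => Task.FromResult(Tally(values));

    public static async Task Main()
    {
        var myValues = new[] { 1, 2, 3, 4 };

        (var sum, var count) = Tally(myValues);   // deconstruct into fresh variables
        Console.WriteLine($"Sum: {sum}, count: {count}");

        var t = await TallyAsync(myValues);       // or keep the tuple and use its names
        Console.WriteLine($"Sum: {t.sum}, count: {t.count}");
    }
}
```

Both calls report a sum of 10 and a count of 4 for the sample input.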

Details

That's the general gist of the proposal. Here are a ton of details to think through in the design process.

Struct or class

As mentioned, I propose to make tuple types structs rather than classes, so that no allocation penalty is associated with them. They should be as lightweight as possible.

Arguably, structs can end up being more costly, because assignment copies a bigger value. So if they are assigned a lot more than they are created, then structs would be a bad choice.

In their very motivation, though, tuples are ephemeral. You would use them when the parts are more important than the whole. So the common pattern would be to construct, return and immediately deconstruct them. In this situation structs are clearly preferable.

Structs also have a number of other benefits, which will become obvious in the following.

Mutability

Should tuples be mutable or immutable? The nice thing about them being structs is that the user can choose. If a reference to the tuple is readonly then the tuple is readonly.

Now a local variable cannot be readonly, unless we adopt #115 (which is likely), but that isn't too big of a deal, because locals are only used locally, and so it is easier to stick to an immutable discipline if you so choose.

If tuples are used as fields, then those fields can be readonly if desired.

Value semantics

Structs have built-in value semantics: Equals and GetHashCode are automatically implemented in terms of the struct's fields. This isn't always very efficiently implemented, so we should make sure that the compiler-generated struct does this efficiently where the runtime doesn't.
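
To make the efficiency concern concrete: the default ValueType.Equals can fall back to reflection over the fields, whereas a generated struct could compare its fields directly. Below is a minimal hand-written sketch of what such a struct might look like. The name SumCount and the hash-combining constant are illustrative assumptions, not part of the proposal.

```csharp
using System;

public struct SumCount : IEquatable<SumCount>
{
    public int Sum;
    public int Count;

    public SumCount(int sum, int count) { Sum = sum; Count = count; }

    // Field-by-field comparison, with no boxing and no reflection.
    public bool Equals(SumCount other) => Sum == other.Sum && Count == other.Count;

    public override bool Equals(object obj) => obj is SumCount other && Equals(other);

    // Combine the field hashes; 397 is an arbitrary odd multiplier.
    public override int GetHashCode() => unchecked((Sum * 397) ^ Count);
}

public static class ValueSemanticsDemo
{
    public static void Main()
    {
        var a = new SumCount(10, 4);
        var b = new SumCount(10, 4);
        Console.WriteLine(a.Equals(b));                         // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode());  // True
    }
}
```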

Tuples as fields

While multiple results may be the most common usage, you can certainly imagine tuples showing up as part of the state of objects. A particularly common case might be where generics are involved, and you want to pass a compound of values for one of the type parameters. Think dictionaries with multiple keys and/or multiple values, etc.

Care needs to be taken with mutable structs in the heap: if multiple threads can mutate, tearing can happen.

Conversions

On top of the member-wise conversions implied by target typing, we can certainly allow implicit conversions between tuple types themselves.

Specifically, covariance seems straightforward, because the tuples are value types: As long as each member of the assigned tuple is assignable to the type of the corresponding member of the receiving tuple, things should be good.

You could imagine going a step further, and allowing pointwise conversions between tuples regardless of the member names, as long as the arity and types line up. If you want to "reinterpret" a tuple, why shouldn't you be allowed to? Essentially the view would be that assignment from tuple to tuple is just memberwise assignment by position.

(double sum, long count) weaken = Tally(...); // why not?
(int s, int c) rename = Tally(...); // why not?

Unification across assemblies

One big question is whether tuple types should unify across assemblies. Currently, compiler generated types don't. As a matter of fact, anonymous types are deliberately kept assembly-local by limitations in the language, such as the fact that there's no type syntax for them!

It might seem obvious that there should be unification of tuple types across assemblies - i.e. that (int sum, int count) is the same type when it occurs in assembly A and assembly B. However, given that structs aren't expected to be passed around much, you can certainly imagine them still being useful without that.

Even so, it would probably come as a surprise to developers if there was no interoperability between tuples across assembly boundaries. This may range from having implicit conversions between them, supported by the compiler, to having a true unification supported by the runtime, or implemented with very clever tricks. Such tricks might lead to a less straightforward layout in metadata (such as carrying the tuple member names in separate attributes instead of as actual member names on the generated struct).

This needs further investigation. What would it take to implement tuple unification? Is it worth the price? Are tuples worth doing without it?

Deconstruction and declaration

There's a design issue around whether deconstruction syntax is only for declaring new variables for tuple components, or whether it can be used with existing variables:

(var sum, var count) = Tally(myValues); // deconstruct into fresh variables
(sum, count) = Tally(otherValues); // deconstruct into existing variables?

In other words is the form (_, _, _) = e; a declaration statement, an assignment expression, or something in between?

This discussion intersects meaningfully with #254, declaration expressions.

Relationship with anonymous types

Since tuples would be compiler generated types just like anonymous types are today, it's useful to consider rationalizing the two with each other as much as possible. With tuples being structs and anonymous types being classes, they won't completely unify, but they could be very similar. Specifically, anonymous types could pick up these properties from tuples:

Once in the language, there are additional conveniences that you can imagine adding for tuples.

Tuple members in scope in method body

One (the only?) nice aspect of out parameters is that no returning is needed from the method body - they are just assigned to. For the case where a tuple type occurs as a return type of a method you could imagine a similar shortcut:

public (int sum, int count) Tally(IEnumerable<int> values) 
{
    sum = 0; count = 0;
    foreach (var value in values) { sum += value; count++; }
}

Just like parameters, the names of the tuple are in scope in the method body, and just like out parameters, the only requirement is that they be definitely assigned at the end of the method.

This is taking the parameter-result analogy one step further. However, it would special-case the tuples-for-multiple-returns scenario over other tuple scenarios, and it would also preclude seeing in one place what gets returned.

Splatting

If a method expects n arguments, we could allow a suitable n-tuple to be passed to it. Just like with params arrays, we would first check if there's a method that takes the tuple directly, and otherwise we would try again with the tuple's members as individual arguments:

public double Avg(int sum, int count) => count == 0 ? 0 : (double)sum / count; // cast avoids integer division

Console.WriteLine($"Avg: {Avg(Tally(myValues))}");

Here, Tally returns a tuple of type (int sum, int count) that gets splatted to the two arguments to Avg.

Conversely, if a method expects a tuple we could allow it to be called with individual arguments, having the compiler automatically assemble them to a tuple, provided that no overload was applicable to the individual arguments.

I doubt that a method would commonly be declared directly to just take a tuple. But it may be a method on a generic type that gets instantiated with a tuple type:

var list = new List<(string name, int age)>();
list.Add("John Doe", 66); // "unsplatting" to a tuple

There are probably a lot of details to figure out with the splatting and unsplatting rules.
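
Without language support, the effect of splatting and unsplatting can be approximated by hand. The sketch below assumes tuple syntax as proposed (it compiles with C# 7.0 and later); the SplatDemo class and the forwarding Avg overload are hypothetical and only illustrate what the compiler would otherwise do for you.

```csharp
using System;
using System.Collections.Generic;

public static class SplatDemo
{
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        int s = 0, c = 0;
        foreach (var value in values) { s += value; c++; }
        return (s, c);
    }

    public static double Avg(int sum, int count) => count == 0 ? 0 : (double)sum / count;

    // Hand-written "splat": an overload that takes the tuple and forwards its members.
    public static double Avg((int sum, int count) t) => Avg(t.sum, t.count);

    public static void Main()
    {
        var myValues = new[] { 1, 2, 3, 4 };
        Console.WriteLine($"Avg: {Avg(Tally(myValues))}");

        // Hand-written "unsplat": build the tuple explicitly with an extra pair of parentheses.
        var list = new List<(string name, int age)>();
        list.Add(("John Doe", 66));
        Console.WriteLine($"{list[0].name} is {list[0].age}");
    }
}
```

The proposed rules would make both the forwarding overload and the extra parentheses unnecessary.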

gafter commented 9 years ago

There are many ways to skin this cat. Today you return multiple values this way

    T Mode<T>(this IEnumerable<T> data, out int occurrences) { ... }
...
    int occurrences;
    var mostFrequent = arguments.Mode(out occurrences);

The most annoying thing about this is that you do not stay in "expression mode" but have to use statements, which may interrupt the flow of your computation.

With tuples you can return multiple values this way

    (T mode, int occurrences) Mode<T>(this IEnumerable<T> data) { ... }
...
    (string mostFrequent, int occurrences) = arguments.Mode();

but maybe we'd allow you to use that invocation syntax for the existing method in the first example, above. It isn't clear if the last line is a statement form or an expression (technically, expression-statement).
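
A runnable sketch of the tuple form, assuming the proposed syntax (accepted by C# 7.0 and later). The GroupBy-based implementation and the names ModeExtensions and ModeDemo are illustrative assumptions; the comment above leaves the method body elided.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ModeExtensions
{
    // Returns the most frequent element and how often it occurs.
    // Throws InvalidOperationException for an empty sequence (from First()).
    public static (T mode, int occurrences) Mode<T>(this IEnumerable<T> data)
    {
        var best = data.GroupBy(x => x)
                       .OrderByDescending(g => g.Count())
                       .First();
        return (best.Key, best.Count());
    }
}

public static class ModeDemo
{
    public static void Main()
    {
        var arguments = new[] { "a", "b", "a", "c", "a", "b" };
        (string mostFrequent, int occurrences) = arguments.Mode();
        Console.WriteLine($"{mostFrequent} x{occurrences}");   // prints "a x3"
    }
}
```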

With records and pattern matching you'd do it something like this

    class ModeAndOccurrences<T>(T Mode, int Occurrences);
    ModeAndOccurrences<T> Mode<T>(this IEnumerable<T> data) { ... }
...
    if (arguments.Mode() is ModeAndOccurrences<T>(string mostFrequent, int occurrences)) ...

Perhaps you could use pattern matching with the tuple form

    (T mode, int occurrences) Mode<T>(this IEnumerable<T> data) { ... }
...
    if (arguments.Mode() is (string mostFrequent, int occurrences)) ...

EamonNerbonne commented 9 years ago

TL;DR: Conversions - be careful

There's lots to like here, but there's one specific "why not?" this proposal asks that has a good answer I'd like to provide.

(double sum, long count) weaken = Tally(...); // why not?
(int s, int c) rename = Tally(...); // why not?

In particular that second line is not a good idea. One of the strengths of this proposal is that tuple members are named, and that's excellent for type safety. After all:

(int x, int y) rename = Reposition(...)
(int row, int col) rename = Reposition(...)

Typically, y would be equivalent to row and x to col, yet this code implicitly and accidentally reverses their order. This is a really nasty bug, and it's one that can easily be introduced by a refactoring.

Even the original Tally example suffers from this problem:

(int sum, int count) rename = Tally(...) // why not?
(int count, int sum) rename = Tally(...) // why not?

There's something to be said for the view that the original ordering (sum, count) isn't as good as the ordering (count, sum), since you might view these as sums of successive powers of the values - power 0 gives the count, and power 1 gives the sum.

In any case, once you introduce a rename like this, any change to the names in the right-hand-side tuple silently causes semantic errors that are detected neither at compile time nor at runtime.

For this reason I think that conversions should not be able to implicitly change member names - any name-changing conversions should explicitly need to mention the original name.
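
The hazard can be reproduced with the tuple support that later shipped in C# 7.0, which converts between tuple types by position regardless of member names. This is a minimal sketch; the class name RenameHazard and the sample values are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class RenameHazard
{
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        int s = 0, c = 0;
        foreach (var value in values) { s += value; c++; }
        return (s, c);
    }

    public static void Main()
    {
        var values = new[] { 10, 20, 30 };

        // The names are swapped relative to Tally's declaration, but the
        // assignment is positional, so "count" receives the sum and vice versa.
        (int count, int sum) swapped = Tally(values);
        Console.WriteLine($"count: {swapped.count}, sum: {swapped.sum}");   // prints "count: 60, sum: 3"
    }
}
```

Requiring a name-changing conversion to mention the original names, as suggested above, would turn this into a compile-time error.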

iam3yal commented 9 years ago

@EamonNerbonne, good point! ;)

kbirger commented 9 years ago

@eyalesh you made one great point, and it seals my opinion in contradiction.

Understanding. The more "STUFF" we put into a language, the less of the language the average programmer will understand. There is a cost and benefit to adding a feature.

Delegates were a fantastic add, but most people you talk to will not understand how they are implemented, how delegate caching and compilation works, and the slight performance penalty associated with executing a delegate vs a virtual method, vs a non-virtual method.

Delegates let you do things you could not easily do before without making mountains of difficult-to-maintain code (Single-use interface implementations, etc)

Adding something which generates code in order to save you from writing grouping classes seems much lower on the benefit curve, but almost just the same on the cost curve, because you're generating classes, or structs, that will do hidden boxing and unboxing and other non-obvious operations.

multi-paradigm language

C# is an object oriented language with elements from other paradigms in it. Adding bells and whistles does not change the core. The core of the engine is types. I would even go so far as to say that it is a TYPE oriented language, where JavaScript is an object oriented language. Python is object oriented as well, with a type system that was added later. We don't need to add every feature from every language into C# just because in some cases it would make our lives more convenient in the short term.

@EamonNerbonne Makes a good point indeed. This problem is easily solved by using a struct. It's explicit, portable, and readable. It makes your code extensible and future-proof. (You can add methods, change it to a class, add extension methods, abstract it, etc). Code evolves.

This sort of feature is cute and it has its uses in interpreted languages where it is organic and mostly "free", but I simply do not see enough value for the cost of adding yet another convenience feature that generates code.

iam3yal commented 9 years ago

@kbirger, I understand the cost but in the same sense this feature can make our life easier!

You see, just because "some" developers don't know what's a delegate not to mention what's a callback or pointer to a function doesn't mean it isn't useful for many others.

You're saying that delegates are great and then go on to say that "Adding something which generates code in order to save you from writing grouping classes seems much lower on the benefit curve". You do realize that a delegate is just a class wrapping a function? Without them we could achieve exactly the same thing with classes; it just requires a bit of extra work.

I don't know anything about hidden Unboxing and Boxing, it depends on many things.

You can't say that the language is object-oriented and then say that it contains elements from other paradigms, because that's just like saying it's multi-paradigm, but I'll let you read it in the official documentation.

Python and JavaScript, like C#, are multi-paradigm languages! C#: https://msdn.microsoft.com/en-us/vstudio/hh341490.aspx Python: https://docs.python.org/2/howto/functional.html JavaScript: https://en.wikipedia.org/wiki/ECMAScript (notice the paradigms on the side? Object-Oriented is not one of them.) Whether it's Object-Oriented is very controversial, although ECMAScript 6 is going to fix that, I guess.

Unlike the languages above, Smalltalk is an Object-Oriented language in the true sense of the word, but in today's world you can't do much with a single programming paradigm.

A language is either statically typed or dynamically typed; whether it's Object-Oriented is a different story, so I'm not sure about a "Type Oriented" language. I've never heard this term before.

Again, I think it's a great feature but I guess you have a different opinion about it. :)

AZBob commented 9 years ago

@EamonNerbonne I think your point is moot because it's no different than a function declaration. When you call a function, the parameters aren't named (usually, though of course you can call functions with named parameters these days), and can be just as easily switched by mistake. For example:

public int MyFunction(int stuff, int otherStuff) { ... }

The call:

MyFunction(var1, var2);

Var1 and Var2 can be switched very easily, and yet people seem to be able to mostly avoid making that mistake, so I doubt this would be any different.

I reiterate my request, nay, plea, that this feature, if implemented, NOT name the parameters in the declaration and instead derive the values by POSITION from the return statement within the body of the function. Preferably, without explicitly setting the types of the return values in the function declaration, since they're not necessary, either. Again, for example:

private (,) MyFunction(...) { 
...
   return 23, "Stuff";
}

or, if absolutely necessary, type the return parameters, but leave out the names:

 private int, string MyFunction(...) {
 ...
    return 23, "Stuff";
    // Or, as would most likely be the case
    // return var1, var2;
}

In this case, I don't believe the parens are necessary since the comma will make it obvious what's going on to the parser. However, an argument could be made that parens would make it more consistent with the function's parameter list. I think this syntax, though, makes it more consistent with a single value return function, which is higher priority, IMO.

IMO, there's absolutely no reason whatsoever to name the return parameters. Further, the call to either of these would be:

var x, y = MyFunction(...);

Types will be inferred from the return statement in the body or, if necessary, from the declaration (if that route is chosen). On the call, I don't think there's any reason for parens, nor any real reason to declare the types (since, as I said, that can be inferred deterministically), so var can be used, and instead of doing var, var, var, a single var should suffice.

As long as this feature is being considered, which is really about easing programmer workload IMO, as much of the syntax as possible should be simple and as consistent as possible with existing constructs. They should strive to eliminate all unnecessary syntax (parens, names, etc.).

iam3yal commented 9 years ago

@AZBob, Yeah, I didn't like the fact that they decided, or are still deciding, whether to go with named tuples, but I'd take this over nothing! :)

I really like the following syntax you posted:

 private int, string MyFunction(...) {
    ...
    return 23, "Stuff";
    // Or, as would most likely be the case
    // return var1, var2;
}

taylorjonl commented 9 years ago

@AZBob, this is not valid syntax and is confusing:

var x, y = MyFunction(...);

You can only declare variables in this comma delimited way if they are of the same type, e.g.

int x, y;

The 'var' keyword is only compiler magic so you don't have to type out 'int' since it can be inferred. So unless they start allowing this:

int x, long y;

I think the syntax you provide will be confusing.

I personally don't like this feature in the context of returning values from a function. I would prefer that instead these are two separate features. I like the idea of using compiler magic to make 'out' parameters have a cleaner syntax for returning multiple values from a function, so instead of using this:

int MyFunction(out int v2, out int v3)
{
    v2 = 2;
    v3 = 3;
    return 1;
}
int v1, v2, v3;
v1 = MyFunction(out v2, out v3);

You would use this:

(int, int, int) MyFunction()
{
    return 1, 2, 3;
}
int v1, v2, v3;
v1, v2, v3 = MyFunction();

iam3yal commented 9 years ago

@taylorjonl, obviously it's not a valid syntax, it's a proposal...

A similar syntax exists in Lua and it's very straightforward and not confusing.

They can reuse the var keyword if they wish; I don't think it would ambiguously match something else. They could also decide to just use 'var x = MyFunc(...)' and have the compiler generate an anonymous type, much like they are planning to do in this very proposal, so that var would infer this anonymous type. But they decided to go with a different syntax, probably to support named tuples and kill two birds at the same time.

The out/ref parameter was never meant to be used to return multiple values, and there are cases where you can't actually use it, so abusing a feature to get multiple return values and having the compiler do some magic is a really bad idea.

taylorjonl commented 9 years ago

@eyalesh, how is it not meant to return multiple values? That is 'out's entire reason for existing. What cases can't you use 'out' to return multiple values?

I understand Lua accepts that syntax but it is not C#-ish. I would hate to have this language turn into some combination of every language out there. If this proposal goes forth with the proposed syntax they would have to allow:

int v1, long v2, string v3 = MyFunction(...)

Or they have to say you have to use 'var', which is not a good exception just for this use case in my opinion.

aluanhaddad commented 9 years ago

Personally, I would prefer the syntax

var (x, y) = GetCoordinates();

@AZBob I disagree, I think having the returned tuple's components be named at the declaration site is a very good idea. While you are correct that people rarely make the mistake of passing the arguments to a function in the wrong order, they can always look at the function declaration if there is any ambiguity.

iam3yal commented 9 years ago

@taylorjonl, The question is not about whether you can but whether you should...

I don't get the not C#-ish part... really.

Seems like annotating the variables with the actual return type is counterproductive, tedious and I'd argue that it's even error-prone.

In fact, you know what? I'd go with it and say let the types be there for those who want them, but let us have a succinct syntax for multiple return values, and var is a great candidate.

'out' exists to remove the burden of initializing the variable before the function call. Yes, you can abuse it, and yes, the docs say it's useful for returning multiple values.

The reason they added 'ref/out' is mostly for interoperability; it's used heavily in COM and this was probably the main reason for having them.

Out has too many drawbacks: you can't use it with async, can't use it with yield and you can't pass properties directly to it!

Sometimes you just want to pass something and mutate the copy for whatever reason, especially when it comes to value types.

Last but not least out is really ugly even though there was/is a proposal to make it nicer #254.

glen-84 commented 9 years ago

Does this proposal cover array destructuring? (examples here)

I also like the simplified syntax for array literals.

ghost commented 9 years ago

Could either tuples or anonymous classes cast to property-only interfaces as in TypeScript?

This would make it possible to get rid of unnecessary classes in many cases.

kbirger commented 9 years ago

It seems like this would require you to write conversions through operator overloading. It seems unclean to have to do a runtime check on whether or not the target type is "property-only", moreover you would still have to execute the constructor, so you wouldn't be able to guarantee that two objects created via ctor(x,y) are identical.

However, if we think of this not as casting but as creation such that tuple (x,y) is cast to type Foo where Foo has constructor Foo(x,y), then perhaps the conversion could be performed that way, and you would simply get the standard "Type Foo does not contain a constructor with specified arguments" message if you try to cast the wrong tuple to Foo.

Though my gut says that if you're considering this, then you should just go with real classes anyway. (See points above)

kevinhector commented 9 years ago

Possibly a first-time response, but a full specification of the return type in the signature makes it hard to read. Just to throw something out there, maybe generic constraint syntax could be employed, really just as an aliasing mechanism. For example:

public R Tally(IEnumerable<int> values)
    where R : { int sum, int count }
{
    var s = 0; var c = 0;
    foreach (var value in values) { s += value; c++; }
    return new R { sum = s, count = c };
}

Maybe the language designers would prefer a reserved symbol for the return type to distinguish from type parameters (or something like R{}) but this may not be necessary:

public R FirstLast<T>(IEnumerable<T> source)
    where R : { T first, T last }
{
    ...
}

But most importantly we need to settle on the correct pronunciation of Tuple. I am firmly in the "oo" camp.

HaloFour commented 9 years ago

@kevinhector

What, you don't pronounce it "toe-play" (ˈtō-ˌplā)? You weirdo.

Mixing it with the generic syntax, especially when it's not related to generics, doesn't seem right to me. I'd rather have the expanded (and frankly unattractive) syntax because maybe that will dissuade people from using them haphazardly instead of defining proper public types.

gafter commented 9 years ago

When there are two I call it a Twople.

MgSam commented 9 years ago

I've always called it "row-like thingy".

kevinhector commented 9 years ago

@HaloFour

If it makes you feel any better we can express this in a new meta syntax which may enlist in the genericness of the method:

public ? BiggestSmallest<T>(IEnumerable<T> source)
    returns : { T biggest, T smallest }
    where T : IComparable<T>
{
    ...
}

Not sure this makes me feel any better, but it is certainly implementable in Roslyn

kevinhector commented 9 years ago

@gafter

When there are three you call it a Fewple?

gafter commented 9 years ago

@kevinhector Twople, Threeple, Fourple, etc.

glen-84 commented 9 years ago

:-1: The return types should be in the same position that they are now. I think that the proposed syntax (from Mads) is much cleaner.

BTW, does anyone know the answer to my question?

gafter commented 9 years ago

@glen-84 No, nobody knows the answer to your question.

GeirGrusom commented 9 years ago

@gafter more than four should be called 'alotple'.

orthoxerox commented 9 years ago

@GeirGrusom 'maniple', how could there be a different option?

On a more serious note, will advanced sugared tuples remove the need for record classes?

HaloFour commented 9 years ago

Record classes would have additional members generated to support pattern matching, right?

I know Oxford would never accept this word, but maybe "multi-ple"?

gafter commented 9 years ago

@orthoxerox Re "will advanced sugared tuples remove the need for record classes"

The tuples we envision would be structs. Records can be structs or classes, can be mutable, can have additional members and implement interfaces, and have a name. The record feature may fulfill some of the use cases that motivated tuples, but there are use cases for either that the other doesn't help with.

svick commented 9 years ago

One thing that would be nice is to allow tuple deconstruction in LINQ queries. That way, we might be able to write code like:

from (x, i) in collection.WithIndexes()
…

Or:

from item in items
let (sum, count) = Tally(item.Values)
…

bleroy commented 9 years ago

It's interesting to compare this proposal with the equivalent ECMAScript 6 feature (and those of other languages, it goes without saying), where deconstruction, for example, extends the syntax for array and object literals:

var [m, d, y] = [3, 14, 1977];
function today() { return { d: 6, m: 2, y: 2013 }; }
var { m: month, y: year } = today(); // month = 2, year = 2013

http://ariya.ofilabs.com/2013/02/es6-and-destructuring-assignment.html

So not exactly tuples here, but similar deconstruction concepts using the existing basic JS types.

Notice how the second case doesn't suffer from the order ambiguity that @EamonNerbonne mentioned above.

Also of note: the "var" keyword is outside of the declaration. In C#, it's of course a little different, as there are cases where you'll want to be explicit about the individual types of the members of the tuple.

alrz commented 9 years ago

It's better to make names optional and just define types, something like (int, int). Then when you return some values you don't have to repeat all that; you just write return (5, 6);. Names would be given only when deconstruction occurs, as in (var item1, var item2) = ..., as you can see in other functional languages.
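
A minimal sketch of this style, written against the tuple syntax that later shipped in C# 7, where element names are indeed optional. `MinMax` and `Spread` are hypothetical names used only for illustration.

```csharp
using System;

public static class UnnamedTupleSketch
{
    // Hypothetical method: the return type lists only element types.
    public static (int, int) MinMax(int a, int b) =>
        a <= b ? (a, b) : (b, a);

    public static int Spread(int a, int b)
    {
        // Names are introduced only at the point of deconstruction.
        var (low, high) = MinMax(a, b);
        return high - low;
    }
}
```

Without deconstruction, the elements of the unnamed tuple remain reachable as `Item1` and `Item2`.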

cordasfilip commented 9 years ago

I have just one question: what about delegates, expressions, and reflection? Could I do this?

Func<IEnumerable<int>, Tuple<int, int>> func = Tally;

How can I get the names of the return values from MethodInfo?

paulomorgado commented 9 years ago

One way would be an attribute attached to the return of the method.

gafter commented 9 years ago

@cordasfilip Our current thinking is that we'd convey tuple names using attributes, similar to the scheme we currently use to distinguish object from dynamic.
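
As a rough illustration of the attribute approach, the sketch below assumes the encoding that eventually shipped with C# 7, where the compiler places `System.Runtime.CompilerServices.TupleElementNamesAttribute` on the return parameter. `Tally` and `ReturnElementNames` are hypothetical names for this example; the thread itself does not fix the attribute's name or shape.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class TupleNameSketch
{
    // Hypothetical method modeled on the proposal's Tally example.
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        int sum = 0, count = 0;
        foreach (var v in values) { sum += v; count++; }
        return (sum, count);
    }

    // Reads element names from the attribute on the method's return parameter.
    // Returns an empty list when the return type carries no tuple names.
    public static IList<string> ReturnElementNames(MethodInfo method)
    {
        var attr = method.ReturnParameter.GetCustomAttribute<TupleElementNamesAttribute>();
        return attr?.TransformNames ?? new List<string>();
    }
}
```

Under the same assumption, a delegate can carry the names in its type arguments, for example `Func<IEnumerable<int>, (int sum, int count)> f = TupleNameSketch.Tally;`.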

paulomorgado commented 9 years ago

:+1:

dsaf commented 9 years ago

Reflection support is awesome, but will it be possible to get the compile-time type information:

var tupleType = typeof((int sum, int count));

It could be applicable in custom attributes:

[MyDefaultValue(typeof((int sum, int count)), "10,15")]
...

I also see value in cross-assembly unification (and of course field naming). One of the scenarios for me would be decoupled flexible messaging (Erlang-style):

public class Actor
{
    public void Handle(Message message)
    {
        // Could be improved by pattern matching I guess...
        // No need for sharing the message type - complete decoupling:
        if (message.Name == "UpdatePriceAndLabel" && message.Value.GetType() == typeof((decimal price, string label)))
        {
            // logic
        }
    }
}

public class Message { public string Name {get; set;} public object Value {get; set;} }

gafter commented 9 years ago

@dsaf

The (runtime) type of (int sum, int count) is the type (int, int).

If you have a tuple value

object t = (1.21m, "foo");

You'll be able to switch on the runtime type of its elements

if (t is (decimal price, string label)) ...

This composes with other pattern forms. I believe the code you want will be something like

if (message is {Name is "UpdatePriceAndLabel", Value is (decimal price, string label)}) ...
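
The pattern syntax above is the draft form under discussion. A minimal sketch of a comparable check, assuming the property patterns that later shipped in C# 8 (which use `:` rather than `is` inside the braces), could look like this. `Describe` and `MessagePatternSketch` are hypothetical names, and `Message` is repeated from the earlier comment so the example is self-contained.

```csharp
using System;

public class Message
{
    public string Name { get; set; }
    public object Value { get; set; }
}

public static class MessagePatternSketch
{
    public static string Describe(Message message)
    {
        // Property pattern on Name, type pattern on the boxed tuple in Value.
        if (message is { Name: "UpdatePriceAndLabel", Value: ValueTuple<decimal, string> payload })
        {
            // Element names are supplied here, at the point of deconstruction.
            var (price, label) = payload;
            return FormattableString.Invariant($"{label}: {price}");
        }
        return "unhandled";
    }
}
```

The type pattern matches on element types only, so a boxed `(int, string)` falls through to the unhandled branch.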

dsaf commented 9 years ago

@gafter that makes sense, thank you.

Our current thinking is that we'd convey tuple names using attributes, similar to the scheme we currently use to distinguish object from dynamic.

Will those attributes be preserved into runtime, so that it would be possible to explore them via reflection?

paulomorgado commented 9 years ago

How else would tooling pick them up if they weren't on the metadata?

EamonNerbonne commented 9 years ago

@paulomorgado they're going to have to be in some metadata somewhere, but the question is whether that'll be in compile-time metadata (i.e. Roslyn) or runtime metadata (i.e. System.Reflection). If it's in reflection metadata, that means that tuples that differ only in names will need to differ in runtime types, which means that the runtime type of (int sum, int count) cannot be just (int, int) to avoid clashing with (int count, int sum).
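
To make the clash concrete, here is a small sketch assuming the design that later shipped in C# 7, where both tuple types share one runtime type and names exist only at compile time (and in attributes on members that expose them). `TupleIdentitySketch` and its methods are hypothetical names for this illustration.

```csharp
using System;

public static class TupleIdentitySketch
{
    public static bool SameRuntimeType()
    {
        (int sum, int count) a = (10, 2);
        (int count, int sum) b = (2, 10);
        // Both variables hold a System.ValueTuple<int, int>; names are not part of the runtime type.
        return a.GetType() == b.GetType() && a.GetType() == typeof(ValueTuple<int, int>);
    }

    public static int PositionalAssignment()
    {
        (int sum, int count) a = (10, 2);
        // The conversion is positional, so b.count receives the value of a.sum.
        (int count, int sum) b = a;
        return b.count;
    }
}
```

The second method shows the cost of that choice: swapping names silently swaps meaning, which is the order ambiguity raised earlier in the thread.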

paulomorgado commented 9 years ago

I'm failing to understand your point, @EamonNerbonne. Can you give an example of cross assembly compile-time metadata?

EamonNerbonne commented 9 years ago

@paulomorgado it's an asinine example, but plain File.ReadAllBytes(path-to-assembly) is available to the compiler, but not (necessarily) available to reflection. The compiler can choose to embed all kinds of data there, even if it's not exposed to reflection. It could (not saying that it should!) embed tuple type names there, or whatever information it wants to.

One practical example is for instance the IL-stream: you cannot read the IL of a method using reflection, to the best of my knowledge, which has the (to me) unfortunate downside that you cannot reliably decompile a delegate. (I'd love that ability for ExpressionToCode). The compiler can (in the unlikely event that it needs to).

jmaine commented 9 years ago

Hum....

First make tuples implement a series of interfaces like ITuple<T1>, ITuple<T1, T2>, etc.

Next add anonymous structs like this.

new struct {
      Name1 = t1,
      Name2 = t2
}

Make anonymous structs and classes implement ITuple<...> implementing Item1, Item2, etc in order of declaration. The Name1 (as in the example above) is used to implement Item1, etc.

Next, add new value-type tuples named ValueTuple<T1>, ValueTuple<T1, T2>, etc., and make them implement the same interfaces.

With this, we can have either named or anonymous tuples that are either reference types or value types. Second, named and anonymous tuples would work together more easily.

whoisj commented 9 years ago

The (runtime) type of (int sum, int count) is the type (int, int).

This brings up another question, will the compiler be smart enough to know that my (int, int) is identical to your (int, int) and not provide multiple definitions for them or will we have to rely on runtime checking?

alrz commented 9 years ago

@whoisj I just don't understand why it should have names for tuple items at all. It has to ignore them in deconstruction. It's not a tuple anymore; it's more of a record type. Is there something that I'm missing?

whoisj commented 9 years ago

@alrz

Per @gafter's comment, it seems they are discrete types. Thus a typeof() should result in a comparable value. No?

alrz commented 9 years ago

@whoisj I didn't quite understand what you meant by "comparable", but let's say they should be identical. The ability to specify names for members is so confusing I cannot stand it!

dsaf commented 9 years ago

@whoisj Doesn't Unification across assemblies section cover this?

dsaf commented 9 years ago

@alrz

...the ability to specify names for members is so confusing I cannot stand it!..

This feature will not be obligatory to use.

gafter commented 9 years ago

@dsaf If a feature is really confusing (and I'm not saying this one is), suggesting that people should just avoid it doesn't really address the concern. It will be obligatory to read all of the features used in code that I am reading, whether I elect to use the feature or not. We will avoid adding to the language features that we believe are too confusing.

We believe we may specify this feature in a way that isn't too confusing.