proposal: spec: tuples as sugar for structs

jimmyfrasche commented 12 months ago

Updates:

~~only allow unpack when all fields are exported~~
unpack skips unexported fields of structs from different packages
include a vet rule
explicitly include a change to Go 1 compat agreement so that unpack is treated the same as unkeyed struct literals

This proposal adds basic tuples to Go with one piece of sugar and two builtins.

The sugar is for struct(T0, T1, …, Tn) to be shorthand for a struct type with n+1 fields where the ith field has type Ti and is named fmt.Sprintf("F%d", i). For example, struct(int, string) is a more compact way to write struct { F0 int; F1 string }.

This gives us a notation for tuple types and answers all questions about how they behave (exactly as the desugared struct they are equivalent to). By naming the fields F0 and so on, this both provides accessors for the individual elements and states that all tuple fields are exported.

The variadic pack builtin returns an anonymous tuple, with the appropriate types and values from its arguments. In particular, by ordinary function call rules, this allows conversion of a function with multiple returns into a tuple. So pack(1, "x") is equivalent to struct(int, string){1, "x"} and, given func f() (int, error), the statement t := pack(f()) produces the same value for t as the below:

n, err := f()
t := struct(int, error){n, err}
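The behavior described above can be approximated in current Go with generics, though only for a fixed number of elements. This is a minimal sketch, not part of the proposal: Pair, Pack2, Unpack2, and half are hypothetical names, with Pair spelling out the struct that struct(A, B) would desugar to.

```go
package main

import (
	"errors"
	"fmt"
)

// Pair is a hand-written stand-in for the proposed struct(A, B).
// The field names F0 and F1 follow the proposal's naming scheme.
type Pair[A, B any] struct {
	F0 A
	F1 B
}

// Pack2 is a hypothetical two-value substitute for the proposed pack builtin.
func Pack2[A, B any](a A, b B) Pair[A, B] {
	return Pair[A, B]{a, b}
}

// Unpack2 is a hypothetical two-value substitute for the proposed unpack builtin.
func Unpack2[A, B any](p Pair[A, B]) (A, B) {
	return p.F0, p.F1
}

var errOdd = errors.New("odd input")

// half stands in for the f() (int, error) used in the text.
func half(n int) (int, error) {
	if n%2 != 0 {
		return 0, errOdd
	}
	return n / 2, nil
}

func main() {
	// A multi-value call is passed straight through, as with pack(f()).
	p := Pack2(half(10))
	fmt.Println(p.F0, p.F1) // prints "5 <nil>"

	n, err := Unpack2(p)
	fmt.Println(n, err) // prints "5 <nil>"
}
```

Unlike the proposed builtins, this needs one type and one pair of functions per arity, which is part of the gap the proposal is meant to close.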

The unpack builtin takes any struct value and returns all of its fields in the order of their definition, skipping _ fields and unexported fields from a different package. (This has to be somewhat more generally defined as tuples aren't a separate concept in the language under this proposal.) This is always the inverse of pack. Example:

// in goroutine 1
c <- pack(cmd_repeat, n)

// in goroutine 2
cmd, payload := unpack(<-c)
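For comparison, here is roughly what the channel example looks like in current Go when the desugared struct is written out by hand. It is a sketch under assumptions: command, cmdRepeat, cmdStop, message, and roundTrip are hypothetical names chosen for illustration.

```go
package main

import "fmt"

type command int

const (
	cmdRepeat command = iota
	cmdStop
)

// message spells out what struct(command, int) would desugar to.
type message = struct {
	F0 command
	F1 int
}

// roundTrip sends one command/payload pair through a channel and
// returns what the receiving side sees.
func roundTrip(cmd command, n int) (command, int) {
	c := make(chan message)
	go func() {
		c <- message{cmd, n} // under the proposal: c <- pack(cmd, n)
	}()
	m := <-c
	return m.F0, m.F1 // under the proposal: cmd, payload := unpack(<-c)
}

func main() {
	cmd, payload := roundTrip(cmdRepeat, 3)
	fmt.Println(cmd == cmdRepeat, payload) // prints "true 3"
}
```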

The struct() sugar lets us write pairs, triples, and so on for values of mixed types without having to worry about names. The pack and unpack builtins make it easier to produce and consume these values.

No changes are needed to the public API of reflect or go/types to handle tuples as they're just structs, though helpers to determine if a given struct is "tuple-y" may be useful. go/ast would need a flag in StructType noting when a struct used the tuple syntax, but as long as the implicit field names are explicitly added by the parser, the only place this would be needed is for converting an AST back to a string.

The only potential danger here is unpack. If it's used on a non-tuple struct type from a different package it would be a breaking change for that package to add an additional exported field. Go 1 compat should be updated to say that this is acceptable, just as it says that adding a field that breaks an unkeyed struct literal is acceptable. Additionally, a go vet check should be added that limits unpack to structs with exclusively "F0", "F1", …, "Fn" field names. This can be relaxed at a later time.

This is a polished version of an earlier comment of mine: https://github.com/golang/go/issues/33080#issuecomment-612543798 In the years since I've written many one-off types that could have just been tuples and experimented with generics+code generation to fill in the gap. There have been multiple calls for tuples in random threads here and there and a few proposals:

61920
32941
33080 (not tuples per se but related)

earthboundkid commented 12 months ago

I'm not sure the struct() sugar is necessary, but the pack and unpack builtins are brilliant.

What if instead of struct(), we said type MyRecord record{ string, error } was sugar for type MyRecord struct{ F1 string; F2 error }? ch := make(chan record{ string, error }) reads better to me than ch := make(chan struct( string, error )).

jimmyfrasche commented 12 months ago

I'd be fine with that but honestly I prefer struct because

emphasizes that it is still just a struct
no new keywords

jimmyfrasche commented 12 months ago

If there is a new keyword it should probably be tuple as record generally means things more like structs

DeedleFake commented 12 months ago

The only danger here is unpack. If it's used on an exported non-tuple struct type from a different package it would be backwards incompatible for that package to add an additional exported field. I think the only way around that is to say don't do that and not consider that an incompatible change. It is safe to unpack a value of type that is meant to be a tuple regardless of its source as the length of a tuple is necessarily part of its API.

I think this makes sense. It's similar to the existing situation with struct literals initialized without field names, i.e. v := Example{2, "something"}. That's technically legal, but it's not safe in terms of backwards compatibility and should be used with care.

One question I have though is whether it would be valid to directly use pack() with the results of a function call. For example, something like

func F1() int { return 1 }
func F2() (int, string) { return 2, "something" }

func main() {
  v := pack(F1()) // Allowed?
  v = pack(F1()) // Still allowed? Same type?

  v2 := pack(F2()) // Allowed?
}

jimmyfrasche commented 12 months ago

Yes, unpack is basically the mirror image of unkeyed struct initialization.

@DeedleFake I copied your example and put the answers inline:

func F1() int { return 1 }
func F2() (int, string) { return 2, "something" }

func main() {
  v := pack(F1()) // v is the same as if the call were pack(1) so: struct(int){1}
  v = pack(F1()) // this is still allowed and is the same type

  v2 := pack(F2()) // This is allowed by func call rules and is the same as pack(2, "something")
}

earthboundkid commented 12 months ago

One question I have though is whether it would be valid to directly use pack() with the results of a function call.

I think it should work like the current system where you can do f(a()) if the arguments of f match the return types of a. So you could do:

func Foo() (a string, b int, c error)
func Bar(a string, b int, c error)

packed := pack(Foo())
// ...
Bar(unpack(packed()))

In Python, you need *arg, **kwargs to "unpack" arguments, but since Go is strongly typed, that shouldn't be necessary, and it should just always do the right thing.

One question is if this should work (I think it should):

func Foo() []string
func Bar(...string)

packed := pack(Foo())
// ...
Bar(unpack(packed()))

A vararg is just a final slice, so unpack should be able to just transparently unpack into it.

I am less sure if this should work:

func Foo() (a, b, c string)
func Bar(...string)

packed := pack(Foo())
// ...
Bar(unpack(packed()))

I think probably not, but it's a harder call.

jimmyfrasche commented 12 months ago

@carlmjohnson You are correct about the return type matching (though you have an extra () in your example code).

Neither of those last two would work, though: changes to varargs would need to be a separate proposal.

earthboundkid commented 12 months ago

This works now, so I guess the unpack version of it should too:

func Foo() (a, b, c string)
func Bar(...string)

Bar(Foo())

Variadics are a different issue and I guess can be handled separately.

jimmyfrasche commented 12 months ago

Interesting. I could have sworn that was an error! Yes that would work.

And I also misread your first example:

func Foo() []string
func Bar(...string)

Bar(unpack(pack(Foo())))

This will not work. It's the same as Bar(Foo()) which needs to be Bar(Foo()...) so you'd need Bar(unpack(pack(Foo()))...) which works. It's a little confusing because the tuple is type struct([]string) so unpack only has a single return which is a slice
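Both variadic cases can be checked in current Go without pack or unpack, since the proposal only forwards the existing call rules. In this small sketch three, list, and join are hypothetical stand-ins for the Foo and Bar functions above.

```go
package main

import (
	"fmt"
	"strings"
)

// three returns several strings, like func Foo() (a, b, c string).
func three() (a, b, c string) { return "x", "y", "z" }

// list returns a single slice, like func Foo() []string.
func list() []string { return []string{"x", "y", "z"} }

// join is variadic, like func Bar(...string).
func join(parts ...string) string { return strings.Join(parts, ",") }

func main() {
	// Multiple results bind to the variadic parameter directly.
	fmt.Println(join(three())) // prints "x,y,z"

	// A single []string result must be spread explicitly with "...".
	// join(list()) without the dots does not compile.
	fmt.Println(join(list()...)) // prints "x,y,z"
}
```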

jaloren commented 11 months ago

In some languages with tuples, a tuple is a type that can be passed to a function and thus part of a function signature. In this design, that wouldn't be supported, is that right?

I think one common use will be dealing with configuration functions that take more than 4 arguments. Currently, my rule of thumb is that any function that takes more than 4 arguments should take a config struct instead. With tuples, I could instead see APIs designed and used like this:

// package widget
func Setup(argOne int, argTwo string, argThree bool, argFour float64, argFive bool) error {
  // impl here
}
---
// main package

import (
  "widget"
)

func main() {
  input := pack(argOne, argTwo, argThree, argFour, argFive)
  if err := widget.Setup(unpack(input)); err != nil {
    panic(err)
  }
}

I am unsure if that's a good idea or not but I think the temptation will be high because it's the path of least resistance. Defining a struct is going to be much heavier by comparison and who wants to do that just to handle one function. If we do think that's a good idea, then being able to pass the tuple as a type would be nice.

thediveo commented 11 months ago

How does struct(T0, T1) work across packages when there is no exported type Foo struct(T0, T1) that gets used? Is this out of scope on purpose? Pardon me if I overlooked the corresponding passage in the draft above.

earthboundkid commented 11 months ago

How does struct(T0, T1) work across packages when there is no exported type Foo struct(T0, T1) that gets used? Is this out of scope on purpose? Pardon me if I overlooked the corresponding passage in the draft above.

~~IIUC, it would follow the usual rules about type T struct{ /**/ } vs. type T = struct{ /**/ }, where you can pass unnamed struct types between packages without needing a central definition.~~ I take it back, that won't work because each anonymous struct type is considered different.

jimmyfrasche commented 11 months ago

@thediveo & @carlmjohnson

https://go.dev/ref/spec#Type_identity

Two struct types are identical if they have the same sequence of fields, and if corresponding fields have the same names, and identical types, and identical tags. Non-exported field names from different packages are always different.

By construction there are no tags and the field names are automated so as long as they have the same number of types in the same order they're the same

proof!: https://go.dev/play/p/Ek8vIDKndw-

this also works if one (but not both) of the packages define the type: https://go.dev/play/p/pUGiqhUxzQB
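The type identity rule can also be seen inside a single file. In this sketch makePair, usePair, and sameType are hypothetical names, and each one writes out the desugared struct type independently. The cross-package case is what the playground links show; a single file cannot demonstrate it.

```go
package main

import (
	"fmt"
	"reflect"
)

// makePair returns a value of a struct type written out in full.
func makePair() struct {
	F0 int
	F1 string
} {
	return struct {
		F0 int
		F1 string
	}{1, "x"}
}

// usePair writes the same struct type out again, independently.
// The argument is accepted because the two types are identical.
func usePair(p struct {
	F0 int
	F1 string
}) string {
	return fmt.Sprintf("%d:%s", p.F0, p.F1)
}

// sameType reports whether a third spelling of the type is identical
// to the one returned by makePair, as seen by reflection.
func sameType() bool {
	var a struct {
		F0 int
		F1 string
	}
	return reflect.TypeOf(a) == reflect.TypeOf(makePair())
}

func main() {
	fmt.Println(usePair(makePair())) // prints "1:x"
	fmt.Println(sameType())          // prints "true"
}
```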

earthboundkid commented 11 months ago

Okay. I thought it would work, but then I confused myself. This works because all the fields are public, but this does not because the fields are private.

jimmyfrasche commented 11 months ago

@jaloren

In some languages with tuples, a tuple is a type that can be passed to a function and thus part of a function signature. In this design, that wouldn't be supported, is that right?

You could include a tuple in a function signature like:

func F(n int, tup struct(Point, Color))

It's an ordinary type (specifically a struct!) so you can use it however or whenever you would any other type.

I think one common use will be dealing with configuration functions that take more than 4 arguments.

I do not think that will be common at all. As your own example shows it buys nothing over not doing it other than having to include an extra pack and unpack.

Using a defined struct is immensely superior for the use case specifically because you can name the fields and easily omit irrelevant ones by using a keyed literal.

The major use case for tuples is having some types that you need to bundle together but there's no real need for anything other than that. If you've ever written a type with no methods like

type locationWeightPair struct {
  L location
  W weight
}

just so you could use it as a map key or throw it down a channel, you could just use struct(location, weight) under this proposal.
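As a rough illustration of the map key case in current Go, the desugared struct can be written out and used directly. The names location, weight, key, and countPairs are assumptions made for this sketch; under the proposal the key alias would simply be struct(location, weight).

```go
package main

import "fmt"

type location string
type weight int

// key spells out what struct(location, weight) would desugar to.
type key = struct {
	F0 location
	F1 weight
}

// countPairs tallies how often each (location, weight) pair occurs.
// It assumes ls and ws have the same length.
func countPairs(ls []location, ws []weight) map[key]int {
	counts := make(map[key]int)
	for i := range ls {
		counts[key{ls[i], ws[i]}]++
	}
	return counts
}

func main() {
	counts := countPairs(
		[]location{"dock", "dock", "yard"},
		[]weight{10, 10, 25},
	)
	fmt.Println(counts[key{"dock", 10}], counts[key{"yard", 25}]) // prints "2 1"
}
```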

DeedleFake commented 11 months ago

I really like this proposal. The more I think about it the nicer it seems. I'm trying to figure out if it could help with the two-value iterator problem over in #61405, and I think it probably could, but only partially. You could do something like

// t is a struct(string, error)
for t := range tupleSeq {
  v, err := unpack(t) // Needs an extra line, but nicer than needing to manually unpack a whole struct.
  // ...
}

The ability of unpack() to work with non-tuple struct types as well should make the above more straightforward for some cases, too, but that backwards compatibility complication might cause a few problems there.

jimmyfrasche commented 11 months ago

It would only help with iterators if range auto-expanded tuples, which is what, for example, Python does. Since tuples aren't a separate kind of type in this proposal and there doesn't seem to be any interest in raising the number of loop variables past two, I don't see that happening under this or any other proposal, realistically.

You could write generic functions to convert a 2-iter into a pair and vice versa. That may be useful. The xiter funcs that are supersets of what is commonly called zip and zipLongest define types that are basically tuples.

The ability of unpack() to work with non-tuple struct types as well should make the above more straightforward for some cases, too, but that backwards compatibility complication might cause a few problems there.

That's more necessary than it is useful. I think it would make sense and be fine for some types like image.Point which are essentially just a pair with benefits. For others, to reiterate, it's just the same problem with using an unkeyed literal but in reverse. (If this were on paper I'd triple underline that bit.)

jimmyfrasche commented 11 months ago

Go already has tuples in the special case where the type happens to be the same for all the items: arrays. For example, [2]int and struct(int, int) are both capable of containing the same amount of information as the other.

Given that, I think it would make sense to expand unpack to work for arrays as well. It's not fundamental to the proposal and it could be added later, so I'm not going to include it for now, but something to consider.

leaxoy commented 11 months ago

I think there is no need to add a new keyword or functions: just use () to pack a tuple on the right-hand side and unpack it on the left-hand side.

For example:

a := 100
b := "string"
c := []int{1,2,3}

// pack
t := (a, b, c)

// and unpack

(x, y, z) := t
// or the () can be omitted too
x, y, z := t // same as current syntax, no need to introduce extra complexity
// so x is 100, y is "string" and z is []int{1,2,3}

jimmyfrasche commented 11 months ago

@leaxoy There are no new keywords in this proposal: just use of one keyword in a new context and two new predeclared identifiers. I don't think just using () can be made to work either syntactically or philosophically. Perhaps I am wrong but I'm not especially interested in that possibility myself as I like the explicit pack and unpack and think they fit with the language better.

septemhill commented 11 months ago

Do we allow tuple as a field in a struct ?

type TupleInside struct{
   FieldOne string
   FieldTwo int
   FieldThree struct(int, string, float64)
}

If we do, how do we marshal/unmarshal the tuple case ?

earthboundkid commented 11 months ago

Do we allow tuple as a field in a struct ?

I think yes, because it’s just a struct with some sugar for the declaration.

If we do, how do we marshal/unmarshal the tuple case ?

When you unpack a struct containing a tuple, the target variable would be a struct of the appropriate type. Just like if you have a, b, c := f() and c is a struct.

DeedleFake commented 11 months ago

@septemhill

The code in your comment is 100% equivalent to

type TupleInside struct {
  FieldOne string
  FieldTwo int
  FieldThree struct {
    F0 int
    F1 string
    F2 float64
  }
}

It's just syntax sugar.

septemhill commented 11 months ago

@DeedleFake

Sure, I understand it's just syntax sugar.

For example, if we want to marshal the TupleInside to json

type TupleInside struct {
   FieldOne string                         `json:"field_one"`
   FieldTwo int                            `json:"field_two"`
   FieldThree struct(int, string, float64) `json:"field_three"`
}

After desugaring and marshalling, we would get the following JSON:

{
  "field_one": "field_one",
  "field_two": 123,
  "field_three": {
    "F0": 234,
    "F1": "f1",
    "F2": 345.345
  }
}

So, that means we cannot customize the tag name for each field in struct(int, string, float64)? It would always be F0, F1 and F2.

Please correct me if I got something wrong, thanks.

jimmyfrasche commented 11 months ago

That is correct.
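The encoding can be confirmed today by writing the desugared struct by hand, since encoding/json only ever sees the struct. A minimal sketch, where encode is a hypothetical helper and the values are the ones from the example above:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TupleInside is the example type with its tuple field desugared by hand.
type TupleInside struct {
	FieldOne   string `json:"field_one"`
	FieldTwo   int    `json:"field_two"`
	FieldThree struct {
		F0 int
		F1 string
		F2 float64
	} `json:"field_three"`
}

// encode fills in the example values and marshals them to JSON.
func encode() (string, error) {
	var v TupleInside
	v.FieldOne = "field_one"
	v.FieldTwo = 123
	v.FieldThree.F0 = 234
	v.FieldThree.F1 = "f1"
	v.FieldThree.F2 = 345.345
	b, err := json.Marshal(v)
	return string(b), err
}

func main() {
	s, err := encode()
	if err != nil {
		panic(err)
	}
	// The inner keys are the generated field names F0, F1, F2.
	fmt.Println(s)
}
```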

apparentlymart commented 11 months ago

Yes, it is true that struct tags are not a part of this proposal, and so I expect most folks will want to avoid using tuple-like structs in types intended for reflection-based marshalling.

I don't see that as a significant problem, though. Not all types are useful in JSON serialisation, and that's okay. If you are doing JSON serialisation then you will choose your types with that in mind.

jimmyfrasche commented 11 months ago

There's nothing that would prevent adding tags after the types that I can see, but I don't see a lot of reason to add it at this point. I'm not sure why you would both want the version of structs where you don't have to choose the names and want to specify the names. You can just use a regular struct.

jba commented 11 months ago

The only danger here is unpack. If it's used on an exported non-tuple struct type from a different package it would be backwards incompatible for that package to add an additional exported field.

This is my biggest concern with this proposal. Yes, it's similar to unkeyed struct literals, but more problematic because, first, it's more tempting to use, and second, writing a vet check for it is harder. The unkeyed-struct-literal vet check just needs to check that the package of the type differs from the package of the literal, but I think that would be too strict for many reasonable uses of unpack.

So two suggestions:

Define unpack to fail at compile time if the struct has any unexported fields. That won't interfere with the intended use case, and it lets type authors protect against this backwards-compatibility issue in the same way they would for unkeyed struct literals.
Add a vet check. The hard part is coming up with one that captures the idea that the type author intended it to be a tuple type. Maybe one simple idea is that the type being unpacked has no explicit definition: it is only defined indirectly by calls to pack.

thediveo commented 11 months ago

The only danger here is unpack. If it's used on an exported non-tuple struct type from a different package it would be backwards incompatible for that package to add an additional exported field.

This is my biggest concern with this proposal.

I actually like that this will force downstream unpackers to adapt their code: it's a static safety check. unpack on tuples has a strict contract, or do I mistake the idea of unpack?

apparentlymart commented 11 months ago

My understanding of the concern with unpack is that it would be valid to write something like this:

type Foo struct {
    Name string
}

foo := Foo{Name: "Emma"}
name := unpack(foo)

...and have name be "Emma", because it just destructured the fields in order.

Now if Foo were in a separate package and were to later be modified like this:

type Foo struct {
    Name string
    Age  int
}

...or like this:

type Foo struct {
    Name string
    age  int
}

...then both of them are different kinds of breaking change for the caller of unpack. In the former case, the result arity is now wrong -- two results instead of one -- and in the latter case presumably unpack would need to be forbidden altogether because the caller in another package is not allowed to depend on the presence of that unexported age field.

To me this feels more like a "just don't do that, then" sort of situation, but I do agree that if it's possible to have a compile-time check or lint for it then it'd be worth doing so. Making unpack only accept anonymous struct types and not named types whose underlying type is a struct type might be another reasonable heuristic, in addition to those already discussed above. (unpack(struct { Name string }{"Foo"}) would still be valid under that rule, but it seems relatively harmless to allow that since anonymous struct types are already understood to be represented solely by their members, and thus adding a new member is a potential breaking change.)

earthboundkid commented 11 months ago

I agree that unpack should not work for types with unexported fields. It would just lead to confusion.

I don't think unpacking across packages is necessarily a problem. It's the same as how this code will fail if foo.Foo is changed:

package foo

type Foo struct {
    A string
    B string
}

// elsewhere
import "foo"

type myFoo struct {
    A string
    B string
}

var _ = myFoo(foo.Foo{}) // breaks if foo.Foo ever changes without myFoo changing

earthboundkid commented 11 months ago

Would pack work with "void" functions?

func foo() {}

v := pack(foo()) // v is struct{}? Or compile time error?

jimmyfrasche commented 11 months ago

@jba yeah unpack is the hard part.

I don't have a problem with saying all fields must be exported. That seems like a reasonable rule and it could always be relaxed later if it turns out to be too aggressive. I don't imagine there would ever be a need for it to be relaxed.

The simplest vet check would be to only let it be used if all the field names follow the "F%d" naming convention. Another simple one would be to only allow it for types defined in the same module. Both of these fail the image.Point test, which, as I've stated earlier, is an example of a non-tuple that's completely valid to unpack.

I'm not sure how tempting it would be to misuse in practice, though. I can't really think of any struct you'd want to unpack that wasn't basically a tuple. It would get unwieldy quickly as the number of struct fields grows so even misuses would probably be on structs with, say, 6 or fewer fields and even then the majority would likely be 4 or fewer.

If that struct is under your control it's fine. If it's not, and you take on the burden of accepting the possibility of a breaking change, your code will fail to compile pretty early on. The only case where it could go undetected is if two exported fields of the same type were transposed.

If we go with the "no unexported fields are allowed" rule then anyone who feels really strongly can add a zero-sized unexported field and most cases where it would get iffy are already illegal.

It is definitely a theoretical danger but I'm not sure how dangerous it is practically.

@apparentlymart I don't think named type is a good heuristic because, though the vast majority of the use cases are for leaving the type anonymous, you could do type stuff struct(a, b, c) even if it's just to avoid repeating it all over the place or to add a Less method so it can be stored in a generic container.

@carlmjohnson you can't invoke a function with the result of a "void" function so it's already not allowed. Although now that you mention it I suppose you could use pack() as a shorter version of struct{}{}, bonus!

jimmyfrasche commented 11 months ago

@septemhill one more thought about json marshalling. There could be a separate proposal to allow any struct to serialize as a list of values. This is how most other languages handle tuples so it could help interop.

That would allow your example to be:

type TupleInside struct {
   FieldOne string                         `json:"field_one"`
   FieldTwo int                            `json:"field_two"`
   FieldThree struct(int, string, float64) `json:"field_three,list"`
}

{
  "field_one": "field_one",
  "field_two": 123,
  "field_three": [234, "f1", 345.345]
}
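No such list option exists in encoding/json today. As a rough sketch of what it might do, the hypothetical helper below (marshalAsList, not an existing API) encodes a struct's exported fields as a JSON array in declaration order, using reflection.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// marshalAsList is a hypothetical helper. It encodes the exported fields
// of a struct value, in declaration order, as a JSON array.
func marshalAsList(v any) ([]byte, error) {
	rv := reflect.ValueOf(v)
	if rv.Kind() != reflect.Struct {
		return nil, fmt.Errorf("marshalAsList: %T is not a struct", v)
	}
	items := make([]any, 0, rv.NumField())
	for i := 0; i < rv.NumField(); i++ {
		// Unexported fields (including _) cannot be read via Interface.
		if !rv.Type().Field(i).IsExported() {
			continue
		}
		items = append(items, rv.Field(i).Interface())
	}
	return json.Marshal(items)
}

func main() {
	// The desugared form of struct(int, string, float64).
	tup := struct {
		F0 int
		F1 string
		F2 float64
	}{234, "f1", 345.345}

	b, err := marshalAsList(tup)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b)) // prints [234,"f1",345.345]
}
```

Because the helper only looks at field order, it works for any struct, not just ones written with the tuple sugar.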

andig commented 11 months ago

The struct() sugar lets us write pairs, triples, and so on for values of mixed types without having to worry about names.

Is that the (sole) justification for these struct-tuples that warrants the language change?

jimmyfrasche commented 11 months ago

@andig that's not a justification so much as its purpose/usecase.

Some justifications for tuples in general

having to write tiny structs just to bundle a few values together is annoying
having to read tiny struct definitions is at best annoying and at worst confusing. Naming a type is an important thing so you expect the definition to be enlightening but it turns out, no, it's just there so two values can be sent down a channel at once or store a pair of items in a collection. With a tuple syntax you can say "ah it's just a tuple" and move on to important details without the false detour
naming things is hard and these let you skip that part if it's not important
it's awkward to work with unnamed structs currently so you're forced to name the type and the fields (see previous points)
pack/unpack make it easy to package and unpackage multiple return values
incredibly common language feature so new users may expect it to exist in some form or just have to deal with it when interoperating with other languages

Some justifications for this proposal in particular

no new keywords
no new kinds of types
explicit but still concise
if you send a tuple to a package written before tuples and it uses reflection everything still works since it's just a struct
any question you have about how tuples work can be answered by desugaring it into a regular struct and then asking the question again

If anyone has others to add, this is the time and place!

dsnet commented 11 months ago

I like the idea, but I'm bothered by the inability to unambiguously determine whether something was intended to be a tuple from Go reflection. There's an intent that's visible in the Go source code that cannot be observed by reflection.

Consider the following:

type S struct {
    F0 int
    F1 string    
}

type T struct(int, string)

From the perspective of Go reflection, both S and T look the same, but are just named differently. It's unclear whether S is definitely meant to be a tuple or not, while T obviously is based on the construction.

The problem is that you can write S today, and it will be JSON serialized as {"F0":0,"F1":""}.

I don't see how we can backwards compatibly JSON serialize T as [0,""] as that would be changing the behavior of S as well. What we need is some type of reflection-based marker to indicate that some struct type with sequential F%d fields was truly intended to be a tuple. It could be a magic method, a new Go kind, or whatever.

To be clear, I want to support serializing of tuples in JSON as I've dealt with REST-based APIs that do this sort of stuff and it's currently a pain to handle it well in Go. This proposal is almost there in terms of addressing this problem.

Regarding https://github.com/golang/go/issues/63221#issuecomment-1740075809 by @septemhill

So, that means we cannot customize the tag name for each field in struct(int, string, float64)? It would always be F0, F1 and F2.

I don't think we should be able to specify struct tag names for a tuple. The purpose of a tuple is a lightweight sequence of strongly typed values of disjoint types. The moment someone wants to customize the names of anything, I'd argue that's what a regular struct is for.

zephyrtronium commented 11 months ago

I don't see how we can backwards compatibly JSON serialize T as [0,""] as that would be changing the behavior of S as well. What we need is some type of reflection-based marker to indicate that some struct type with sequential F%d fields was truly intended to be a tuple. It could be a magic method, a new Go kind, or whatever.

My interpretation of https://github.com/golang/go/issues/63221#issuecomment-1741544805 in terms of the approach of go-json-experiment was that a format:list option in the json tag would cause the encoder to marshal a struct field as a list of its exported fields in source order. Then it is independent of whether the struct is concretely written as a tuple, and the benefit is available to non-tuple structs as well.

dsnet commented 11 months ago

The downside of a format:list tag is that it doesn't work well for higher orders of variance such as:

[]struct(int, string)
map[string]struct(int, string)

We could use format:list for a top-level struct(...), but anything beyond that requires a complicated format:... expression, which we're not currently planning on supporting.

In this situation, the intent is there in the code, so it seems like we should just bridge that gap into Go reflection. That would avoid any annotation in the struct field tags.

earthboundkid commented 11 months ago

Perhaps the solution is that struct(a, b, c) is sugar for

struct {
    F1 a `tuple`
    F2 b `tuple`
    F3 c `tuple`
}

So it's still just a normal struct, but it comes with something for reflect to see.

PS I think the field names should start with F1, not F0.

dsnet commented 11 months ago

If we had the concept of type tags (i.e., basically struct field tags, but for a type declaration), we could do:

struct {
    F1 a
    F2 b
    F3 c
} `tuple`

A type tag would provide other application-specific utility. The existence of this feature is beyond this proposal, but it may be worth filing separately.

jimmyfrasche commented 11 months ago

@zephyrtronium your interpretation is 100% correct :dart:

@dsnet

I like the idea, but I'm bothered by the inability to unambiguously determine whether something was intended to be a tuple from Go reflection. There's an intent that's visible in the Go source code that cannot be observed by reflection.

reflection couldn't tell if it was written as struct() but it could tell if it could have been written as a struct() if:

it's a struct
each field is named "F" followed by a number that is the same as its position

That's the best you're going to get without creating a new kind of type entirely. The types are identical regardless of the syntax used. You shouldn't be able to tell them apart.
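The two conditions above can be sketched as a reflection helper. isTupleLike is a hypothetical name, and the extra requirements that fields carry no tags and are not embedded are assumptions drawn from how struct() desugars, not something the proposal specifies for such a helper.

```go
package main

import (
	"fmt"
	"reflect"
)

// isTupleLike reports whether v is a struct that could have been written
// with the proposed struct(...) sugar: fields named F0, F1, ... in order,
// with no tags and no embedded fields.
func isTupleLike(v any) bool {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return false
	}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Name != fmt.Sprintf("F%d", i) || f.Anonymous || f.Tag != "" {
			return false
		}
	}
	return true
}

// S matches the naming convention without using any sugar.
type S struct {
	F0 int
	F1 string
}

// Foo is an ordinary struct with a meaningful field name.
type Foo struct {
	Name string
}

func main() {
	fmt.Println(isTupleLike(S{}))   // prints "true"
	fmt.Println(isTupleLike(Foo{})) // prints "false"
}
```

As the comment says, a check like this cannot tell whether the author used the sugar, only whether they could have.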

If there were some tags, even ones at the type level that you can't do currently [:+1: to that, btw] someone could just as easily add them to a faux tuple and now we're back in the same position. In fact, if you were transitioning to/from tuples and ordinary structs you'd have to add those in to keep stuff working so you'd still be able to create a non-tuple that reflects the same. There's not much gained.

There is a good part to a struct() jsonifying as a regular struct: it encodes/decodes the same even in programs compiled with older versions of Go that don't have the sugar yet.

I can definitely see a desire to automatically encode them as lists, especially when part of a composite. That will have to be a nongoal of this proposal.

Looking at the stated problem, and the larger problem discussed in the video, maybe there could be some mechanism where you tag a field with an identifier like json:"plan(X)" and have a map of identifiers to more complex options. That would let you say "look a lot's going on here, too much to write in a tag, refer to the plan named X for how to handle it" and that plan somehow contains the necessary information.

Regardless, since tuples are just structs, anything would apply to structs in general even if this proposal is rejected, so whatever the solution is, it would be better as a separate proposal or proposals.

jimmyfrasche commented 11 months ago

@carlmjohnson I'm not too concerned about 1 vs 0 indexing the field names. 1 is somewhat more traditional for tuples but that's not universal and pretty much everything in the language and most libraries are 0 indexed so it seems like it's a better fit overall. I do like that [2]int and struct(int, int) would be indexed the same. Ultimately it doesn't matter too much since there aren't a lot of cases where you'd actually be using the field names. It's important that they're there but it's not important what they are. They could be numbered "Fa", "Fb", …, "Fz", "Faa", "Fbb", … and it wouldn't really make much of a difference

dsnet commented 11 months ago

If there were some tags, even ones at the type level that you can't do currently ... There's not much gained.

There's still something to gain. If type tags were released at the same time as this feature, then we know that there couldn't possibly be any false positives. I'm not worried about people adding the tuple tag after we give meaning to it, but I'm worried about false positives with existing code.

reflection couldn't tell if it was written as struct() but it could tell if it could have been written as a struct() if:

it's a struct

each field is named "F" followed by a number that is the same as its position
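The two conditions quoted above can be sketched as a reflection check. This is only an illustration: isTupleShaped is a hypothetical helper name, and the sketch additionally rejects embedded fields, which struct(...) would never produce.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// isTupleShaped reports whether v's type is a struct whose i-th field is
// named "F<i>", that is, a type that could have been written as struct(...).
// It cannot tell whether the type actually was written that way.
func isTupleShaped(v any) bool {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return false
	}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Anonymous || f.Name != "F"+strconv.Itoa(i) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isTupleShaped(struct {
		F0 int
		F1 string
	}{})) // true
	fmt.Println(isTupleShaped(struct {
		F1 int
		F0 string
	}{})) // false: names do not match positions
	fmt.Println(isTupleShaped(struct{ X, Y float64 }{})) // false
	fmt.Println(isTupleShaped(42))                       // false: not a struct
}
```

A stricter variant could also require that no field carries a struct tag, since struct(...) would not produce any.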

At the end of the day, my concern ultimately comes down to changing behavior of existing code. Here's some hard data based on analyzing all Go source code on the module proxy as of 2023-07-01:

There are a total of ~30.7M struct types
- ~23.6M are named struct types
- ~7.1M are anonymous struct types
There are 153 struct types that match the F%d convention
- 144 are testing related
- 9 are non-testing related
- 2 actual unique types as many seem to be forks
  - 1 contains no struct field tags
  - 1 contains struct field tags with json

The testing-only ones are uninteresting since it seems that the author just didn't want to come up with actual field names. The non-test ones are more interesting. One of them is obviously used with JSON given the json tags, but we can filter out those since struct(...) wouldn't produce struct field tags. I didn't analyze the AmountRange type to see if it ends up being introspected by Go reflection for use with JSON, XML, YAML, etc.

It seems that false positives are probably going to be very rare.

tuples.txt

dsnet commented 11 months ago

I think we should adjust the definition of unpack from:

The unpack builtin takes any struct value and returns all of its exported fields in the order of their definition.

to:

The unpack builtin takes any struct value with only exported fields and returns all of the fields in the order of their definition.

Today, you can prevent order-dependent and size-dependent struct literals by including an unexported field. The adjusted definition of unpack would follow that principle.

apparentlymart commented 11 months ago

Regarding the "why" of this (as invited in https://github.com/golang/go/issues/63221#issuecomment-1741821681) my desire for it is focused on one specific annoyance with current Go:

Function arguments and function return values are currently different than everything else in Go. They are "struct-like" in that they are a sequence of independently typed fields, but the syntax for using them is significantly different than for structs and there are no language features for conveniently and generically bridging the two.

From a practical standpoint, that means it takes an unwieldy amount of adapter code to deal with implementation details such as:

An API which externally "looks like" function call syntax at both ends but internally sends the arguments and/or return values over a channel. (e.g. chan struct(int, string))
A debugging-oriented or testing-oriented wrapper or mock for some interface that wants to produce a slice representing the arguments and return values of all calls. (e.g. []struct(int, string))

In other words, arguments and return values are a non-orthogonal "wart" in a language that otherwise does a reasonably good job of being orthogonal, and aspires to be so. This "tuple struct" concept would not have been my first choice for healing that gap, but it seems like the most pragmatic way I've seen so far to heal it without changing too many fundamentals of the language.

Today it's typical to deal with these concerns by writing a named struct type with fields of the same names and types as the relevant part of the function signature and then hand-write what are essentially specialized versions of the proposed pack and unpack to translate between the function signature forms and the struct forms. Generics allow some reduction of that boilerplate, but still at least require one struct type and one pair of adapter functions for each arity.
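For reference, the generics-based workaround described above might look roughly like the following sketch for arity 2. The names Pair2, Pack2, Unpack, and lookup are hypothetical, and every other arity would need its own copy of the type and its adapters.

```go
package main

import "fmt"

// Pair2 is a hand-written stand-in for struct(A, B).
type Pair2[A, B any] struct {
	F0 A
	F1 B
}

// Pack2 plays the role of the proposed pack builtin, for two values only.
func Pack2[A, B any](a A, b B) Pair2[A, B] {
	return Pair2[A, B]{F0: a, F1: b}
}

// Unpack plays the role of the proposed unpack builtin.
func (p Pair2[A, B]) Unpack() (A, B) {
	return p.F0, p.F1
}

// lookup is a placeholder for any function with two results.
func lookup(key string) (int, error) {
	return len(key), nil
}

func main() {
	// Send a function's results over a channel as a single value.
	c := make(chan Pair2[int, error], 1)
	c <- Pack2(lookup("abc"))

	n, err := (<-c).Unpack()
	fmt.Println(n, err)
}
```

With the proposal, the channel type would be chan struct(int, error) and the two adapters would be replaced by pack and unpack for every arity at once.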

This proposal "paves the cowpath" by introducing a more concise syntax for those structs that has a similar appearance and "essence" to an argument list, and by providing built-in pack and unpack functions that cannot be provided as generic library code within the constraints of Go's current generics features. This allows achieving the same effect as the most typical (as far as I know) solution to this problem today, with considerably less distracting boilerplate.

The other implications of this proposal -- such as being able to "unpack" a struct Point { X, Y float64 } into a pair of float64 variables -- seem interesting too, but for me this bridge between two important-but-incompatible concepts in the language feels the most significant. My weighing of this particular problem far higher than any others is also why I previously argued that the JSON serialization of a "tuple struct" ought not to be a huge concern, but of course others seem to differ on that; I don't feel strongly about it either way.

For what it's worth, I independently imagined a very similar design in an earlier tuple discussion -- far more clunky in the details, but the same essence. The fact that two people starting from a similar problem statement independently devised essentially the same solution to that problem might be a useful signal, though of course with only two people it's not a strong signal, so I mention it only to complement the other argumentation here.

dsnet commented 11 months ago

As another why: my interpretation of why maps.Entries (#54012) was rejected is that there was fear that maps.Entry[K, V] would end up getting used everywhere as a generic pair type. This is really pointing at the fact that Go lacks easy tuple-like data structures.

jimmyfrasche commented 11 months ago

I've updated the proposal per @jba and @dsnet's separate suggestions to limit unpack to structs that only contain exported fields.

jimmyfrasche commented 11 months ago

@dsnet thanks for the numbers and examples. I'm surprised there are so many F%d even in testing code, but that is an argument in favor of this proposal. If there are so many that happened onto the same arbitrary scheme there are likely many more using other schemes in addition to others using ad hoc names.

At the end of the day, my concern ultimately comes down to changing behavior of existing code.

What is the change that you are concerned about? There is nothing that changes in the proposal. There has been some discussion in this thread, mostly related to json, that would introduce changes and I've said we should not do that as it would introduce changes (without opting in).

ianlancetaylor commented 11 months ago

As far as unexported fields go, currently code is permitted to read unexported fields in the same package and is not permitted to read unexported fields in different packages. It seems clear to me that unpack should follow that same rule. You can unpack a struct exactly when you are permitted to refer to each of the fields individually. So (disregarding complex cases of struct embedding) you can unpack any struct defined in the same package, but when using unpack with a struct defined in a different package all the fields must be exported.
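The accessibility rule above can be approximated at run time with reflection, as a rough sketch. The helper canUnpack and the package path example.com/other are hypothetical, struct embedding is ignored as in the comment above, and the real check would be done by the compiler rather than by reflection.

```go
package main

import (
	"fmt"
	"reflect"
)

type exportedOnly struct {
	A int
	B string
}

type withHidden struct {
	A int
	b string
}

// thisPkg is the path of the package this file is compiled into.
var thisPkg = reflect.TypeOf(withHidden{}).PkgPath()

// canUnpack reports whether code in package callerPkg could refer to every
// field of v's struct type individually. StructField.PkgPath is empty for
// exported fields and holds the defining package's path for unexported ones.
func canUnpack(v any, callerPkg string) bool {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Struct {
		return false
	}
	for i := 0; i < t.NumField(); i++ {
		if p := t.Field(i).PkgPath; p != "" && p != callerPkg {
			return false
		}
	}
	return true
}

func main() {
	const other = "example.com/other" // hypothetical importing package

	fmt.Println(canUnpack(exportedOnly{}, other)) // true
	fmt.Println(canUnpack(withHidden{}, thisPkg)) // true
	fmt.Println(canUnpack(withHidden{}, other))   // false
}
```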

Related: #61920, #32941, #33080 (not tuples per se but related)