winglang / wing

A programming language for the cloud ☁️ A unified programming model, combining infrastructure and runtime code into one language ⚡
https://winglang.io

Spread operator (`...`) in structs, arrays(variadics), maps, and jsons #3855

Open staycoolcall911 opened 1 year ago

eladb commented 11 months ago

@staycoolcall911 can we bump this to P1? It's a pretty major missing capability when implementing APIs.

staycoolcall911 commented 11 months ago

@eladb sure, done, but can you please explain where you need it? It is supported on function declarations (variadic args). Did you need it for structs, jsons or both?

eladb commented 11 months ago

The common use case is, for example, when a function accepts a struct and calls a lower level function with some variations:

struct Options {
  bar: str;
  // many more options...
  zig: str?;
}

let myFunc = (opts: Options) => {
  lowerLevel({ ...opts, zig: "hello" });
};

skorfmann commented 9 months ago

@MarkMcCulloh suggested this workaround today

let parsedEventRow = Json { a: "a" };

let x = MutJson { b: "b" };

for e in Json.entries(parsedEventRow) {
  x.set(e.key, e.value);
}

assert(x.get("b").asStr() == "b");

It would be really nice if this worked natively though. I wanted to adjust a JSON object so that it could be parsed as a struct, so this overlaps with https://github.com/winglang/wing/issues/3686, which would also be really helpful.

skyrpex commented 6 months ago

[image]

eladb commented 6 months ago

Yes please!

Chriscbr commented 4 months ago

Here's a proposal for a spread rules specification:

...exp is a new syntax element that can be used as one or more of the items within any object literal (i.e. a struct literal, map literal, set literal, array literal, or a Json literal), where exp can be substituted with any valid expression.

Example: inside a struct literal, instead of only supporting a comma-separated list of key value pairs, each comma-separated item can be a key value pair OR a ...exp item. MyStruct { ...x, y: z, ...cool.beans() } should be valid syntax.

Example: inside an array literal - [0, ...[1, 2, 3], 4]

For type safety, all of the elements of the object literal must be compatible with the object type. That is:

If the literal type is an Array<T> or MutArray<T>, then ...exp is valid whenever exp is an Array<T> or MutArray<T> with the same element type.

Example: Array<num>[...mutArrayOfNum] is valid, as well as MutArray<num>[...immutArrayOfNum].

If the literal type is a Set<T> or MutSet<T>, then ...exp is valid whenever exp is a Set<T> or MutSet<T> with the same element type.

If the literal type is a Map<T> or MutMap<T>, then ...exp is valid whenever exp is a Map<T> or MutMap<T> with the same element type.

If the literal type is a Json or MutJson, then ...exp is valid whenever exp is a Json or MutJson or any type that can be upgraded to Json (see "is_json_legal_value" in the compiler).

If the literal type is a struct, then ...exp is valid whenever exp is of the same struct type, or a struct type that the original struct extends. If exp is the same struct type, then MyStruct { ...exp } has no errors - that is, no additional fields need to be specified for the expression to be valid. But if ...exp is a parent struct type, then the type checker should determine which fields are missing, and raise an error to the user if appropriate.

Example: Given the following code:

struct Pet { name: str }
struct Dog extends Pet { treats: num }

let p = Pet { name: "duncan" };
let d = Dog { name: "max", treats: 5 };

Dog { ...p } would be invalid since "treats" is missing. Dog { ...d } would be valid. Dog { ...d, ...p } and Dog { ...d, treats: 5 } would be valid too.

If a function or constructor has a trailing variadic argument like ...Array<str> or ...Props, then the spread syntax can be used to pass a satisfying value, as in func(...myArray) and new Bucket(...defaultBucketProps, public: true).

[1] In the future we can add smarter type inference. For example, through subtyping, it may be possible to infer that [...arrayOfDog, ...arrayOfCat] is valid under the unified type Array<Pet>. This requires more compiler magic 🪄
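
To make the intended semantics concrete, here is a minimal sketch in TypeScript rather than Wing, since the Wing syntax above is only proposed. It assumes the proposal follows the familiar JavaScript behavior: array spreads are spliced in order, and in an object literal later entries override earlier ones. The interfaces below are stand-ins for the Pet and Dog structs.

```typescript
// TypeScript stand-in for the proposed Wing semantics; this is not Wing code.
interface Pet { name: string; }
interface Dog extends Pet { treats: number; }

const p: Pet = { name: "duncan" };
const d: Dog = { name: "max", treats: 5 };

// Array spread: elements are spliced in place and order is preserved.
const nums: number[] = [0, ...[1, 2, 3], 4];

// Object spread: later entries win, so p.name overrides d.name.
const merged: Dog = { ...d, ...p };

// Spreading only the parent leaves `treats` missing. TypeScript rejects it,
// which matches what the proposal asks of the Wing type checker:
// const bad: Dog = { ...p }; // error: property 'treats' is missing

console.log(nums, merged);
```

Whether Wing should also let a later spread silently override an earlier one, as happens with name here, is not spelled out in the proposal.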

MarkMcCulloh commented 4 months ago

Thoughts on allowing ... in struct expansion?

let func = (arg: Dog) => { };

let p = Pet { name: "bob" };
func(...p, treats: 2);

If the literal type is a struct, then ...exp is valid whenever exp is of the same struct type, or a struct type that the original struct extends

What's the goal of this restriction? Is it just to maintain consistency with the nominal typing of structs? It seems useful to allow any structs to participate in this with each other. If the concern is avoiding extraneous fields, our implementation of ... can explicitly enumerate known fields (unlike JS, where that's not statically known)

To me this construction seems natural

struct S1 { a: num; }
struct S2 { a: num; b: num; }

let s1 = S1 { a: 1 };
let s2 = S2 { ...s1, b: 3 };
// which is effectively just sugar for:
let s3 = S2 { a: s1.a, b: 3 };

Chriscbr commented 4 months ago

Thoughts on allowing ... in struct expansion?

let func = (arg: Dog) => { };

let p = Pet { name: "bob" };
func(...p, treats: 2);

Yep, I think this makes sense. It can basically work like a syntax sugar for:

func({ ...p, treats: 2 });

What's the goal of this restriction? Is it just to maintain consistency with the nominal typing of structs? It seems useful to allow any structs to participate in this with each other. If the concern is avoiding extraneous fields, our implementation of ... can explicitly enumerate known fields (unlike JS, where that's not statically known)

You're right, it was motivated by consistency with the existing nominal typing. There are at least three reasons I can think of for this nominal typing restriction, and I'll list them here partially just because it's helpful for me to put my thoughts to paper 😀 - but I should preface it by saying none of these reasons are critical per se.

The first reason is one of the usual reasons for nominal typing, which is that the nominal typing helps ensure types are used with their intended (semantic) purpose, and prevent bugs like this:

struct Celsius { value: num; }
struct Fahrenheit { value: num; }

let tempC = Celsius { value: 25 };
let tempF = Fahrenheit { value: 77 };

let temp = Fahrenheit { ...tempC }; // oops!

Another reason for nominal typing is that structural typing introduces the question of "are structs with extra fields always subtypes of the previous structs?" If they are subtypes, then it can result in extra information getting shared, which could be important depending on the circumstances.

struct Foo {
  name: str;
}

let table = new mongodb.Table();
let storeFoo = (foo: Foo) => {
  // uh oh, now I'm storing way more data than I expected...
  table.store("key", foo.toJson());
  // and now I'm logging API tokens...
  log(foo);
};

struct Bar {
  name: str;
  apiKey: str;
}

// is this allowed (is Bar a subtype of Foo)?
storeFoo(Bar { name: "Bob", apiKey: "abc123" });

Put another way, it's useful to be able to say "if a struct has this type, these are the only fields it has". My understanding is the wealth of validation libraries in TypeScript is related (at least in some part) to the structural typing of the language. That said, Winglang's support for structs-extending-structs might already cause similar issues.
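
For comparison, this is how the same situation plays out in TypeScript today, where object types are structural. It is a small sketch with made-up names (serializeFoo is hypothetical), only meant to show the extra field travelling along with the value:

```typescript
interface Foo { name: string; }
interface Bar { name: string; apiKey: string; }

// Only knows about Foo statically, but serializes whatever it is given.
function serializeFoo(foo: Foo): string {
  return JSON.stringify(foo);
}

const bar: Bar = { name: "Bob", apiKey: "abc123" };

// Accepted by the type checker: Bar is structurally a Foo.
const stored = serializeFoo(bar);

console.log(stored); // prints {"name":"Bob","apiKey":"abc123"}
```

The function signature promises a Foo, yet the serialized output includes apiKey, which is the kind of leak described above.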

The last reason is related to how we answer questions like "does removing a field from a struct count as a breaking change to a library? what about adding a field to a struct? what about adding an optional field?" These questions are important because they decide in what ways Wing libraries are allowed to grow and what mechanisms the authors have for safely adding options and capabilities without breaking downstream users.

Let's suppose we want it to be the case that adding an optional field to a struct is not a breaking change. I think that's the case in Wing today since structs are nominally typed.

If structs were structurally typed, and we had the example below:

// foolib.w
pub struct Foo {
  key: num;
}

// barlib.w
pub struct Bar {
  key: num;
  value: num;
}

// main.w
bring bar;

let makeFoo = /* */;

let foo: Foo = makeFoo();
let bar = Bar { ...foo, value: 8 };

If the author of the foo library added value: str?; to Foo, then we can see it would break user code since Foo would no longer be compatible with Bar on the last line (value: str? and value: num are incompatible).

It's not impossible to work around these issues, for example the foo library author could make their type branded -- but it feels a bit weird for this to be opt-in rather than opt-out.

Ultimately, I don't know if there's any right answer to these questions. In the back of my mind, I'm also thinking about how Flow prioritized safety / correctness over developer productivity more than TypeScript did, and well, we can see which won. So it might make sense to switch to structural typing in the end...

(There's also the thought of supporting both nominally typed objects and structurally typed objects, and just give them different names -- this is also possible, it might just be more confusing to new users right now. Or maybe not, IDK.)

MarkMcCulloh commented 4 months ago

nominal typing helps ensure types are used with their intended (semantic) purpose, and prevent bugs like this

struct Person { name: str; }
struct Dog { name: str; }

let p = Person { name: "bob" };
let d = Dog { name: "jim" };

let temp = Dog {
  ...p
};

In this very similar example, the meaning of "name" in both structs can be the same. I've done nothing to imply they should be incompatible. You could say that they should both then extend a Named { name: str; } struct of some sort instead, but this is pointless ceremony because it hasn't told me anything new (I already knew they both had a name!) and I may not even need such a struct. It's also possible that you don't have control of one of the types so you can't add that relationship anyways.

You mentioned branded types, which do seem like a better solution to me. Being opt-in means you avoid the compiler attaching meaning that you may not intend.

branded type C = num;
branded type F = num;
struct Celsius { value: C; }
struct Fahrenheit { value: F; }

let tempC = Celsius { value: C 25 };
let tempF = Fahrenheit { value: F 77 };

let temp = Fahrenheit { ...tempC }; // oops!

// This has the added benefit of avoiding the problem even without `...`
let temp = Fahrenheit { value: tempC.value }; // still oops!
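
The `branded type` syntax above is hypothetical for Wing. As a point of reference, here is a sketch of the common opt-in branding pattern in TypeScript, which is structurally typed. The Brand helper and the toFahrenheit function are names made up for this illustration:

```typescript
// A brand is a phantom property that exists only at the type level.
type Brand<T, B extends string> = T & { readonly __brand: B };

type C = Brand<number, "C">;
type F = Brand<number, "F">;

interface Celsius { value: C; }
interface Fahrenheit { value: F; }

const tempC: Celsius = { value: 25 as C };

// Both of these are rejected by the type checker, with or without spread:
// const bad1: Fahrenheit = { ...tempC };
// const bad2: Fahrenheit = { value: tempC.value };

// An explicit conversion is the only way across.
function toFahrenheit(c: Celsius): Fahrenheit {
  return { value: ((c.value * 9) / 5 + 32) as F };
}

const tempF = toFahrenheit(tempC);
console.log(tempF.value); // prints 77
```

Structs without brands, like Person and Dog earlier, would stay freely spreadable into each other under this approach.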

Another reason for nominal typing is that structural typing introduces the question of "are structs with extra fields always subtypes of the previous structs?" If they are subtypes, then it can result in extra information getting shared, which could be important depending on the circumstances.

If this is a problem I think this could be a failing of our implementation. In your example:

(foo: Foo) => {
  table.store("key", foo.toJson());
  log(foo);
};

.toJson() and log should not have access to apiKey. Just because the data is there does not mean wing should allow it to be accessed. Both of these functions should statically only know about fields associated with Foo. Same thing applies to lifting, i.e. only known data should be lifted.

We do have unsafeCast() but that kinda defeats the purpose of the discussion. This is like allowing a function to take an array slice as a function arg in Rust. If there is more data before or after the slice, I believe you can technically access it (with an unsafe pointer).

"does removing a field from a struct count as a breaking change to a library? what about adding a field to a struct? what about adding an optional field?"

This point is fair, specifically regarding the optional field issue. Personally, I think the expressiveness of allowing this at the cost of adding a new (and rare, IMO) way to have a breaking change is okay, but I have no justification for that.

eladb commented 4 months ago

The discussion seems related to the protocol proposal to unify the concept of interfaces and structs.

https://github.com/winglang/wing/discussions/5102#discussioncomment-8385958

If this is the direction we want to go, then it’s pretty natural for protocols to be structurally typed, like interfaces are today.

If we agree that protocols are where we want to go, then I think it makes sense to assume that structs are structurally typed.

Chriscbr commented 4 months ago

Being opt-in means you avoid the compiler attaching meaning that you may not intend.

That's fair, yeah - some way to define branded types might be the right solution for these cases.

If this is a problem I think this could be a failing of our implementation. ... .toJson() and log should not have access to apiKey.

I'm not sure, but I'm concerned that addressing this would require some form of runtime validation (at least, if we were to allow FooWithExtraFields objects to be passed to a function that only accepts a Foo object -- if we don't allow that kind of thing, then the rest of this paragraph can be ignored). That is, when the function is called, or at the beginning of the function body, the compiler might generate some code to sanitize foo so that the function is only handling the statically known fields. But the implementation could get funky once we start introducing objects with getters etc. - so maybe we should avoid this for now...
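
As a rough idea of what such generated sanitizing code could look like, here is a hand-written TypeScript sketch. It assumes the compiler knows Foo's fields statically and emits a field-by-field copy. sanitizeFoo is a hypothetical name, not something the Wing compiler produces today:

```typescript
interface Foo { name: string; }

// Copies only the statically known fields of Foo; anything else is dropped.
function sanitizeFoo(input: Foo): Foo {
  return { name: input.name };
}

// A value that carries more than Foo declares.
const fooWithExtraFields = { name: "Bob", apiKey: "abc123" };

const clean = sanitizeFoo(fooWithExtraFields);

console.log(JSON.stringify(clean)); // prints {"name":"Bob"}
```

This plain copy works for data-only structs. It does not address the getter case mentioned above, where reading a field may run code.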

If we agree that protocols are where we want to go, then I think it makes sense to assume that structs are structurally typed.

I'm for trying to unify interfaces and structs and update the protocol proposal (it needs some clean up), but I think we should avoid coupling these issues. If we start with this initial approach for spreading it's easy to loosen the type checking and allow more cases later.

eladb commented 4 months ago

Fair enough