dfinity / agent-js

A collection of libraries and tools for building software around the Internet Computer, in JavaScript.
https://agent-js.icp.xyz
Apache License 2.0

More ergonomic runtime representation of variants #467

Closed cristianoc closed 3 years ago

cristianoc commented 3 years ago

Is your feature request related to a problem? Please describe.

The following variant declaration

  type abc = { #a; #b; #c };

has this runtime representation:

export type abc =
  { 'a' : null } |
  { 'b' : null } |
  { 'c' : null };

Here are patterns one could try to use for a function to check whether a value represents #a:

if(x.a) ... // wrong
if(x.a == null) ... // wrong
if(x.a === null) ... // right

There's no help that the TypeScript type checker can give to figure out the correct pattern.
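The three patterns above behave differently at runtime because a missing property reads as undefined while a present one holds null. The sketch below makes that concrete. The `Loose` type and the helper names are hypothetical, introduced only so the pitfalls compile and can be run; they are not part of the generated bindings.

```typescript
// Generated-style representation of `type abc = { #a; #b; #c }`.
type Abc = { a: null } | { b: null } | { c: null };

// Hypothetical loose view of the same values, used only to demonstrate the pitfalls.
type Loose = { a?: null; b?: null; c?: null };

// `if (x.a)`: null is falsy, so this is false even when x really is #a.
function isATruthy(x: Loose): boolean {
  return Boolean(x.a);
}

// `if (x.a == null)`: undefined == null is true, so this is true for #b and #c as well.
function isALooseEq(x: Loose): boolean {
  return x.a == null;
}

// `if (x.a === null)`: true only when the key is present and holds null.
function isAStrictEq(x: Loose): boolean {
  return x.a === null;
}

// The `in` operator checks for the key itself and also narrows the union type.
function isAIn(x: Abc): boolean {
  return "a" in x;
}
```

Only the last two checks distinguish #a from the other cases, and only the `in` form works directly on the union type without a looser annotation.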

Describe the solution you'd like

One possible idiomatic representation, at the cost of some reduced uniformity (variants with no arguments are represented differently from variants with arguments):

export type abc = "a" | "b" | "c";
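
With the string-literal union proposed above, a plain comparison or switch is enough, and the compiler can check that every case is handled. This is a minimal sketch of that proposal, not what the bindings generate today; `labelOf` is a hypothetical function name.

```typescript
// Proposed representation for a variant with only nullary constructors.
type Abc = "a" | "b" | "c";

function labelOf(x: Abc): string {
  switch (x) {
    case "a":
      return "first";
    case "b":
      return "second";
    case "c":
      return "third";
    default: {
      // If a new constructor is added to Abc and not handled above,
      // this assignment stops compiling.
      const unreachable: never = x;
      return unreachable;
    }
  }
}
```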

chenyan-dfinity commented 3 years ago

if (x.hasOwnProperty('a')) also works?

Yes, for nullary constructors, it's better to use plain string. For non-nullary constructors, we can also use kind, e.g.

type T = { kind: "a" } | { kind:"b", field_a: T1, field_b: T2 };

cristianoc commented 3 years ago

if (x.hasOwnProperty('a')) also works?

Yes that also works.

Yes, for nullary constructors, it's better to use plain string. For non-nullary constructors, we can also use kind, e.g.

type T = { kind: "a" } | { kind:"b", field_a: T1, field_b: T2 };

Indeed, this would support TypeScript's discriminated unions and checks for exhaustive pattern matching. The fields would probably need numeric identifiers such as field_0, field_1 or even _0, _1.
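
A sketch of how the kind-based representation would work with TypeScript's discriminated unions and exhaustiveness checking. `T1` and `T2` are placeholders for whatever the field types would be, and `assertNever` and `show` are hypothetical helpers.

```typescript
// Placeholder field types, assumed for illustration only.
type T1 = number;
type T2 = string;

type T = { kind: "a" } | { kind: "b"; field_a: T1; field_b: T2 };

// Accepts only `never`, so calling it with an unhandled variant is a compile error.
function assertNever(x: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(x)}`);
}

function show(x: T): string {
  switch (x.kind) {
    case "a":
      return "a";
    case "b":
      // Narrowed to the "b" member, so both fields are available here.
      return `b(${x.field_a}, ${x.field_b})`;
    default:
      return assertNever(x);
  }
}
```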

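From a quick scan of the candid spec, it seems that because of subtyping, in the case of mixed nullary+nonnullary constructors, nullary ones must still be represented as strings.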

I mention this because, in the mixed case, one would not be able to pattern match all cases in TS with a simple switch (x.kind), but would need a top-level typeof check first. So specialising nullary constructors to strings makes nullary-only types more ergonomic and mixed types less ergonomic.
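
To illustrate the mixed case described above: if nullary constructors were strings and the others were objects, matching would need a typeof check before looking at kind. The `Mixed` type and `showMixed` are hypothetical, and the `_0` field name follows the numeric-identifier idea floated earlier in this comment.

```typescript
// Hypothetical mixed representation: nullary #a as a string, #b with one argument as an object.
type Mixed = "a" | { kind: "b"; _0: number };

function showMixed(x: Mixed): string {
  // The top-level typeof check separates nullary constructors from the rest.
  if (typeof x === "string") {
    return x;
  }
  // Here x is narrowed to the object form, so `kind` and `_0` are accessible.
  return `${x.kind}(${x._0})`;
}
```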

chenyan-dfinity commented 3 years ago

The fields would probably need numeric identifiers such as field_0, field_1 or even _0, _1.

Depends on the Candid/Motoko type. #a: Nat is { kind: "a", _0_: bigint }; #a: { field_a: Nat } is { kind: "a", field_a: bigint }.
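
The two mappings just described could look as follows in TypeScript. This is a sketch of the proposal only. The extra nullary `b` constructor, the type names, and the reader functions are assumptions added for illustration.

```typescript
// #a : Nat  ->  payload stored under the positional name `_0_`.
type TupleStyle = { kind: "a"; _0_: bigint } | { kind: "b" };

// #a : { field_a : Nat }  ->  record fields are flattened next to `kind`.
type RecordStyle = { kind: "a"; field_a: bigint } | { kind: "b" };

// Returns the Nat carried by #a, or undefined for the nullary #b.
function readTuple(x: TupleStyle): bigint | undefined {
  return x.kind === "a" ? x._0_ : undefined;
}

function readRecord(x: RecordStyle): bigint | undefined {
  return x.kind === "a" ? x.field_a : undefined;
}
```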

in the case of mixed nullary+nonnullary constructors, then nullary ones must still be represented as strings.

I don't see why that's the case. For mixed types, we will use kind for nullary constructors to keep the representation consistent. We only use strings when all constructors are nullary.

In general, I think it should be the developers' choice how to represent candid types in the host language. We probably need a config language to guide the compiler in how to generate bindings. This problem becomes more prominent for Rust bindings, considering one candid type can map to multiple types in Rust with different lifetimes, mutability, references, etc.

cristianoc commented 3 years ago

Oh that's right. I was wondering about the case where one begins with nullary-only. Then an upgrade adds one non-nullary case. Then a new client version wants to take advantage of that new case. I guess it just means that the new client will use a different representation, and the client code needs to change a bit (guided by the type system) when operating on the bigger type.

cristianoc commented 3 years ago

Another type that is currently not ergonomic is the option type. One possibility for unboxing it is to represent opt t as null | __representation_of_t__ as long as t is not a nullable type, where a nullable type is one for which null is a possible representation of a value of type t. I guess that means that null and option ... are nullable types.
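
A minimal sketch of the unboxing idea, assuming the current JS representation of opt t is the array form `[] | [T]`. `BoxedOpt`, `unboxOpt`, and `boxOpt` are hypothetical names, not part of agent-js.

```typescript
// Assumed current (boxed) representation of `opt t`.
type BoxedOpt<T> = [] | [T];

// Boxed -> unboxed: the empty array becomes null, otherwise the single element is returned.
function unboxOpt<T>(o: BoxedOpt<T>): T | null {
  return o.length === 0 ? null : o[0];
}

// Unboxed -> boxed: only sound when null is not itself a valid value of T.
function boxOpt<T>(v: T | null): BoxedOpt<T> {
  return v === null ? [] : [v];
}
```

With the unboxed form, callers can write `value ?? fallback` or `value === null` instead of inspecting the array length.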

chenyan-dfinity commented 3 years ago

I was wondering about the case where one begins with nullary-only. Then an upgrade adds one non-nullary case. Then a new client version wants to take advantage of that new case.

Right. The promise of subtyping is that when one party upgrades the interface, the other party with the old interface can still decode the message. When the server side adds a non-nullary constructor, the JS side with the old interface will ignore the new field and consider the type to have nullary constructors only. So the representation on the JS side is unchanged. But when the JS side upgrades its interface, it's more work to change all the existing patterns. That's a con of specializing on nullary constructors.

represent opt t as null | __representation_of_t__ as long as t is not a nullable type.

Agreed. That's a good suggestion. We didn't explicitly define nullable types in the spec, but in Coq, it means null, opt t and reserved.
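
The restriction to non-nullable t matters because unboxing a nested option loses information. The sketch below shows this for opt opt t, again assuming the boxed form is `[] | [T]`; all names are hypothetical.

```typescript
// Assumed boxed representation of `opt t`.
type BoxedOpt<T> = [] | [T];

function unboxOpt<T>(o: BoxedOpt<T>): T | null {
  return o.length === 0 ? null : o[0];
}

// Unboxes both layers of `opt opt number` into `number | null`.
function unboxNested(o: BoxedOpt<BoxedOpt<number>>): number | null {
  const inner = unboxOpt(o);
  return inner === null ? null : unboxOpt(inner);
}

// "No value" and "a value that is itself empty" are distinct when boxed...
const outerNone: BoxedOpt<BoxedOpt<number>> = [];
const someNone: BoxedOpt<BoxedOpt<number>> = [[]];

// ...but both collapse to null once unboxed, so they can no longer be told apart.
const collapsed = unboxNested(outerNone) === unboxNested(someNone);
```

This is why an unboxed opt t is only unambiguous when null cannot already represent a value of t.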

cristianoc commented 3 years ago

Also, type variables would have to be considered nullable. So in <X> ... opt<X>, the option would be boxed, while specific instances such as opt<Nat> would be unboxed.

chenyan-dfinity commented 3 years ago

We don't have type variables in Candid. All types are monomorphized when translating to Candid. Things may change with https://github.com/dfinity/candid/issues/245, but most likely generic data will be syntactic sugar or an opaque blob.