Types

Top types

Types that don't have any proper supertype. These are

Bool
Int
Text
(List TMember)
(Table TKey TValue)
(Func Tout Tin1 Tin2 ...)
Void

All of these types can be used as types of variables and nested as type arguments of other types, except for Void - this can only be a type of an expression and it is never nested. (I removed Array = a subtype of List should be used instead, and Set - poor support and not that much important).

Subtypes

Top types have subtypes, similar to now. Every type is a subtype of exactly one top type. (List TMember) has subtypes (List TMember TLength) and a special internal type EmptyList - the only type that has multiple tops as supertypes. The only value with the EmptyList type is (list) and this type nor any type containing it cannot be used as a type of a variable.

Variable declarations and scopes

Each variable must be declared before it is referred to. There are three ways to declare a variable:

Annotation on the variable node in an assignment node. Such variable can be read from and written to anywhere in following siblings of the block or in descendants of those. Only expressions with a subtype of the declared type can be written to this variable. Expressions read from this variable are guaranteed to types that are subtypes of the declared type (but might be narrower, based on the context). The type in the declaration can be omitted, in which case the top supertype of the assigned type is used, unless it contains the EmptyList type. $i:<-0; is equivalent to $i:Int<-0;.
Loop variable of a for loop. Such variable is readonly and can only be read from in the body of the loop and in all descendants of the body and the type of the expression will be a subtype of the type of members of the passed list.
Arguments of a function. Such variable can only be read from and the type can either be annotated on the argument or on the variable that this function is assigned to.

Referring to un undeclared variable is an error. Redeclaring a visible variable is an error (no shadowing). Declaring a variable twice in two disjoint scopes is fine.

Variable type narrowing

On each assignment in addition to checking that the type of the assigned expresion is a subtype of the declared type, the assigned type is stored and used whenever the variable is read between this and the next assignment. Control flow analysis should be used to determine what "between" means. The union of all the assignments to a variable is calculated and defines the "global inferred type" of the variable. This will sometimes be equal to the declared type and sometimes be a proper subtype. The global inferred type of a variable can be used to determine the target type in the target language (for example using arrays instead of list, bounded int instead of bigint etc.). Conditions (If, While, ConditionalOp) can be used to further narrow the types of variables inside of them.

Type annotations

One, already described function of type annotations is declaring variables. The other is narrowing inferred type of expressions. Type on a expression overrides the inferred type of that expression, provided it is its subtype, otherwise is an error.

MichalMarsalek commented 10 months ago

Since functions are contravariant on their input types, one can create arbitrary function supertypes by using argument subtypes. Solutions?

Accept the fact that a top for a given type is not unique. It's already broken for EmptyList anyway.
Instead of functions, there' are macros. All macros have type Macro. Variables of type Macro cannot be reassigned and can only be defined at top level. When macro is used, we look inside of its definition to do type inference. This simplification might be useful in general.

jared-hughes commented 10 months ago

Is there no boolean type?

Some notes:

Desmos "functions" are actually macros. So f(L) = L[5]; f([1,2,3]) can compile without needing to infer L as a list in the definition of F. This ends up working well-enough in general.
It's reasonable to put special cases for EmptyList. But you might need to infer a "top" for every EmptyList variable. E.g. in Python, x = []; x.append(5). A strongly-typed language would need to infer the type of x ((List Int)) at the declaration time of x.

MichalMarsalek commented 10 months ago

Is there no boolean type?

I think it makes perfect sense for Bool to be a synonym to 0..1 instead of a distinct type in a structural type system, but I actually just forgot to mention it. It's probably not a good fit in the context of transpilation.

But you might need to infer a "top" for every EmptyList variable. E.g. in Python, x = []; x.append(5). A strongly-typed language would need to infer the type of x ((List Int)) at the declaration time of x.

That's why I disallowed variables of type EmptyList and omitting the declared type if the RHS includes the EmptyList type. Anything else seems like unnecessary complication.

Btw, in current Polygolf, there is already a special type for empty list. It's called (List void) not EmptyList and it's special cased to be a subtype of (List T) for any T.

MichalMarsalek commented 7 months ago

Superseded by #383 Variables & Types 2

polygolf-lang / polygolf

Variables & Types #354