microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0

Proposal: int types #4639

Closed ivogabe closed 6 years ago

ivogabe commented 9 years ago

This proposal introduces four new number types: int, uint, int<N> and uint<N>, where N is any positive integer literal. An int is signed; it can contain negative integers, positive integers and zero. A uint is an unsigned integer; it can contain positive integers and zero. int<N> and uint<N> are integers limited to N bits:

This proposal doesn't use type information for emit. That means this doesn't break compilation using isolatedModules.

Note: since JavaScript uses 64 bit floating point numbers, not all integers can be used at runtime. Declaring a variable with a type like uint<1000> doesn't mean it can actually store a number like Math.pow(2, 999). These types are here for completeness, and they can be used in type widening as used in the type inference of binary operators. Most languages only support integer types with 8, 16, 32 and 64 bits. Integers with other sizes are supported because they are used in the type inference algorithm.
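The 64 bit float limit is easy to demonstrate in today's JavaScript; integer arithmetic is only exact up to 2^53:

```typescript
// Doubles have a 53-bit significand, so exact integer precision ends at 2^53.
const limit = Math.pow(2, 53);
console.log(limit === limit + 1);                   // true: 2^53 + 1 rounds back to 2^53
console.log(Number.MAX_SAFE_INTEGER === limit - 1); // true
```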

Goals

Ideas were borrowed from:

int and uint are both subtypes of number. These are the simplest integer types; they are not limited to a certain number of bits. int<N> and uint<N> are subtypes of int and uint. These types have a limited size.

In languages like C, the result type of an operator is usually the same as the input type. That means you can get strange behavior, and possibly bugs:

uint8_t a = 200;
uint8_t b = a + a; // = 144 (400 % 256), not 400
float c = 1 / 2;   // = 0 (integer division), not 0.5

To mimic that behavior we would need to add type converters everywhere in the emitted code, and those heavily rely on type information. We don't want that, so instead we widen the return types. That means there is no runtime overhead. For example, adding two values of type uint<8> would result in a uint<9>. To assign that to a uint<8>, you can use a conversion function, like uint<8>(x). That function converts x to a uint<8>.

This design means that operations like x++, x += 10 and x /= 2 are not allowed, as they are not safe. Instead you can use x = uint<8>(x + 1) and x = uint<8>(x / 2).
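A minimal sketch of what such code would look like after emit, assuming the proposed uint<8>(x) compiles to a bit mask (the uint8 helper name here is hypothetical, standing in for the emitted expression):

```typescript
// Hypothetical emit of uint<8>(x): keep only the low 8 bits.
const uint8 = (x: number): number => (x | 0) & 255;

let counter: number = 255;
counter = uint8(counter + 1); // x = uint<8>(x + 1): wraps around to 0
console.log(counter);          // 0
console.log(uint8(200 + 200)); // 144: the same wrap-around as the C example
```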

int and uint allow more operations. Since they are not defined with a limited size, x++ and x += 10 are allowed. x-- is only allowed on an int, as a uint might become negative. Below is an overview of which operators are allowed on which number types.

Since emit isn't based on type information, integers can be used in generics. They can also be used in unions.

Assignability

If a type is not assignable to some other type, you can use a normal cast (<int<8>> x or x as int<8>), which has no impact at runtime, or a cast function (int<8>(x)).

Cast function

A cast function takes a number and converts it to the target integer type.

Syntax:

int<N>(x);
uint<N>(x);
int(x); // Alias of int<32>(x);
uint(x); // Alias of uint<32>(x);

Note: even though an int doesn't have a fixed size, we use the 32 bit cast function as its alias, since that emits simple, fast JavaScript.

Semantics:

This gives the same behavior as type casts in languages like C. If the operand is not a number, TS should give a compile time error. Emit should succeed (unless --noEmitOnError is set), and the operand should be converted to a number at runtime, using the same semantics as +x. undefined and null should also be converted the same way.
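The +x semantics are observable in today's JavaScript; a small check (the coerce wrapper exists only to keep the example type-checkable):

```typescript
// ToNumber coercion: the same first step the proposed cast functions take.
const coerce = (x: any): number => +x;

console.log(coerce("42"));          // 42
console.log(coerce(null));          // 0
console.log(coerce(undefined));     // NaN
console.log(coerce(undefined) | 0); // 0: the 32-bit emits map NaN to 0
```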

Implemented in TypeScript:

function castToInt(n: int, x: number) {
    x = +x;
    const m = Math.pow(2, n - 1); // int<n> ranges from -m to m - 1
    if (x > 0) {
        x = Math.floor(x);
        while (x >= m) x -= 2 * m;
    } else {
        x = Math.ceil(x);
        while (x < -m) x += 2 * m;
    }
    return x;
}
function castToUint(n: int, x: number) {
    x = +x;
    const m = Math.pow(2, n); // uint<n> ranges from 0 to m - 1
    if (x > 0) {
        x = Math.floor(x);
        while (x >= m) x -= m;
    } else {
        x = Math.ceil(x);
        while (x < 0) x += m;
    }
    return x;
}

These functions are not always needed in the generated code: when n <= 32, simpler bitwise emits suffice. The generated code:

If n === 32:

int<32>(x); // TS
x | 0; // JS

uint<32>(x); // TS
x >>> 0; // JS

If n < 32:

uint<n>(x); // TS
(x | 0) & b; // JS, where b = pow(2, n) - 1

int<n>(x); // TS
(x | 0) << a >> a; // JS, where a = 32 - n

// Examples:
int<8>(x);
(x | 0) << 24 >> 24

uint<8>(x);
(x | 0) & 255;

Question: can we make the emit of int<n>(x) better? The current emit isn't very elegant and performs poorly. Solved using http://blog.vjeux.com/2013/javascript/conversion-from-uint8-to-int8-x-24.html
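The emit patterns for n <= 32 run unchanged in today's JavaScript; the helper names below are hypothetical stand-ins for the proposed emitted expressions:

```typescript
// Hypothetical names for the proposed emits:
const int32 = (x: number): number => x | 0;
const uint32 = (x: number): number => x >>> 0;
const int8 = (x: number): number => (x | 0) << 24 >> 24;
const uint8 = (x: number): number => (x | 0) & 255;

console.log(int32(4294967295)); // -1
console.log(uint32(-1));        // 4294967295
console.log(int8(200));         // -56: 200 wraps past int<8>'s max of 127
console.log(uint8(-1));         // 255
```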

If n > 32:

uint<n>(x);
__castToUint(n, x);

int<n>(x);
__castToInt(n, x);

__castToUint and __castToInt are the functions above, emitted as helper functions.

You can only use these cast functions in call expressions:

int(x); // Ok
let intFunction = int; // Error
[1].map(int); // Error

We cannot solve that with helper functions: when using external modules, the int helper emitted in one file would not be identical to the helper emitted in another file, so int wouldn't be equal to int. Instead we should disallow this.

Type inference

Introducing integer types can break existing code, like this:

let x = 1;
x = 1.5;

But we do want to type 1 as an int:

let y: int = 1; // Don't show error that `number` is not assignable to `int`

There are several options:

  1. If the type of a variable is inferred from a numeric literal, always infer number.
  2. If the type of a variable is inferred from a numeric literal, always infer int. This would break some code, but we could accept that since it gives the most predictable behavior.
  3. Only infer an integer type if at least one part of the expression (variable, function call, ...) is explicitly typed as an integer.

In option two we infer to int, as let a = 0 would otherwise infer to uint<1>, which would mean that the variable can only contain 0 and 1.

Examples of option 3:

let a: int = 1;
let b = a + 3; // int
let c = 1 + 1; // number

function returnInt(): int {
    return 1;
}
let d = returnInt() - 1; // int
let e = int(1) * 1; // int
let f = <int> 1 + 1; // int
let g = <number> a + 1; // number

A literal will be inferred to the smallest integer type that can contain the number. Examples:

0 -> uint<1>
1 -> uint<1>
2 -> uint<2>
-1 -> int<1>
-2 -> int<2>
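The smallest containing type can be computed mechanically; a sketch of the rule (the function name and the returned type strings are illustrative, not part of the proposal):

```typescript
// Returns the name of the smallest proposed integer type containing n.
function smallestIntType(n: number): string {
    if (n >= 0) {
        // uint<N> holds 0 .. 2^N - 1, with N >= 1.
        let bits = 1;
        while (n > Math.pow(2, bits) - 1) bits++;
        return `uint<${bits}>`;
    }
    // int<N> holds -2^(N-1) .. 2^(N-1) - 1.
    let bits = 1;
    while (n < -Math.pow(2, bits - 1)) bits++;
    return `int<${bits}>`;
}

console.log(smallestIntType(0));  // uint<1>
console.log(smallestIntType(2));  // uint<2>
console.log(smallestIntType(-1)); // int<1>
console.log(smallestIntType(-2)); // int<2>
```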

Operators:

Operators should always infer to the smallest integer type that can contain all possible values.

(int, int<N>, uint or uint<N>) + number -> number
int + any integer type -> int
uint + (uint or uint<N>) -> uint
int<N> + int<M> -> int<max(N, M) + 1>
int<N> + uint<M> -> int<max(N, M + 1) + 1>
uint<N> + uint<M> -> uint<max(N, M) + 1>

- is almost the same as +, with two exceptions:
uint - (uint or uint<N>) -> int
uint<N> - uint<M> -> int<max(N, M) + 1>

int * (uint, int) -> int
uint * uint -> uint
int<N> * int<M> -> int<N + M>
(int<N + M - 1> is not big enough; consider -2^(N-1) * -2^(M-1) = 2^(N + M - 2))

/ always returns `number`

int % (int, uint, int<N>, uint<N>) -> int
uint % (int, uint, int<N>, uint<N>) -> uint
int<N> % (int, uint) -> int<N>
uint<N> % (int, uint) -> uint<N>
int<N> % int<M> -> int<min(N, M)>
int<N> % uint<M> -> int<min(N, M+1)>
uint<N> % int<M> -> uint<max(min(N, M - 1), 1)>
uint<N> % uint<M> -> uint<min(N, M)>

int & (int or int<N>) -> int<32>
int<N> & int<M> -> int<min(N, M, 32)>
uint<N> follows the rules of int<N+1>
(uint & uint !== uint, for example 4294967295 & 4294967295 === -1)

| and ^ have the same behavior as &

~int -> int
~uint -> int
~int<N> -> int<N>
~uint<N> -> int<N + 1>

<< always returns an int<32>

(number, uint) >> any -> int<32>
int >> any -> int<32>
int<N> >> any -> int<min(N, 32)>
uint<N> >> any -> uint<min(N, 32)>

(number, uint or int) >>> any -> uint<32>
int<N> >>> any -> uint<32> (consider -1 (int<1>): -1 >>> 0 === 4294967295, the max value of uint<32>)
uint<N> >>> any -> uint<min(N, 32)>
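Several of the corner cases above can be checked numerically in today's JavaScript:

```typescript
// uint<32> & uint<32> can go negative, so & cannot stay uint:
console.log(4294967295 & 4294967295); // -1

// >>> always produces an unsigned 32-bit result:
console.log(-1 >>> 0); // 4294967295

// int<N> * int<M> really needs N + M bits: for N = M = 8,
// -2^7 * -2^7 = 2^14 = 16384, one more than int<15>'s max of 16383.
console.log(-128 * -128); // 16384
```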

Certain assignment operators are not supported on integer types. In short: Let Op be an operator. x Op= y is allowed iff x = x Op y is allowed. That means that the following operators are not supported and usage will give an error:

int: x /= y

uint: x /= y, x--, --x

int<N>, uint<N>: x /= y, x++, ++x, x--, --x, x += y, x -= y, x *= y

int<N>
if N < 32, then: x <<= y
if N <= 32, then: x >>>= y

uint<N>
if N < 32, then: x <<= y, x >>>= y

Breaking changes

Type inference can change, depending on how it is implemented. Also, changing existing definitions can break things:

let arr = [];
let length = arr.length;
length = 1.5;

Changing the definition of the length property to a uint would break this code, though I don't think this pattern will be used a lot. Such problems can easily be fixed using a type annotation (in this case let length: number = arr.length;).

Questions

  1. Should integer types be namespaced under number? Thus, let x: int vs. let x: number.int
  2. Which syntax should be used for sized integers? int<8>/number.int<8> or int8/number.int8?
  3. Can we make the emit of int<n>(x) (n < 32) better? The current emit looks bad and performs poorly. Solved using http://blog.vjeux.com/2013/javascript/conversion-from-uint8-to-int8-x-24.html
  4. How should type inference be handled? See the discussion above, in the paragraph called 'Type inference'.
  5. Can undefined and null be assigned to an integer type? The conversion functions convert them to 0. Allowing undefined would mean that int + int could be NaN (if one operand is undefined), while NaN is not assignable to an int. I'd say that undefined and null shouldn't be assignable to an integer type, and that declaring a variable (also class property) with an integer type and without an initializer would be an error.

All feedback is welcome. If you're responding to one of these questions, please include the number of the question. If this proposal is accepted, I can try to create a PR for it.

benatkin commented 6 years ago

GraphQL supports signed 32 bit integers and doesn't address other integer types. IMO that's a great starting point. http://graphql.org/learn/schema/

tarcieri commented 6 years ago

@benatkin if TC39 BigInt ever ships, there's a pretty big drawback to defining integer types based on number: TC39 BigInt is a separate type at the VM level, and one for which an exception is raised if mixed number/BigInt arithmetic is attempted.

If people were to add int32 types to their type declarations for untyped JavaScript code, where those types actually constrain number, and TypeScript later switched to TC39 BigInt as the underlying JavaScript backing type for int32, then all code with int32 in its type declarations would be broken.

The TC39 BigInt proposal is now stage 3 and I think stands a pretty decent chance of eventually shipping, so I think it's probably best to "reserve" any potential integer types for ones which are actually integers at the VM level.

ivogabe commented 6 years ago

BigInt is scheduled for TS 3.0 (#15096) 🎉! Adding different syntax for "double based integers" as proposed in this issue would be confusing, so I think we can close this. See also #195.