Open Naddiseo opened 9 years ago
TypeHint would likely be similar to extends on class. A TypeHint, at the base, would be a constructor / class, such as String, Number, etc. Although "foo" instanceof String returns false, for type hinting we would want them to match (autoboxing).
If the engine enforces the types, I'm thinking along the lines of:
function add(a Number, b Number) Number {
return a + b;
}
// desugars to
function add(a, b) {
a instanceof Number || console.warn("arg 0 not a Number");
b instanceof Number || console.warn("arg 1 not a Number");
var _return = a + b;
_return instanceof Number || console.warn("return not a Number");
return _return;
}
It probably shouldn't halt execution, and should log a warning instead. If the engine doesn't enforce the types, then a linter could, but it may not catch everything.
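To see why the autoboxing caveat matters for the desugaring above, here is a minimal sketch of the desugared add in today's JavaScript. The warn helper and the warnings array are hypothetical stand-ins for console.warn, used only so the outcome can be inspected; they are not part of the proposal.

```javascript
// Hypothetical stand-in for console.warn: collects messages instead of printing.
const warnings = [];
function warn(message) {
  warnings.push(message);
}

// Hand-written version of the desugared `add`, using plain instanceof checks.
function add(a, b) {
  a instanceof Number || warn("arg 0 not a Number");
  b instanceof Number || warn("arg 1 not a Number");
  const result = a + b;
  result instanceof Number || warn("return not a Number");
  return result;
}

const sum = add(1, 2);
console.log(sum, warnings);
```

Execution is never halted, but all three checks warn, because primitive numbers are not instances of Number. A real desugaring would likely need a primitive-aware check.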
We should probably track these issues (other than what is a TypeHint) separately.
The simplest route would be for TypeHint to be a single identifier, then have the static semantics limit it to one of the global primitive objects, but let's explore all the alternatives before that.
You've suggested in other places that the TypeHint be similar to, if not the same as, LeftHandSideExpression, which is what is used for the class extends syntax. This would allow bizarre hints like:
function f1() {foo: Function}.foo {}
function f2() new Function {} // what would this even mean?
function f3() 1 {} // syntactically valid, but semantics would probably disallow
function f4() function () function () {} {} {} // ad infinitum
var fn = arg function () class {} {} => 1;
But, on the other hand, it does allow some useful expressions:
// Inline classes: can allow the linter/checker to know the interface that it should be checking
function f1() class TypeInterface { /* things */ } {}
// Use of a special "function" to notate optional, union types, and casting
import {Optional, Union, CastTo, List} from "@types";
function f2() Optional(Number) {}
function f3() Union(Number, String) {}
var a Number = CastTo(Number, "2");
function f4(...args Number) {}
function f5(arrayType List[Number]) {}
// Again, specifying an interface of the return type.
function f5() { interface: attributes } {} // or is this a syntax error? extends seems to allow it
Allowing that function call with special "functions" would resolve most of the issues I've raised, assuming that the linter/checker knows how to interpret them, especially generics (#6) and nullable/optional (#13). Using LeftHandSideExpression can also potentially resolve rest/spread (#5), and casting (#15).
It may also be possible to resolve aliases (#7), and any/void (#14):
import { Callback, Any } from "@types";
// function Callback(RetType Function, ...args List[Any]) Function {}
var ToNumberCallbackType = Callback(Number, Any);
function getToNumber(type String) ToNumberCallbackType {
return {
"string": Number,
"A": function (arg A) Number { return arg.toNumber(); }
}[type];
}
This library-based approach is also used by mypy for Python's type hinting, and is geared more towards static / compile-time checking/linting than towards runtime checking. (Also tracked in #9, and mentioned in #15).
Regarding the library itself: if it were solely a compile time thing, the library import and annotations can be stripped out of the resulting file after transpilation.
How does this work exactly?
function f2() Optional(Number) {}
function f3() Union(Number, String) {}
Given that checking the type would likely be done with an instanceof operator. What does Optional return? What does Union return? How does that get fed into instanceof?
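One possible answer to the question above, sketched under assumptions: Optional and Union could return objects that define Symbol.hasInstance, which lets them sit on the right-hand side of instanceof. The helper names and their behavior here are hypothetical and not part of the proposal or of any existing "@types" library.

```javascript
// Hypothetical helpers, for illustration only.

// Primitive-aware match, so "foo" matches String even though it is not boxed.
function matches(value, type) {
  if (type === String) return typeof value === "string";
  if (type === Number) return typeof value === "number";
  if (type === Boolean) return typeof value === "boolean";
  return value instanceof type; // also works for Union/Optional results
}

// Returns an object that instanceof accepts on its right-hand side.
function Union(...types) {
  return {
    [Symbol.hasInstance](value) {
      return types.some((type) => matches(value, type));
    },
  };
}

function Optional(type) {
  return {
    [Symbol.hasInstance](value) {
      return value === undefined || matches(value, type);
    },
  };
}

console.log("2" instanceof Union(Number, String)); // true
console.log(undefined instanceof Optional(Number)); // true
console.log("2" instanceof Optional(Number)); // false
```

This keeps the desugaring uniform (always instanceof), at the cost of building a checker object at runtime.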
Allowing any syntax does allow for some invalid code. But then again, so does extends.
The question is: Should "anything" be allowed, or should it be restricted to simple variable references?
There is one use case where "any expression" could be handy:
import types from "./types";
function doSomething() types.foo {
}
// or
function doSomething() types["foo"] {
}
// or
function doSomething() types.get("foo") {
}
Essentially a case where the type isn't in scope. But then again, could it be restricted to foo and types.foo (allow dot syntax)?
And with any syntax, where does it end? You could do:
function doSomething() class Foo {
} {
// ... function code
}
(which is completely useless)
Given that import creates references in the module scope, perhaps it is reasonable that types.foo isn't allowed. After all you could do something like:
import * from "./types" // imports everything, including "foo"
function doSomething() foo {
}
Lots of pros and cons. Perhaps starting out very restrictive is a good way to start out and then go from there.
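As a rough illustration of the restrictive option discussed above (allow only foo and types.foo), a linter could accept a hint only when its source text is a plain identifier or a dotted path. The function name and the ASCII-only regular expression are assumptions made for this sketch; real identifiers can also contain Unicode characters.

```javascript
// Hypothetical lint rule: a hint must be `foo` or a dotted path like `types.foo`.
// Assumption: ASCII identifiers only, to keep the pattern short.
const SIMPLE_REFERENCE = /^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/;

function isSimpleReference(hintSource) {
  return SIMPLE_REFERENCE.test(hintSource);
}

console.log(isSimpleReference("foo")); // true
console.log(isSimpleReference("types.foo")); // true
console.log(isSimpleReference('types["foo"]')); // false
console.log(isSimpleReference('types.get("foo")')); // false
```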
> How does this work exactly?
> function f2() Optional(Number) {}
> function f3() Union(Number, String) {}
They would work at compile time, and likely not exist in the resulting file. I would imagine a babel transformer that walks the tree:
var transformer = {
walkTypeHint(ast) {
// Transform this proposal, into the estree types.
if (t.isFunctionCall(ast) && ast.name.name == "Optional") { // union[undefined, Type]
return t.UnionTypeAnnotation([t.TypeAnnotation(undefined), t.TypeAnnotation(ast.params[0])]);
}
else if (t.isFunctionCall(ast) && ast.name.name == "Union") {
return t.UnionTypeAnnotation.apply(null, ast.params);
}
}
};
> Given that checking the type would likely be done with an instanceof operator. What does Optional return? What does Union return? How does that get fed into instanceof?
Union doesn't have to return anything if it's removed at compile time.
I don't think you can use instanceof for anything but simplistic and naive type checking. And using instanceof runs into issues when trying to typecheck a map/dict/object:
function takesADict(arg Object) { assert(arg instanceof Object); }
takesADict({a: 1}); // passes, and it should
takesADict(new Map()); // passes, should it?
takesADict(new Function()); // passes, should it?
takesADict(function() {}); // passes, but it should not.
// there's also the weird checks with String vs new String vs "".
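The cases above can be checked directly in current JavaScript. This small sketch only evaluates `instanceof Object` for a handful of sample values; the labels are made up for readability.

```javascript
// Sample values, each paired with a descriptive (made-up) label.
const samples = [
  ["object literal", { a: 1 }],
  ["Map", new Map()],
  ["function", function () {}],
  ["boxed string", new String("x")],
  ["primitive string", "x"],
  ["null-prototype object", Object.create(null)],
];

const report = {};
for (const [label, value] of samples) {
  report[label] = value instanceof Object;
}
console.log(report);
```

Everything except the primitive string and the null-prototype object reports true, which shows both sides of the problem: instanceof Object accepts functions and maps, yet rejects a dictionary created with Object.create(null).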
Alternately, Union could be implemented something like:
function Union(arg1, arg2) {
return function checker(input) {
assert((input instanceof arg1) || (input instanceof arg2));
};
}
// --- other files ---
function Foo(arg1 Union(A, B)) { } // Gets transpiled:
function Foo(arg1) {
Union(A,B)(arg1); // does the check here
}
However, this approach will incur runtime overhead.
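Part of that overhead comes from rebuilding the checker on every call. A minimal sketch of one way to reduce it, assuming the transpiler is allowed to hoist the checker to module level; the variadic Union, the thrown TypeError, and the name checkAOrB are illustrative choices, not something the thread specifies.

```javascript
// Variadic variant of the Union checker; throws instead of calling assert.
function Union(...types) {
  return function checker(input) {
    if (!types.some((type) => input instanceof type)) {
      throw new TypeError("value does not match any type in the union");
    }
    return input;
  };
}

class A {}
class B {}

// Assumption: the transpiler hoists the checker so it is built once per module,
// rather than emitting `Union(A, B)(arg1)` inside the function body.
const checkAOrB = Union(A, B);

function Foo(arg1) {
  checkAOrB(arg1); // only the membership test runs on each call
  return arg1;
}

Foo(new A()); // passes
```

The per-call cost does not disappear, but it shrinks to the instanceof tests themselves.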
> Allowing any syntax does allow for some invalid code. But then again, so does extends.
Semantically invalid, yes. For extends, the semantic check is for "function or null". I think we can probably narrow down what is semantically valid, but we're still at the syntax phase, so we can leave the semantic checking until later.
> The question is: Should "anything" be allowed, or should it be restricted to simple variable references?
Yes, another way to phrase the question is: solve this at a syntactic level, or at a semantic level.
If it's solved at a syntactic level, the resulting hinting system will probably be less expressive and less future-friendly. Solving it at a semantic level means that, if there are changes down the line, we don't have to change the syntax to accommodate them.
> There is one use case where "any expression" could be handy:
> import types from "./types";
> function doSomething() types.foo {
> }
> // or
> function doSomething() types["foo"] {
> }
> // or
> function doSomething() types.get("foo") {
> }
> Essentially a case where the type isn't on the scope. But then again, could it be restricted to foo and types.foo (allow dot syntax).
This is another situation where I think it's better to solve on the semantics side than the syntactic side.
> And with any syntax, where does it end? You could do:
> function doSomething() class Foo {
> } {
> // ... function code
> }
> (which is completely useless)
I think that depends on how you interpret the hint. It could be defined as:
let interfaceFoo = class Foo { method() {} };
// The arg/return needs to implement the `method` method
function doSomething(arg interfaceFoo) interfaceFoo { return arg; }
doSomething({}); // fails
doSomething(function() {}); // fails
doSomething({ method() {} }); // passes, has a `method` method
doSomething(class { method() {} }); // passes.
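A minimal sketch of how a checker might implement that structural interpretation at runtime. The function name implementsInterface is hypothetical, and treating a class value as satisfying the interface through its prototype is an assumption made so that the last example above passes.

```javascript
// Hypothetical structural check: does `value` provide every method that the
// interface class declares on its prototype?
function implementsInterface(value, interfaceClass) {
  const isObjectLike =
    value !== null && (typeof value === "object" || typeof value === "function");
  if (!isObjectLike) return false;

  const required = Object.getOwnPropertyNames(interfaceClass.prototype)
    .filter((name) => name !== "constructor");

  return required.every((name) => {
    if (typeof value[name] === "function") return true;
    // Assumption: a class passed as a value satisfies the interface via its prototype.
    const proto = typeof value === "function" ? value.prototype : undefined;
    return proto != null && typeof proto[name] === "function";
  });
}

const interfaceFoo = class Foo { method() {} };

console.log(implementsInterface({}, interfaceFoo)); // false
console.log(implementsInterface(function () {}, interfaceFoo)); // false
console.log(implementsInterface({ method() {} }, interfaceFoo)); // true
console.log(implementsInterface(class { method() {} }, interfaceFoo)); // true
```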
> Given that import creates references in the module scope, perhaps it is reasonable that types.foo isn't allowed. After all you could do something like:
> import * from "./types" // imports everything, including "foo"
> function doSomething() foo {
> }
Okay, that case is where it gets a little tricky. You're forced to do runtime checking if you don't want to deal with the import at compile time. Still, that's semantics rather than syntax.
> Lots of pros and cons. Perhaps starting out very restrictive is a good way to start out and then go from there.
Okay, if we assume a single identifier, what are the pros/cons of that?
> > How does this work exactly?
> > function f2() Optional(Number) {}
> > function f3() Union(Number, String) {}
> They would work at compile time, and likely not exist in the resulting file. I would imagine a babel transformer that walks the tree:
> var transformer = {
> walkTypeHint(ast) {
> // Transform this proposal, into the estree types.
> if (t.isFunctionCall(ast) && ast.name.name == "Optional") { // union[undefined, Type]
> return t.UnionTypeAnnotation([t.TypeAnnotation(undefined), t.TypeAnnotation(ast.params[0])]);
> }
> else if (t.isFunctionCall(ast) && ast.name.name == "Union") {
> return t.UnionTypeAnnotation.apply(null, ast.params);
> }
> }
> };
> > Given that checking the type would likely be done with an instanceof operator. What does Optional return? What does Union return? How does that get fed into instanceof?
> Union doesn't have to return anything if it's removed at compile time.
That all seems very messy.
> I don't think you can use instanceof for anything but simplistic and naive type checking. And using instanceof runs into issue when trying to typecheck a map/dict/object:
> function takesADict(arg Object) { assert(arg instanceof Object); }
> takesADict({a: 1}); // passes, and it should
> takesADict(new Map()); // passes, should it?
> takesADict(new Function()); // passes, should it?
> takesADict(function() {}); // passes, but it should not.
> // there's also the weird checks with String vs new String vs "".
Why not? Object is a generic. It's a quirk of the language, sure, but type systems typically respect inheritance.
The biggest issue is String vs "". Technically "" is not a String, but it is when autoboxed. As in, when you do "foo".substr(1), what is actually happening behind the scenes is (new String("foo")).substr(1).toString().
The runtime checking would likely look something like:
function isType(value, type) {
switch (type) {
case String:
return typeof value === "string";
case Number:
return typeof value === "number";
case Boolean:
return typeof value === "boolean";
}
return value instanceof type;
}
Although this could probably be optimized by the JIT, meaning that if you know the type is String (not a shadowed String), you can use typeof instead of instanceof. V8 could probably do an optimization like that.
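The same idea can be sketched in user code: decide once, when the hint is known, which kind of check to use, instead of running the switch on every call. The name compileCheck is hypothetical, and an engine would do this internally rather than through a function like this.

```javascript
// Hypothetical "compile once" step: pick a specialized predicate for a type.
function compileCheck(type) {
  switch (type) {
    case String:
      return (value) => typeof value === "string";
    case Number:
      return (value) => typeof value === "number";
    case Boolean:
      return (value) => typeof value === "boolean";
    default:
      return (value) => value instanceof type;
  }
}

const isString = compileCheck(String); // chosen once, reused on every call
const isDate = compileCheck(Date);

console.log(isString("foo")); // true
console.log(isString(new String("foo"))); // false: boxed strings are objects
console.log(isDate(new Date())); // true
```

Note that this mirrors isType above, so boxed values such as new String("foo") do not match String.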
> Alternately, Union could be implemented something like:
> function Union(arg1, arg2) {
> return function checker(input) {
> assert((input instanceof arg1) || (input instanceof arg2));
> };
> }
> // --- other files ---
> function Foo(arg1 Union(A, B)) { } // Gets transpiled:
> function Foo(arg1) {
> Union(A,B)(arg1); // does the check here
> }
> However, this approach will incur runtime overhead.
And how do you know the returned value isn't the type? String is a function too. Is it supposed to run that as well?
> > Lots of pros and cons. Perhaps starting out very restrictive is a good way to start out and then go from there.
> Okay, if we assume a single identifier, what are the pros/cons of that?
- Single Identifier
  - Pro: Easy to lint. Don't have to run any code.
  - Pro: Easy to optimize at runtime. If you know you're checking for String while parsing, you can opt for a typeof vs an instanceof rather than a function that checks every time.
  - Pro: Very clear and easy to understand.
  - Con: No dynamic types?
- Any expression
  - Con: Not easy to lint. Have to run code at runtime.
  - Con: Not easy to optimize at runtime. Has to be run.
  - Con: Has the possibility of being confusing if abused.
  - Pro: Dynamic types?
It seems to me a single identifier is the way to go. The only thing between the two of them I can think of is dynamic types. Which brings up another question: Is this legal?
var DynamicType = RandomType();
function doSomething(crazy DynamicType) {
}
Even with a single identifier you can do something "dynamic". It's just a round-about way of doing it. Is there any way to only allow class and function declarations? Would it be worthwhile to do so?
A use case we've yet to address is forward referencing:
class A {
getB() B { // B doesn't exist here due to TDZ
return new B();
}
}
class B {
getA() A { return new A(); }
}
Taking hints from #17, and going with a restrictive first draft, TypeHint should just be:
TypeHint:
StringLiteral
Restricting the type hint to just a string literal has many pros:
Cons:
eval?)
Minor Cons:
I also think that using a single string literal for the hint has the biggest chance of getting the spec past Stage 0 since it's least controversial.
So, just so I understand what you're saying, you mean:
class A {
getB() "B" {
return new B();
}
}
class B {
getA() "A" { return new A(); }
}
Is that correct? I'm not sure many people would agree with that. It looks a bit unusual.
I'm not sure forward referencing is an issue in this case though. At least if you think of the following as being de-sugared to:
class A {
getB() {
let _return = new B();
_return instanceof B || console.warn("return value is not B");
return _return;
}
}
class B {
getA() {
let _return = new A();
_return instanceof A || console.warn("return value is not A");
return _return;
}
}
True, A and B are undefined at some point, but they aren't referenced until the methods are used.
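This can be demonstrated in current JavaScript with a hand-desugared version of the two classes. The earlyError variable is only there to capture what happens when B is touched before its declaration; it is not part of the proposal.

```javascript
class A {
  getB() {
    // B is only looked up here, when getB() is actually called.
    const result = new B();
    result instanceof B || console.warn("return value is not B");
    return result;
  }
}

// Touching B before its declaration has been evaluated hits the TDZ.
let earlyError = null;
try {
  void B;
} catch (error) {
  earlyError = error;
}

class B {
  getA() {
    const result = new A();
    result instanceof A || console.warn("return value is not A");
    return result;
  }
}

const b = new A().getB(); // fine: B is initialized by now
console.log(earlyError instanceof ReferenceError, b.getA() instanceof A); // true true
```

So the forward reference is only a problem for checks that resolve the hint when the class is defined, not for checks that run inside the method body.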
I've also been tinkering with this idea:
Object.isA = function(value) {
return value instanceof this;
};
String.isA = function(value) {
return typeof value === "string";
};
Although without being on the prototype that doesn't really work on new classes. The general idea is that the type has an isA method to do the comparison.
Another way is doing it the other way around:
Object.prototype.isA = function(constructor) {
return this instanceof constructor;
};
String.prototype.isA = function(constructor) {
return constructor === String;
};
Number.prototype.isA = function(constructor) {
return constructor === Number;
};
Boolean.prototype.isA = function(constructor) {
return constructor === Boolean;
};
Then you can do something like this as the de-sugar:
class A {
getB() {
let _return = new B();
_return.isA(B) || console.warn("return value is not B");
return _return;
}
}
class B {
getA() {
let _return = new A();
_return.isA(A) || console.warn("return value is not A");
return _return;
}
}
Consider what is currently in use (taken from JSDoc website):
/**
* Returns the sum of a and b
* @param {Number} a
* @param {Number} b
* @param {Boolean} retArr If set to true, the function will return an array
* @returns {Number|Array} Sum of a and b or an array that contains a, b and the sum of a and b.
*/
function sum(a, b, retArr) {
if (retArr) {
return [a, b, a + b];
}
return a + b;
}
// vs
/** Returns the sum of a and b */
function sum(a "Number", b "Number", retArr "Boolean") "Number|Array" {
if (retArr) {
return [a, b, a + b];
}
return a + b;
}
Using an inline TypeHint is subjectively better, objectively easier to type, and more "attached" to the thing it hints. What I'm suggesting is that the JSDoc type have a place inline. However, saying that what is inside the hint must be JSDoc is too opinionated.
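Since the contents of a string hint would be left to the tooling, the following is only one possible interpretation a checker might choose: split the string on "|" and look each name up in a table. The KNOWN_TYPES table and matchesHint function are hypothetical and not defined by the proposal or by JSDoc.

```javascript
// Hypothetical table of type names a checker might understand.
const KNOWN_TYPES = new Map([
  ["Number", (value) => typeof value === "number"],
  ["String", (value) => typeof value === "string"],
  ["Boolean", (value) => typeof value === "boolean"],
  ["Array", (value) => Array.isArray(value)],
]);

// Returns true when the value matches any alternative in a "A|B" style hint.
function matchesHint(value, hint) {
  return hint.split("|").some((part) => {
    const name = part.trim();
    const predicate = KNOWN_TYPES.get(name);
    if (!predicate) throw new Error(`Unknown type name in hint: ${name}`);
    return predicate(value);
  });
}

// Hand-written equivalent of the hinted `sum` above, warning on a mismatch.
function sum(a, b, retArr) {
  const result = retArr ? [a, b, a + b] : a + b;
  matchesHint(result, "Number|Array") || console.warn("return is not Number|Array");
  return result;
}

console.log(sum(1, 2, false), sum(1, 2, true));
```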
> I've been thinking about something like this:
> Object.prototype.isA = function(constructor) {
> return this instanceof constructor;
> };
> String.prototype.isA = function(constructor) {
> return constructor === String;
> };
> Number.prototype.isA = function(constructor) {
> return constructor === Number;
> };
> Boolean.prototype.isA = function(constructor) {
> return constructor === Boolean;
> };
The main issue I see with something like that is that it forces the actual checking to be done with instanceof or typeof. Given that the current three major implementations of type checking in ES (Closure Compiler, TypeScript, Flow) all use something more sophisticated, I think it would be a mistake to push that kind of limitation in the spec. Also, that's type checking, not type hinting.
My thoughts:
- instanceof/typeof checks belong in libraries, such as typecastjs, because they are weak/shallow type checking.
My take away:
@Naddiseo you've given me a lot to think about. You're right, we should focus on type hinting and leave type checking to the implementer, whether that be a library or otherwise.
As far as forward references go, I think there are a number of ways to handle that, but that really falls under the responsibility of the type checker.
I've been thinking about flow, and I think we should follow a subset of what they are already doing. These are my thoughts:
- number, string, boolean, and void as base types.
- No mixed type or union types. A variable should only ever be one type if it has a type annotation. In a string | number situation, if you could take "5" or 5, a conversion should be done at the call-site.
- "foo" + undefined = "fooundefined"
  5 + undefined = NaN
  "foo" + null = "foonull"
  5 + null = 5
- Array can contain any type, and that's hard to enforce. If typed arrays are added at some point, they are likely to be something like Uint32Array. So it's better to leave that alone for now.
I think we're pretty close to that already. Just have to define TypeHint - I'm thinking a single reference, like a variable name.
Okay, let's start out with something restrictive like that as a first draft, and if the feedback we get strongly suggests developers want something more expressive, we can revisit the issue.
I propose that TypeHint be defined as follows:
TypeHint[Yield]:
[~Yield] IdentifierReference[~Yield]
number
string
boolean
void
I'll need someone with a bit better understanding of the syntax to verify the yield parameter is doing what I think it means.
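To make the intent of that grammar concrete, here is a rough sketch of how a checker might interpret a hint: the four lowercase base types map to primitive checks, and anything else is treated as an identifier looked up in some scope. The resolveTypeHint function and the plain-object scope are assumptions for illustration; the proposal itself leaves checking to the implementer.

```javascript
// Checks for the four base types named in the proposed grammar.
const BASE_TYPES = new Map([
  ["number", (value) => typeof value === "number"],
  ["string", (value) => typeof value === "string"],
  ["boolean", (value) => typeof value === "boolean"],
  ["void", (value) => value === undefined],
]);

// Hypothetical resolver: turns a hint name into a predicate.
// `scope` is a plain object standing in for the bindings visible to the hint.
function resolveTypeHint(hint, scope) {
  if (BASE_TYPES.has(hint)) return BASE_TYPES.get(hint);
  if (!Object.prototype.hasOwnProperty.call(scope, hint)) {
    throw new ReferenceError(`${hint} is not defined`);
  }
  const constructor = scope[hint];
  return (value) => value instanceof constructor;
}

class Point {}
const scope = { Point };

console.log(resolveTypeHint("number", scope)(5)); // true
console.log(resolveTypeHint("void", scope)(undefined)); // true
console.log(resolveTypeHint("Point", scope)(new Point())); // true
```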
We just need to resolve #12 then we'll have the syntax completed, at which point I think we can start getting more feedback.
Some questions that should be thought through to make the TypeHint syntax element: What can a type hint be made of? A single identifier? What about generics? What about aliases? Would be useful for specifying the full type of a callback.
Something else that should be mentioned in the spec is that it's only for hinting, not for enforcing; similar to how Python annotations are spec'ed.