usethesource / rascal

The implementation of the Rascal meta-programming language (including interpreter, type checker, parser generator, compiler and JVM based run-time system)
http://www.rascal-mpl.org
Other
406 stars 78 forks source link

Added warning for user trying to reify a literal type parameter `#&T` #1463

Open stefanbohne opened 3 years ago

stefanbohne commented 3 years ago

Describe the bug

When using a type parameter as a reified type (#&T) it appears to be void instead of the actual type.

To Reproduce

Rascal Version: 0.18.2, see |release-notes://0.18.2|
rascal>bool f1(type[&T] _, value x) { return &T _ := x; }
bool (type[&T], value): function(|prompt:///|(0,50,<1,0>,<1,50>))
rascal>f1(#int, 1)
bool: true
rascal>f1(#int, "1")
bool: false
rascal>bool f2(&T x) { return f1(#&T, x); }
bool (&T): function(|prompt:///|(0,36,<1,0>,<1,36>))
rascal>f2(1);
|prompt:///|(46,1,<1,46>,<1,47>): Expected void, but got value
Advice: |http://tutor.rascal-mpl.org/Errors/Static/UnexpectedType/UnexpectedType.html|
ok

Expected behavior I would expect &T to be replaced by the actual type in all usages. So the last line above should return bool: true.

Desktop (please complete the following information):

jurgenvinju commented 3 years ago

Thanks for mentioning this @stefanbohne ; it needs clarification.

the # operator is a compile-time operator, similar to myJavaClass.class class literals in Java, so reifing a type parameter actually produces a representation of the open parameter:

rascal>#&T
type[void]: type(
  parameter(
    "T",
    value()),
  ())

We chose to bind the parameter of type[...] with void to make sure a reified parameter does not give any static guarantees (after all it may become void at run-time). This explains your error message "Expected void", however I'm not so sure about the "but got value" part. I'll look into that to see if we can clear up the error message.

stefanbohne commented 3 years ago

If #&T is always the same as #void, why do you even allow this syntax? It's just confusing.

jurgenvinju commented 3 years ago

It sure is confusing! I think the static-checker could provide a warning if people type this in. For completeness sake, however reified type parameters should exists next to #void. It is a meaningful representation of higher-order types used in the implementation of Rascal itself and other tools like XML and JSON (de)serializers which use type parameters, while it does not have a common use case for most programmer.

Just to be clear, #&T produces a quite different value from #void. What is the same, is that the type parameter of the type[&U] constructor &U is bound to void both for #void and for #&T.

jurgenvinju commented 3 years ago

Idea: When a user literally types #&T produce a warning: "Reifying a type parameter literal produces void binding"

For lack of a better sentence. @PaulKlint do you have better words?

stefanbohne commented 3 years ago

So, is there way to make the above function f2 work? I can only find typeOf(x) (instead of #&T), but that returns a Symbol instead of type[&T].

jurgenvinju commented 3 years ago

You mean this right:

Indeed, this will not work. You can not travel back in time and recover the static type at compile-time from the dynamic type at runtime of the x parameter of f2. BTW, this would require inter-procedural (=unscalable) static analysis at the least or severe restrictions in the type system (i.e. to type preserving functions only).

Also Rascal does not have rank-2 higher-order types to reason about all possible instances of &T at run-time.

However, pattern matching is always a way to compare dynamic types to static types, so you might say that the pattern matching feature implements exactly what you want. It is not a first-class functionality which you can encapsulate and wrap in a higher-order function, but you can use it everywhere to "narrow" down to the type you need:

value x = ... ;
  if (MyType _ := x) {
    ... 
  }

or better:

value x = ...

myFunc(x); // allowed and will dispatch to the next first alternative if x is dynamically of type MyType
void myFunc(MyType x) { do something with x }

default void myFunc(value x) { throw "did not expect this"; }  // this happens if the parameter is not of type `MyType' at run-time
jurgenvinju commented 3 years ago

In other words: you need type literals (which are statically the same as dynamically) somewhere in your code. Either #int passed as a parameter to f1, or int _ := x as the predicate of a type match.

The thing that the # operator introduces is lifting the literals to values, such that they can be passed around by functions, and the binding of its own type parameter to exactly the type it represents in its current scope is also a good trick.

stefanbohne commented 3 years ago

Ok, so if I keep an instance of the reified type around, instead of using #&T it should work.

stefanbohne commented 3 years ago

I don't know if this is related or not, but consider the following function:

bool f3(void x) { return true; }

Calling it gives an error as expected:

rascal>f3(42)
|prompt:///|(3,1,<1,3>,<1,4>): CallFailed(
  |prompt:///|(3,1,<1,3>,<1,4>),
  [4])
        at $root$(|prompt:///|(0,32,<1,0>,<1,32>)ok

Now consider the following parametric function:

bool f4(type[&T] t, &T x) { return true; }

It's behavior is very different.

rascal>f4(#void, 42)
bool: true

I don't understand how the last line passes the type checker.

jurgenvinju commented 3 years ago

The last line passes because the least upperbound of void and int is int, because void is at the bottom of the hierarchy (lattice). &T is bound to int. This solution for the type bindings is also made possible because type[&N] is covariant and so type[void] is a proper subtype of type[int].

And so the latter function has no void parameter like the other one has. The typechecker should forbid literally void parameters btw.

stefanbohne commented 3 years ago

Yes, right. f4 was not very insightful.

Sorry, I'm just trying to understand Rascal's type system, and I feel like I'm missing something here.

I found the following type checks and throws an error at runtime when called with something other than int.

int f6(&T x) { return x; }

Is this behavior documented somewhere?

jurgenvinju commented 3 years ago

Thanks @stefanbohne you found a bug in the type-checker. This is definitely not right. Since &T is not guaranteed to be bound to a sub-type of int, return x should have triggered an error.

jurgenvinju commented 3 years ago

Will you submit another issue with this example?

stefanbohne commented 3 years ago

Regarding f4 again: what is the variance (co- / contra-) of type parameters? So type has to be covariant so that f4 works as you described. Is that true for all type parameters? It's definitely not true for function types. (This is what brought me to f6, btw).

jurgenvinju commented 3 years ago

let's see:

The latter is due to function partiality and pattern matching in Rascal. Functions are partial in principle and will throw a CallFailed at run-time if none of the overloaded cases match their parameters (or all functions throw a fail and backtracking does not provide a non-failing function execution). To be able to pass partial functions as higher-order parameters it is essential that they are both co- and contra-variant.

jurgenvinju commented 3 years ago

more explanations in this RAP: https://docs.google.com/document/d/1zSGTlEFwg2uDlYAXX6DtDzMjEshsk5rrRjYxerNdRG4/edit?usp=sharing

stefanbohne commented 3 years ago

Thank you. That was quite interesting.

That means that int(int) and int(void) and int(value) are all subtypes of each other? But then the subtype relation is not even a partial order, let alone a lattice.

jurgenvinju commented 3 years ago

It means that function types are not ordered in the dimension of their parameters if they share the same arity, but this leaves the rest of the lattice intact.

There are of course also incomparable function arguments, and different arities. Comparable function types still have a least upperbound and lowerbound for their parameter types which can be chosen arbitrarily along the lattice but we chose value for the lub and void for the glb in case of (pairwise) imcomparability and simply the highest type in case of comparability for every argument position.

Function partiality breaks type system completeness; here we've chosen for a gradual collapse.