teal-language / tl

The compiler for Teal, a typed dialect of Lua
MIT License

Reasoning behind backtick annotations in generics? #76

Closed pdesaulniers closed 4 years ago

pdesaulniers commented 4 years ago

I find that the syntax for generics is a bit noisy. Is there a technical reason for backticks before type names?

For instance, instead of this:

local function keys<`K,`V>(xs: {`K:`V}):{`K}

I would prefer this:

local function keys<K, V>(xs: {K: V}): {K}

This would resemble the syntax for generics in many popular languages (C#, TypeScript, C++ templates, Java, etc.)
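For comparison, here is roughly how the same function might look in TypeScript. This is only a sketch to illustrate the backtick-free style: `tableKeys` and `ages` are made-up names, and the `K extends string` bound is a TypeScript-specific assumption that has no counterpart in the Teal signature above.

```typescript
// Hypothetical TypeScript counterpart of the Teal `keys` function.
// K and V are declared once in <...> and then used as plain names.
function tableKeys<K extends string, V>(xs: Record<K, V>): K[] {
  // Object.keys returns string[]; the cast narrows it to the key type K.
  return Object.keys(xs) as K[];
}

const ages = { alice: 30, bob: 25 };
const ownerNames = tableKeys(ages);
console.log(ownerNames.join(","));
```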

hishamhm commented 4 years ago

Is there a technical reason for backticks before type names?

At one point there was, because functions did not have the declaration bit (the <> section after the function name), so type variables worked via a sort of local-by-default, and the backticks served the purpose of keeping them in a separate namespace from other undeclared variables.

To make things lexically scoped in nested functions, I added the explicit <> declarations. I think the backticks are not necessary now — it would be nice to make a PR removing support for them to check that everything keeps working.

pdesaulniers commented 4 years ago

OK, I'll try to take care of this.

pdesaulniers commented 4 years ago

I'm kinda confused. If we remove the backticks, then parse_typevar_type does not get called anymore: https://github.com/hishamhm/tl/blob/a5f4d34d56eec7df84c487cfc9e9ce28820cff35/tl.tl#L1181-L1182

What would be the best way to differentiate typevars from regular types in the parser?

resolritter commented 4 years ago

I'm kinda confused. If we remove the backticks, then parse_typevar_type does not get called anymore:

https://github.com/hishamhm/tl/blob/a5f4d34d56eec7df84c487cfc9e9ce28820cff35/tl.tl#L1181-L1182

What would be the best way to differentiate typevars from regular types in the parser?

disclaimer: I don't really understand parsing steps of compilers and have never worked in one. just chiming in as some hopefully helpful food for thought.

@pdesaulniers I think he means there's already a "phase" transitioning the parser into a state where it would pick those tokens up without the backticks, e.g.

https://github.com/hishamhm/tl/blob/a5f4d34d56eec7df84c487cfc9e9ce28820cff35/tl.tl#L1109

https://github.com/hishamhm/tl/blob/a5f4d34d56eec7df84c487cfc9e9ce28820cff35/tl.tl#L1114

You might get a clearer picture if you just open tl.tl and search for "<" or "`". That's what I did, even though I do not understand the subject properly; but I think @hishamhm is implying that this detection of generic type arguments is already in place, so you could try just removing code related specifically to handling the backticks and checking if it still works.

hishamhm commented 4 years ago

What would be the best way to differentiate typevars from regular types in the parser?

That's the thing, without the backticks, you pretty much can't differentiate them at the parsing stage.

If we go by proper parsing theory, the goal of the parser is to create the syntax tree of Node objects. The type checker is the one who traverses this tree of Nodes and produces Type objects.

The fact that there are places in the parser where it creates Type objects directly is a bit of... well, call it what you want: optimization, simplification, hack :)

When we remove the backticks, then it becomes really impossible at parsing time to determine if T is a type variable or just a nominal (unless you cheat even more and start keeping track of the symbol table as you parse; then you're really mixing parsing and type checking (mind you, some single-pass compiler implementations can do that (think old-school compilers like the original Pascal), but the language needs to be designed to allow that and we're doing separate passes for parsing and type checking anyway (enough nested parentheses, this is starting to look like Lisp :) ))).

So yeah, parse_typevar_type would go away from the parser, and variables would have to be resolved into type variables in the appropriate places in the type checker (which I think it might already do since the type variables are now being stored in the symbol table st alongside other variables).
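To make the split concrete, here is a minimal sketch of the idea in TypeScript (not Teal, and not the actual tl.tl code). It assumes a toy model in which the parser tags every type name as nominal and a later checking step re-tags the names that were declared in the `<>` list. All names here (`ParsedTypeName`, `ParsedSignature`, `resolveSignature`, and so on) are hypothetical.

```typescript
// Hypothetical, simplified model; none of these names come from tl.tl.

// What the parser emits: every type name is just a name (a "nominal").
interface ParsedTypeName {
  kind: "nominal";
  name: string;
}

// What the checker produces once it knows which names are type variables.
type ResolvedTypeName =
  | { kind: "typevar"; name: string }
  | { kind: "nominal"; name: string };

// A function signature as parsed: the <...> list plus the type names it uses.
interface ParsedSignature {
  typeParams: string[];        // names declared between < and >
  usedTypes: ParsedTypeName[]; // type names appearing in arguments and returns
}

// Parsing stage: without backticks, K and Point look exactly the same.
function parseTypeName(token: string): ParsedTypeName {
  return { kind: "nominal", name: token };
}

// Checking stage: names declared in <...> are re-tagged as type variables.
function resolveSignature(sig: ParsedSignature): ResolvedTypeName[] {
  return sig.usedTypes.map((t): ResolvedTypeName =>
    sig.typeParams.indexOf(t.name) !== -1
      ? { kind: "typevar", name: t.name }
      : { kind: "nominal", name: t.name }
  );
}

// Type names from keys<K, V>(xs: {K: V}): {K}, plus a made-up nominal
// type "Point" added only for contrast.
const keysSignature: ParsedSignature = {
  typeParams: ["K", "V"],
  usedTypes: ["K", "V", "K", "Point"].map((token) => parseTypeName(token)),
};

const resolvedKinds = resolveSignature(keysSignature).map(
  (t) => t.name + ":" + t.kind
);
console.log(resolvedKinds.join(" "));
// prints "K:typevar V:typevar K:typevar Point:nominal"
```

The point of the sketch is only that the decision moves from the parser to the checker; a real implementation would look names up in the full symbol table rather than in a flat list.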

resolritter commented 4 years ago

@hishamhm Apologies if I don't understand it correctly, but you're saying that in places where the "<" token is found, e.g.

https://github.com/teal-language/tl/blob/6e9c49460ee4e4b255541c488bdc4b3fd635efb3/tl.tl#L1757

in the called function, you'd have to do matching in the symbol table to figure out if the identifier is a nominal type or a type argument?

If that's the case, then I personally find it weird for it to be a concern, because languages I've seen so far just disregard whatever previous type identifier with the same name (if any).

Rust

#[derive(Debug)]
struct T(i32);

#[derive(Debug)]
struct Point<T> {
    t: T
}

fn main() {
    println!("{:?}", T(1));
    println!("{:?}", Point::<u32> { t: 1 });
}

TypeScript

type T = true

type Generic<T> = T

type New = Generic<false>

Java

class T {}

class THolder<T> {
  T t;

  public THolder(T initial) {
    this.t = initial;
  }
}

class Main {
  public static void main(String[] args) {
    System.out.println(new THolder<Integer>(1).t);
  }
}

hishamhm commented 4 years ago

languages I've seen so far just disregard whatever previous type identifier with the same name

@resolritter ah yeah, that's not what I meant! What I believe you are referring to (the fact that T in class THolder<T> is not the same as class T, etc) is lexical scoping. And yeah, that works as expected. If you nest declarations, the inner one is the one that's active, and that <T> is effectively nested inside the class declaration.
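As a rough illustration of that lexical scoping, here is a small TypeScript sketch of a chain of scopes where an inner declaration shadows an outer one. It is an assumed toy model, not the actual symbol table in tl.tl; `CheckerScope`, `openScope`, `declareTypeName` and `lookupTypeName` are hypothetical names.

```typescript
// Hypothetical model of nested scopes; names are illustrative, not from tl.tl.
type TypeNameKind = "nominal" | "typevar";

interface CheckerScope {
  entries: { typeName: string; kind: TypeNameKind }[];
  parent: CheckerScope | null; // enclosing scope, or null at the top level
}

function openScope(parent: CheckerScope | null): CheckerScope {
  return { entries: [], parent: parent };
}

function declareTypeName(scope: CheckerScope, typeName: string, kind: TypeNameKind): void {
  scope.entries.push({ typeName: typeName, kind: kind });
}

// Walk from the innermost scope outwards; the first match wins,
// so an inner declaration shadows an outer one with the same name.
function lookupTypeName(scope: CheckerScope | null, typeName: string): TypeNameKind | undefined {
  for (let s = scope; s !== null; s = s.parent) {
    for (const entry of s.entries) {
      if (entry.typeName === typeName) {
        return entry.kind;
      }
    }
  }
  return undefined;
}

// Mirrors the Java example: class T {} at the top level, class THolder<T> inside.
const globalScope = openScope(null);
declareTypeName(globalScope, "T", "nominal");

const holderScope = openScope(globalScope);
declareTypeName(holderScope, "T", "typevar");

console.log(lookupTypeName(globalScope, "T"), lookupTypeName(holderScope, "T"));
// prints "nominal typevar"
```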

What I meant was that this was a bit of a concern when type declarations did not have <> and the use of generics was implied by the presence of backtick-names in the use sites. I don't believe it should be a problem now, but the un-backticked names will now register in the AST as nominal types, which need to be resolved into typevar types at typechecking time (whereas they were pre-resolved as typevar at the parsing stage due to the backticks).

@pdesaulniers Would you like me to take a stab at removing the backticks?

pdesaulniers commented 4 years ago

@hishamhm Yes, sorry, go ahead. Last time I checked, I couldn't figure out how to resolve the typevars at typechecking time :dizzy_face:

I'm having some trouble understanding the program's flow. I think debugger support in the VSCode extension would be quite useful here.

hishamhm commented 4 years ago

I'm having some trouble understanding the program's flow. I think debugger support in the VSCode extension would be quite useful here.

Indeed! One way to get there would be generating source maps that could be consumed by https://github.com/pkulchenko/MobDebug/ (I wonder if there's any VSCode extension for Lua that uses mobdebug already with some code that could be reused?)

hishamhm commented 4 years ago

Merged #90 ! Will tag 0.3.0 with it

hishamhm commented 4 years ago

Will tag 0.3.0 with it

@pdesaulniers On second thought, could you give the latest master a spin with this change merged in before I tag 0.3.0?

pdesaulniers commented 4 years ago

@hishamhm It appears to work as intended! At least, my declaration files type-check correctly without the backticks.

hishamhm commented 4 years ago

Great to hear, thank you! Will push 0.3.0 later today!