mbebenita / LLJS

LLJS: Low-Level JavaScript
http://lljs.org

casting requires extra set of parentheses #22

Open jlongster opened 12 years ago

jlongster commented 12 years ago

It's annoying to have to write:

(int)(num)
(int)(getNum())

etc, every time I want to cast something. I should be able to just do:

(int)num
(int)getNum()

but it seems to require the explicit parentheses.

mbebenita commented 12 years ago

I hear ya, but it seems to be the only way to parse without having to require type declarations. This is a problem with C-style casting syntax. We actually have to parse casts as if they were function calls and then disambiguate them after the parser is done.
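For illustration, here is a minimal sketch of what "parse as calls, disambiguate afterwards" could look like. It is not the actual LLJS code: the node shapes follow the ESTree format that esprima produces, but `CastExpression`, `knownTypes`, and `disambiguateCasts` are hypothetical names, and the set of types is assumed to have been collected from the whole program beforehand.

```javascript
// Hypothetical sketch, not the real LLJS implementation.
// "CastExpression" is an invented node type used only for illustration.

// Assumption: type names were collected by an earlier pass over the file.
const knownTypes = new Set(["int", "uint", "double"]);

function disambiguateCasts(node) {
  // Leaves (strings, numbers, null) are returned unchanged.
  if (node === null || typeof node !== "object") return node;

  // Rewrite children first so nested casts are handled bottom-up.
  for (const key of Object.keys(node)) {
    const child = node[key];
    node[key] = Array.isArray(child)
      ? child.map(disambiguateCasts)
      : disambiguateCasts(child);
  }

  // A one-argument call whose callee names a known type is really a cast.
  if (
    node.type === "CallExpression" &&
    node.callee.type === "Identifier" &&
    knownTypes.has(node.callee.name) &&
    node.arguments.length === 1
  ) {
    return {
      type: "CastExpression",
      as: node.callee.name,
      argument: node.arguments[0],
    };
  }
  return node;
}

// (int)(getNum()) comes out of the parser looking like the call int(getNum()).
const parsed = {
  type: "CallExpression",
  callee: { type: "Identifier", name: "int" },
  arguments: [
    {
      type: "CallExpression",
      callee: { type: "Identifier", name: "getNum" },
      arguments: [],
    },
  ],
};

const result = disambiguateCasts(parsed);
```

In this sketch the outer call becomes a cast because `int` is in the type set, while the inner `getNum()` stays an ordinary call.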

jlongster commented 12 years ago

I don't know much about esprima, but how else can (int)num be parsed? Is that parsed in a way that can't be used?

I guess it seems like you could parse (, the type, and ), and then parse the next expression and wrap it in a cast. I'm sure I'm over-simplifying this. It seems like this could work since the compiler knows what types are at compile time (or are they not known at parse time?)

syg commented 12 years ago

Parsing C-style casts is really a pain in the ass. There are a couple of problems with them:

  1. Ambiguity with call expressions. (e1)(e2) could be a call or a cast, depending on whether e1 is a type or not. C solves this by requiring all types be forward-declared.
  2. Different associativity. Casts associate right to left:

    (e1)(e2)(e3) is really ((e1)((e2)(e3)))

    Calls associate left to right:

    (e1)(e2)(e3) is really (((e1)(e2))(e3))

    So disambiguating calls and casts without knowing ahead of time which identifiers are types would actually require reparsing. Note that I actually mean reparsing the entire file, not backtracking, because it's only after we parse the entire file that we know what the types are.
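To make the associativity difference concrete, here is a small sketch that groups the same chain of parenthesized operands both ways. The helper names `groupAsCalls` and `groupAsCasts` are made up for this example; they only build strings and have nothing to do with the real parser.

```javascript
// Illustrative only: show how (e1)(e2)(e3) groups under each reading.

// Calls associate left to right: each result becomes the next callee.
function groupAsCalls(operands) {
  const [first, ...rest] = operands;
  return rest.reduce((callee, arg) => `(${callee}(${arg}))`, `(${first})`);
}

// Casts associate right to left: each type applies to everything after it.
function groupAsCasts(operands) {
  const last = operands[operands.length - 1];
  return operands
    .slice(0, -1)
    .reduceRight((operand, type) => `((${type})${operand})`, `(${last})`);
}

const chain = ["e1", "e2", "e3"];
const asCalls = groupAsCalls(chain); // (((e1)(e2))(e3))
const asCasts = groupAsCasts(chain); // ((e1)((e2)(e3)))
```

The two results are different trees over identical tokens, which is why a parser that guessed wrong cannot just relabel nodes afterwards.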

So there are a couple of solutions here:

  1. [Current behavior] Say that casts are really special cases of call expressions. They parse just like call expressions and are disambiguated after parsing. I think we chose this because it feels the most JavaScript-y, i.e. to convert values to numbers in plain JS you do Number(e).
  2. Could allow juxtaposition-style casts such as (e1)e2, but this would not associate in the C way. That is, (e1)(e2)e3 would be a parsing error since the parser doesn't know what to do with ((e1)(e2))e3.
  3. Don't use C-style casts. Could introduce an as operator, like e1 as ty.
  4. [I refuse to do this] Require forward declarations.

I personally like (3), but that would be a radical syntactic departure from C.

edit: formatting

jlongster commented 12 years ago

I actually didn't realize that e1(e2) was valid for casting in LLJS. Although it's not the same as C, I think it's good enough. The main annoyance was having to type 4 parentheses just to cast. So int(foo) isn't too bad, and like you said, it is kind of how JavaScript conversions are already done.

I also didn't realize that LLJS didn't require forward declarations, which is kind of neat. It makes sense that C-style casting is difficult without knowing all the types.