erde-lang / erde

A programming language that compiles to Lua.
https://erde-lang.github.io
MIT License
39 stars 4 forks source link

Cannot refer to end key in table #8

Closed tp86 closed 1 year ago

tp86 commented 1 year ago

I have encountered following while trying things in playground:

local t = { end = 1 }
print(t.end)

It does not print 1 as expected.

Compiled code:

local
t
=
{
__ERDE_SUBSTITUTE_end__
=
1
,
}
print(
t
.end
)
-- Compiled with Erde 0.5-1
-- __ERDE_COMPILED__

Looks like a bug. It happens also for then. Both end and then work if used directly as variable name.

local end = 1
print(end)

prints 1 as expected. Compiled code also looks correct:

local
__ERDE_SUBSTITUTE_end__
=
1
print(
__ERDE_SUBSTITUTE_end__
)
-- Compiled with Erde 0.5-1
-- __ERDE_COMPILED__

The easiest solution seems to be wrapping all keys in table in [] and translating (effectively desugaring) accesses with . (and : which will probably be more tricky because of injecting receiver into method arguments) to [], so compiled code for the first example looks like

local
t
=
{
[
(
'end'
)
]
=
1
,
}
print(
t
[
(
'end'
)
]
)
-- Compiled with Erde 0.5-1
-- __ERDE_COMPILED__
bsuth commented 1 year ago

Ah, true, nice catch!

It happens also for then.

Yes, this bug should appear for any Lua keyword that is not a keyword in Erde.

The easiest solution seems to be wrapping all keys in table in [] and translating (effectively desugaring) accesses with .

I'm a little worried about this one for the exact reasons you mentioned with : accesses, as it may make the code actually more complicated. Looking through the code I think it will actually be easiest to simply transform all names that are Lua-only keywords, that way it will also work for things such as destructuring. When I have some time this week I'll do some testing on this one and see if I can get something up and running.

tp86 commented 1 year ago

it will actually be easiest to simply transform all names that are Lua-only keywords

What about cases like following?

local t = { ['end'] = 1 }
print(t.end)

local t = { end = 2 }
local key = 'end'  -- this would potentially be defined at runtime
print(t[key])

If you transform Lua-only keywords, above would still not work.

bsuth commented 1 year ago

You're right, i totally forgot about dynamic keys...

Main thing I'm worried about injecting self manually is that the base of the index chain may not be trivially computed:

local my_value = (mytable || my_expensive_function()):end()

In this case, the compiled code should not call my_expensive_function() more than once. One thing i want to do is to store the base in an intermediate variable:

local __erde_tmp_1__ = mytable || my_expensive_function()
local my_value = __erde_tmp_1__.end(__erde_tmp_1__)

However, if we use the index as an expression, then we may not have enough time to store the base without introducing an IIFE (something I really want to avoid)

-- no time to save the result of `mytable || my_expensive_function()` into a separate variable in the compiled code
print((mytable || my_expensive_function()).end)

The more i think about this one the more tricky it seems to have a proper solution. This one definitely warrants considerable test cases added as well. I'll see if I can think of a more rigorous solution and post again

tp86 commented 1 year ago

Main thing I'm worried about injecting self manually is that the base of the index chain may not be trivially computed

I didn't even think about this and it surely adds complexity.

However, if we use the index as an expression, then we may not have enough time to store the base without introducing an IIFE (something I really want to avoid)

I'm not sure if I follow your example. It seems to be missing some parentheses...

I'll be happy to discuss ideas and help solving this issue.

bsuth commented 1 year ago

oops fixed the parentheses :eyes:

if we want to access table fields as an expression (as opposed to a statement, such as a function call), then the compiled code should also be an expression, which makes it impossible for us to store the base as an intermediate variable. the reason that keeping the compiled code as an expression is important is that it may be conditionally executed:

print(my_var || (mytable || my_expensive_function()).end)

in the compiled code, its difficult to store mytable || my_expensive_function() in an intermediate variable, since if my_var is truthy, then my_expensive_function() should not be called at all

not sure if thats any clearer :sweat_smile:

tp86 commented 1 year ago

Yes, I get the point about indexing result of an expression, especially conditional one, I'm just not sure if I understand how IIFE is related. Do you mean to wrap whole indexing expression in anonymous function and call it immediately returning proper result? That would solve issue probably (you can do whatever you want inside function), but I would avoid it as well. It doesn't feel right, even if I'm not at all familiar with Erde.

bsuth commented 1 year ago

yup thats exactly what i mean (using IIFE in the compiled code). Erde was doing this in the early implementation days but I've removed all such instances because i didnt want to add the function call overhead, i thought it would be confusing for others to read the compiled code, and it alters the call stack, which makes error rewriting even trickier than it already is

tp86 commented 1 year ago

I have checked what Fennel does in such case.

Code:

(local t {:f (fn [self] 1)})
(print (: (or false t) :f))

(local t {:end (fn [self] 2)})
(print (: (or false t) :end))

Compiled:

local t
local function _1_(self)
  return 1
end
t = {f = _1_}
print(((false or t)):f())
local t0
local function _2_(self)
  return 2
end
t0 = {["end"] = _2_}
return print((function(tgt, m, ...) return tgt[m](tgt, ...)
 end)((false or t0), "end"))

As you can see, in case Lua keyword (end) is used, Fennel introduces IIFE (and only in such cases, which are not that many - end, then, and, or, not in case of Erde). Perhaps it's unavoidable, even if it messes with call stack and adds call overhead...

i thought it would be confusing for others to read the compiled code

Confusion can be easily remediated with proper documentation 😄 In compiled code, I think it's better to have correct implementation at the cost of readability.

I also checked Moonscript, they do not handle it (yet?):

Code:

t =
  end: () => 1

t\end()

Compiled:

local t = {
  ["end"] = function(self)
    return 1
  end
}
return t:end() -- this fails

BTW, I found this in Fennel issues:

do end is like a ; but more portable between all Lua implementations and is compiled away in bytecode

Original comment if you're interested.

bsuth commented 1 year ago

As you can see, in case Lua keyword (end) is used, Fennel introduces IIFE (and only in such cases, which are not that many - end, then, and, or, not in case of Erde).

Fennel is actually doing a real nice job in that even if a Lua keyword is used, it won't introduce the IIFE unless the index base is also a complex expression:

Code:

(print (: t :end))

Compiled:

print(t["end"](t))

I think it's better to have correct implementation at the cost of readability.

i definitely agree, just wanted to leave the IIFE as a last resort :sweat_smile:

I think until something better comes along, its very reasonable to do something similar to Fennel and introduce an IIFE when all the following are met:

  1. The index is a Lua keyword that is not an Erde keyword
  2. The index is a method call (:mymethod() syntax)
  3. The indexed expression is nontrivial (anything other than a string or name)

Examples:

(foo() || bar()):end()

(a || b):then()

Anti-Examples:

(foo() || bar()):my_method() -- my_method is not a Lua keyword, keep verbatim in compiled Lua

(foo() || bar()).end() -- not a method call, just translate to bracket index

my_table:end() -- trivial base, just copy to first arg
"my_string":end() -- trivial base, just copy to first arg

Just a note that because metamethods allow arbitrary functions to run when indexing, a.b:end() should use an IIFE.

If this sounds reasonable then i think i can start to take a look at it tomorrow, but may need time to test out some error rewriting edge cases.

also:

do end is like a ; but more portable between all Lua implementations and is compiled away in bytecode

huh, i did not know that but that is very cool :eyes: maybe need to steal this in the future

bsuth commented 1 year ago

I implemented a fix for this in https://github.com/erde-lang/erde/commit/aac6f8c1c5e7fa8989bdf95ffac9292e328ecc3d. There are still a couple limitations, which I've made a followup issue for (https://github.com/erde-lang/erde/issues/18) but they are real edge cases and I think those can wait for now. lmk if there are still any problems!

tp86 commented 1 year ago

Awesome! I'll check it and if I find something wrong, I'll let you know.