pigpigyyy / Yuescript

A Moonscript dialect that compiles to Lua.
http://yuescript.org
MIT License

Backcalls in macros get incorrectly compiled #139

Open SkyyySi opened 1 year ago

SkyyySi commented 1 year ago

When defining a macro that 'returns' a backcall, it will not produce the expected code.

The given Yuescript code:

macro backcall = (fn) -> "() <- #{fn}()"

do
    $backcall async_fn
    print()

    () <- async_fn()
    print()

The above will compile to the following Lua code:

do
    async_fn(function() end)
    print(); --- Unrelated sidenote: Why the semicolon? Did you add it in the compiler for testing purposes and forgot to remove it?
    async_fn(function()
        return print()
    end)
end

I would expect the macro to produce the same code as the written-out version below it.


In case a more real-world example is useful: I stumbled across this issue when using this macro:

--- The code
--- 
--- ```yuescript
--- $await output_stream::write("foo", "bar")
--- ```
---
--- becomes
---
--- ```yuescript
--- (_, __async_result) <- output_stream::write_async("foo", "bar", nil)
--- output_stream::write_finish(__async_result)
--- ```
macro await = (call) ->
    object, method, args = call::match([[([a-zA-Z_][a-zA-Z0-9_]*)::([a-zA-Z_][a-zA-Z0-9_]*)%((.*)%)]])

    "
(_, __async_result) <- #{object}::#{method}_async(#{args}, nil)
#{object}::#{method}_finish(__async_result)
"

It's intended for working with Gio, a library for asynchronous IO: https://docs.gtk.org/gio/method.OutputStream.write_async.html

Each successive $await call is intended to become nested in the previous backcall. However, what happens instead is that only the #{object}::#{method}_finish(__async_result) gets nested, then the block ends.
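
To make the intended expansion concrete, here is a minimal sketch in plain Lua of the string the macro above builds. The helper name expand_await is hypothetical and only stands in for the macro body; the pattern and the template are copied from the macro.

```lua
-- Same Lua pattern as in the macro: captures object, method and raw argument text.
local pattern = "([a-zA-Z_][a-zA-Z0-9_]*)::([a-zA-Z_][a-zA-Z0-9_]*)%((.*)%)"

-- Hypothetical stand-in for the macro body: builds the Yuescript source text.
local function expand_await(call)
  local object, method, args = call:match(pattern)
  return string.format(
    "(_, __async_result) <- %s::%s_async(%s, nil)\n%s::%s_finish(__async_result)",
    object, method, args, object, method
  )
end

local expanded = expand_await('output_stream::write("foo", "bar")')
print(expanded)
-- (_, __async_result) <- output_stream::write_async("foo", "bar", nil)
-- output_stream::write_finish(__async_result)
```

The string itself is what the doc comment promises; the issue is only in how the compiler treats the backcall once this text is expanded in the middle of a block.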

pigpigyyy commented 1 year ago

Both behaviors you mentioned are currently intended. A macro expanded in the middle of a code block is treated as a do block without its own variable scope, so your code currently expands to:

do
  do -- without variable scoping
    () <- async_fn()
  print()

  () <- async_fn()
  print()

This is caused by the compiler implementation: macros are expanded while each statement is being translated. I think the implementation should be changed to expand macros before translating any code to Lua.

The redundant semicolon is there because Yuescript adds semicolons to avoid the ambiguity caused by Lua expressions that span multiple lines.

-- the valid code in Lua
func()
(args)()
-- is the same as
func()(args)()
-- to make them two separate statements you can use a semicolon
func();
(args)()

Currently Yuescript blindly adds a semicolon whenever it sees a new line that starts with a left parenthesis. Maybe it should instead check and add semicolons after all the code has been translated to Lua.
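
The difference between the two forms can be observed directly. This is a minimal sketch, assuming Lua 5.2 or later (Lua 5.1 rejects the first form with an "ambiguous syntax" error instead); func and args are hypothetical stand-ins that only record which of them was called.

```lua
-- Assumes Lua 5.2 or later: load() accepts a source string and an environment.
local log = {}

-- Hypothetical stand-ins for the names used in the snippet above.
local env = {
  func = function()
    log[#log + 1] = "func"
    -- Return something callable so that func()(args)() is legal.
    return function()
      log[#log + 1] = "result of func"
      return function() end
    end
  end,
  args = function()
    log[#log + 1] = "args"
  end,
}

-- Compile and run a chunk against env, returning the recorded call order.
local function run(source)
  log = {}
  local chunk = assert(load(source, "demo", "t", env))
  chunk()
  return table.concat(log, ", ")
end

local joined = run("func()\n(args)()")     -- parsed as func()(args)()
local separate = run("func();\n(args)()")  -- parsed as two statements

print(joined)    -- func, result of func
print(separate)  -- func, args
```

Without the semicolon, args is never called: it is passed as an argument to whatever func returns.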

SkyyySi commented 1 year ago

> This is caused by the compiler implementation: macros are expanded while each statement is being translated. I think the implementation should be changed to expand macros before translating any code to Lua.

I agree. Macros getting evaluated and expanded before compilation would be both more intuitive and more useful, I think. I personally only use them when a function would be too limiting (which is rather rare in Yuescript), and having them produce their own isolated blocks severely limits their usefulness for me. Additionally, I think it should be outlined in the documentation when exactly macros get evaluated and expanded, especially if the current behavior were to be kept.

pigpigyyy commented 1 year ago

I think it is better to explicitly write the Yuescript statements that affect the rest of the code in the same scope, such as global *, local * and the backcall statement <- func. I made a new commit, c98c6053635ddfca7aab15b268b0f2c1fcc0c6ef, that tries to make backcall syntax work with macro syntax. Now you can implement your use case this way:

macro await = (call, body) ->
  object, method, args = call::match([[([a-zA-Z_][a-zA-Z0-9_]*)\([a-zA-Z_][a-zA-Z0-9_]*)%((.*)%)]])
  return "
(_, __async_result) <- #{object}::#{method}_async(#{args}, nil)
#{object}::#{method}_finish(__async_result)
#{body::sub(5)\gsub '\n\t', '\n'}
"

do
  <- $await output_stream::write("foo", "bar")
  print 123
  if cond
    func 456

compiles to:

do
  return output_stream:write_async("foo", "bar", nil, function(_, __async_result)
    output_stream:write_finish(__async_result)
    print(123)
    if cond then
      return func(456)
    end
  end)
end

In this new macro version, the expressions passed as macro arguments are parsed and reformatted into code strings, whereas the former implementation just copy-pasted them as plain text. This way the backcall body can be passed to the macro as the code of a normal function.

<- $func
print 123

now works the same as

$func ->
  print 123
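
On the point above that macro arguments are now reformatted instead of pasted as text: this is why the pattern in the new macro matches \ rather than ::. A minimal sketch in plain Lua, assuming the reformatted argument text looks like the string below (the exact spacing the compiler emits is an assumption):

```lua
-- Pattern from the first version of the macro (expects `::`).
local old_pattern = [[([a-zA-Z_][a-zA-Z0-9_]*)::([a-zA-Z_][a-zA-Z0-9_]*)%((.*)%)]]
-- Pattern from the new version (expects a literal backslash; `\` is not special in Lua patterns).
local new_pattern = [[([a-zA-Z_][a-zA-Z0-9_]*)\([a-zA-Z_][a-zA-Z0-9_]*)%((.*)%)]]

-- Assumed shape of the macro argument after reformatting.
local reformatted = [[output_stream\write("foo", "bar")]]

local old_match = reformatted:match(old_pattern)
local object, method, args = reformatted:match(new_pattern)

print(old_match)             -- nil
print(object, method, args)
```

A macro written against the old copy-paste behavior therefore stops matching once its argument has been reformatted, and its pattern has to be updated.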

SkyyySi commented 1 year ago

I like the idea of automatically re-formatting the code. But I'd actually take that further: just as object::method() now becomes object\method(), I think other equivalent spellings should be covered as well. For example, the != operator, which is mentioned in the very same block of the documentation as the :: alias for method calls, should probably be translated to ~=. Otherwise, it seems kind of weird to have this one case rewritten behind the scenes, but not others.

I'm not sure yet which things I would or would not have converted, but I think a good rule of thumb would be anything that's just a slightly different way to write the same thing. That would include parenthesis-less functions and function calls, bracket-less tables (which already seems to be implemented), any syntax aliases (as previously mentioned, :: / \ and != / ~=) and similar small things.