luvit / lit

Toolkit for developing, sharing, and running luvit/lua programs and libraries.
http://lit.luvit.io/
Apache License 2.0
249 stars 58 forks source link

coro-channel: resuming before having a chance to yield #303

Closed Bilal2453 closed 1 month ago

Bilal2453 commented 2 years ago

coro_http.request("GET", "https://google.com", {}, "") would error with;

cannot resume running coroutine
stack traceback:
    /mnt/data/bilal/deps/coro-channel.lua:16: in function 'assertResume'
    /mnt/data/bilal/deps/coro-channel.lua:126: in function 'callback'
    /mnt/data/bilal/deps/secure-socket/biowrap.lua:43: in function 'write'
    /mnt/data/bilal/deps/coro-channel.lua:148: in function 'write'
    /mnt/data/bilal/deps/coro-http.lua:149: in function </mnt/data/bilal/deps/coro-http.lua:106>
    [C]: in function 'xpcall'
    [string "bundle:/deps/repl.lua"]:97: in function 'evaluateLine'
    [string "bundle:/deps/repl.lua"]:189: in function <[string "bundle:/deps/repl.lua"]:187>
stack traceback:
    [C]: in function 'error'
    /mnt/data/bilal/deps/coro-channel.lua:16: in function 'assertResume'
    /mnt/data/bilal/deps/coro-channel.lua:126: in function 'callback'
    /mnt/data/bilal/deps/secure-socket/biowrap.lua:43: in function 'write'
    /mnt/data/bilal/deps/coro-channel.lua:148: in function 'write'
    /mnt/data/bilal/deps/coro-http.lua:149: in function </mnt/data/bilal/deps/coro-http.lua:106>

While the request is very wrong (providing a body on GET) it shouldn't have produced such error. I am afraid it is affecting wider use-cases of the library.

I will be looking deeper into this.

Bilal2453 commented 2 years ago

coro-channel.lua#L126:

  local function wait()
    local thread = coroutine.running()
    return function (err)
      print("reaching assertResume")
      assertResume(thread, err)
    end
  end

coro-channel#L149:

    print("reaching socket:write")
    local success, err = socket:write(chunk, wait())
    if not success then
      closer.errored = err
      closer.check()
      print("reaching return")
      return nil, err
    end
    print("reaching coroutine.yield")
    err = coroutine.yield()
    return not err, err

Outputs:

reaching socket:write
reaching coroutine.yield
reaching assertResume
reaching socket:write
reaching assertResume
cannot resume running coroutine
stack traceback: etc...

So it seems like it is trying to resume before reaching yield (a trend at this point in many of the coro-* libs), although I am not sure why this happen when body is provided in the above scenario, probably something similar to what coro-split used to do, the fix can as well be the same one we did for coro-split implementing a lock on yield, and only resuming if it was locked.

Should this locking mechanism become part of assertResume/utils perhaps? it is not that often we see this error around but it is also not that rare.

Bilal2453 commented 2 years ago
$ luvit
Welcome to the Luvit repl!
> require'coro-http'.request("POST", "https://google.com", {}, "")
cannot resume running coroutine
stack traceback:

As I assumed, it was affecting a wider range. The cause is basically the body being too small, looks like it finish writing before yielding as explained before.

Will fix this in the same way we did it for coro-split.. tomorrow hopefully.