Closed Bilal2453 closed 1 week ago
Also added a test to catch those cases.
Worth noting some... annoyances, while writing that test I noticed if the writer returns with return true, "this value should be ignored"
coro-channel will return that the write has failed (the return of write()
will be false
indicating a failure) but at the same time it won't attempt to close the handle (since success
is true). One part of the code thinks that it succeeded and doesn't attempt to close the socket, and the other part of the code simply checks if the second return is presented to treat it as an error.
Also, a write()
could return failure indicated by either nil
or false
, it may returns either of the values depending on if the failure was picked up with the callback callback("this is an error")
or with the write return return false, "this is an error"
.
As I expanded the tests I noticed another state-related bug, consider the following code:
local ssocket = {
write = function(_, _, callback)
callback('callback failed')
return true
end,
is_closing = function()
return false
end
}
local write = require("coro-channel").wrapWrite(ssocket)
assert(write(""))
The control flow of this goes something like this:
wait
is called which resets and prepares the states and upvalues.wait()
callback is instantly called.err
is set to "callback failed"
since the callback failed.write
call now finishes, success
is set to true
(since the write call succeeded), err
is set to nil.Usually, the err
value will be set by the write
call itself (over in success, err = write(...)
, then it is reused for the callback returned error, but since in this case the callback is executed first this is no more the case.
This is now fixed by separating the call error from the callback error and tracked by two separate states.
This likely won't be the last fix we introduce for a problem like this. I also hate the complexity of this flow and it proved annoying as I got it wrong the first 3 attempts. What should the flow look like ideally if you happen to have an idea?
On Mon, 30 Sept 2024, 16:39 truemedian, @.***> wrote:
@.**** approved this pull request.
In general I dislike the state that this change introduces, which makes it significantly harder to reason about what is happening in the control flow through these functions.
However if this fixes the bug I'm fine with introducing it and writing it off for "clean up later".
— Reply to this email directly, view it on GitHub https://github.com/luvit/lit/pull/329#pullrequestreview-2337592261, or unsubscribe https://github.com/notifications/unsubscribe-auth/AJDIIYGIZBFGFYZ3SKW2ETDZZFIARAVCNFSM6AAAAABL6MZ5LOVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDGMZXGU4TEMRWGE . You are receiving this because you authored the thread.Message ID: @.***>
@squeek502 should be ready for merge.
Consider the following minimal-reproduction code:
This will produce the following error
This happens because the write call at coro-channel.lua#L148 returns immediately, the callback produced by
wait()
is also called immediately, this callback will attempt to resume the current thread, but thecoroutine.yield
line has not yet been executed (since write finished instantly).This minimal example is basically what happens when a socket is wrapped with
secure-socket
, which will attempt to optimize empty writes by returning immediately (in the flush function over here it will execute the callback and return early).Which leads to
https
/wss
requests made by coro-net crash when code such asrequire'coro-http'.request("POST", "https://google.com", {}, "")
is ran (secure-socket sees that we are trying to write an empty chunk for the payload, so it decides to just return early and immediately since such a write will be ignored by libuv anyways).The solution suggested in this PR is to basically keep track of when the thread has yielded, and if the wait callback is called before we reach yield, then we don't bother with resuming current thread and just return, similar to what we did with
coro-split
.fixes #303.