luvit / luv

Bare libuv bindings for lua
Apache License 2.0
815 stars 184 forks source link

File system requests for nonexistent paths may result in invalid userdatum #718

Open rdw-software opened 2 days ago

rdw-software commented 2 days ago

It seems that opening a handle with a path that doesn't exist can cause crashes when (potentially unrelated) code later tries to access it:

local handle
handle = uv.fs_open("does-not-exist", "r", 438, function(err, fd)
    print(err, fd)
    print(type(handle))
    print(handle)
    error("this is never executed")
end)

print(type(handle))
print(handle)

uv.walk(function(_)
    error("this is never executed either")
end)

The program is halted with an assertion failure, which is triggered by a somewhat unreliable sanity check in src/loop.c:

userdata
uv_fs_t: 0x01f4064047a8
Assertion failed: data && data->ref < 0x1000000, file /tmp/luv/src/loop.c, line 98

Occasionally, this sanity check fails and the program simply SEGFAULTs when trying to invoke the callback:

userdata
uv_fs_t: 0x02ddabb3c7d8
[1]    9817 segmentation fault

The handle itself appears to be active, but its context (data->ctx) is NULL. There were some issues mentioning cleanup and callback interactions; this may be related. Attempting to use the handle inside the callback, without uv.walk, also fails (uv.run() is implied):

local handle
handle = uv.fs_open("does-not-exist", "r", 438, function(err, fd)
    print(err, fd)
    print(type(handle))
    print(handle)
    error("this is never executed")
end)

The resulting Lua error is technically preventable since the err and fd values indicate a failure, but it could still be slightly confusing:

ENOENT: no such file or directory: does-not-exist       nil
userdata
Uncaught Error: bad argument #1 to '?' (Expected uv_req_t)

I encountered this issue while queuing FS requests with some observability code enabled, which required walking all handles. There's no obvious way of guarding against these crashes, so that even a simple typo in a file system path could bring down the entire application.

Tested on Windows and Linux (Ubuntu) using the latest LuaJIT build. This doesn't look like a platform-specific problem, though.

Bilal2453 commented 2 days ago

Have you tested this with Luvit? I was just trying to use this code with Luvit but it seems to work fine, errors with "this is never executed either" in the walk callback.

luvit version: 2.18.1
luvi version: v2.14.0-dev+9b697da
ssl version: OpenSSL 1.1.1m  14 Dec 2021, lua-openssl 0.8.2
winsvc version: 1.0.0
rex version: 8.37 2015-04-28
libuv version: 1.44.2
truemedian commented 2 days ago

My first assumption here is that this has something to do with walking the handles before running the event loop.

truemedian commented 2 days ago

The Uncaught Error: bad argument #1 to '?' (Expected uv_req_t) that happens in the fs_open callback is reasonable (although maybe unexpected) because the req becomes invalidated before the callback is called.

I'm also more confused at how you've gotten uv_walk to segfault when this program creates no handles (handle is a uv_req_t, not a uv_handle_t)

rdw-software commented 10 hours ago

The SEGFAULT is a red herring; there's an unrelated problem (not apparent from the example) which I've partially solved for the uv.walk() case. I'll create a new issue to discuss this scenario shortly, as properly supporting it may require more changes.

Additionally, the example did SEGFAULT using luvit on Windows even after working around the secondary problem. But I was using an outdated release and one of the event loop-related changes applied to luv since must have fixed it. There are various oddities I've observed that I still want to look into, but I'm not sure they're something that needs to be fixed in luv or libuv itself.

I'll leave the issue open for now. No need to spend time on this, though. I'll hopefully get back to it soon... (famous last words)

rdw-software commented 9 hours ago

Regardless of the above, there may be some value in addressing the other points:

The Uncaught Error: bad argument #1 to '?' (Expected uv_req_t) that happens in the fs_open callback is reasonable (although maybe unexpected) because the req becomes invalidated before the callback is called

I guess the message could be confusing if one doesn't know much about how luv works internally. If it's expected that the request could become invalid while still accessible to the Lua program, maybe a more descriptive error message could be emitted? I rarely use userdata outside of luv, but IIRC the standard library detects that resources are no longer valid for builtins like io.open.

My first assumption here is that this has something to do with walking the handles before running the event loop.

This is a good idea and indeed one of the oddities I observed is that running the event loop intermittently can change the result. On Windows, I saw that calling uv.run() before uv.walk() did prevent the crash. It didn't on Ubuntu, though. Will revisit later.

I'm also more confused at how you've gotten uv_walk to segfault when this program creates no handles (handle is a uv_req_t, not a uv_handle_t)

My bad. I tried to reduce a more complex example, but I forgot about things like the unref'd SIGPIPE handler (uv_signal_t).

Edit: I'll leave the original example as-is even if it doesn't create any handles. The rest will become clearer with more context.