mitchellh / libxev

libxev is a cross-platform, high-performance event loop that provides abstractions for non-blocking IO, timers, events, and more and works on Linux (io_uring or epoll), macOS (kqueue), and Wasm + WASI. Available as both a Zig and C API.
MIT License
2.07k stars 74 forks source link

Reading exactly N bytes from a file? #54

Closed penberg closed 1 year ago

penberg commented 1 year ago

I have the following example code that attempts to read 100 bytes from a file:

const f = try xev.File.init(file);
var read_buf: [100]u8 = undefined;
var read_len: ?usize = null;
var c_read: xev.Completion = undefined;
f.read(loop, &c_read, .{ .slice = &read_buf }, ?usize, &read_len, (struct {
    fn callback(
        ud: ?*?usize,
        _: *xev.Loop,
        _: *xev.Completion,
        _: xev.File,
        _: xev.ReadBuffer,
        r: Self.ReadError!usize,
    ) xev.CallbackAction {
        ud.?.* = r catch unreachable;
        return .disarm;
    }
}).callback);
try loop.run(.until_done);
assert(read_len == 100);

However, the assertion fails because read_len is 1. This is, of course, allowed by the read() system call (I am on mac) so perhaps a new read_exact() API is needed for this use case?

penberg commented 1 year ago

Note: the above code works fine on Linux, so perhaps this is just a macOS issue and read() is semantically supposed to mean read exactly?

mitchellh commented 1 year ago

Hm, so on macOS, we use kqueue, and this is the behavior when kqueue notifies us of a read:

https://github.com/mitchellh/libxev/blob/cabdf638cb5a6f25c0c8476e98729f9e72537181/src/backend/kqueue.zig#L1194-L1206

So we're just calling os.read, which is just calling libc read if libc is linked. πŸ€”

penberg commented 1 year ago

Yeah. The read() system call is always allowed to return a short read, but this doesn't make much sense to me. The read is from the beginning of a file so it doesn't even cross page boundary...

mitchellh commented 1 year ago

Yeah. The read() system call is always allowed to return a short read, but this doesn't make much sense to me. The read is from the beginning of a file so it doesn't even cross page boundary...

Right right, sorry I know its allowed I agree with the latter it just doesn't make sense. πŸ€”

penberg commented 1 year ago

This is turning out to be a fun bug to chase!

I added few std.debug.print() calls:

        std.debug.print("BEFORE: read_len = {?}\n", .{read_len});
        try loop.run(.until_done);
        std.debug.print("AFTER: read_len = {?}\n", .{read_len});

and now can see that no read() system call is issued during try loop.run(.until_done);:

kqueue(0x0, 0x0, 0x0)            = 3 0
openat(0xFFFFFFFFFFFFFFFE, "tests/hello.db\0", 0x1020000, 0x0)           = 4 0
write(0x2, "BEFORE: read_len = \0", 0x13)                = 19 0
write(0x2, "null\0", 0x4)                = 4 0
write(0x2, "\n\0", 0x1)          = 1 0
kevent64(0x3, 0x16F29B3A0, 0x1)          = 0 0
kevent64(0x3, 0x16F29B3E0, 0x0)          = 0 0
write(0x2, "AFTER: read_len = \0", 0x12)                 = 18 0
write(0x2, "1\0", 0x1)           = 1 0
write(0x2, "\n\0", 0x1)          = 1 0

So now there are two mysteries to solve: (1) why don't we issue a system call and (2) why was read_len set to 1...

mitchellh commented 1 year ago

takes-glasses-off-deal-with-it

penberg commented 1 year ago

Well, the bug reproduces with libxev tests if you don't give the event loop a thread pool...

diff --git a/src/watcher/file.zig b/src/watcher/file.zig
index f7c3dde..99e0443 100644
--- a/src/watcher/file.zig
+++ b/src/watcher/file.zig
@@ -364,10 +364,11 @@ pub fn File(comptime xev: type) type {

             const testing = std.testing;

-            var tpool = main.ThreadPool.init(.{});
-            defer tpool.deinit();
-            defer tpool.shutdown();
-            var loop = try xev.Loop.init(.{ .thread_pool = &tpool });
+            //var tpool = main.ThreadPool.init(.{});
+            //defer tpool.deinit();
+            //defer tpool.shutdown();
+            //var loop = try xev.Loop.init(.{ .thread_pool = &tpool });
+            var loop = try xev.Loop.init(.{});
             defer loop.deinit();

             // Create our file
@@ -425,6 +426,7 @@ pub fn File(comptime xev: type) type {
             }).callback);

             try loop.run(.until_done);
+            try testing.expectEqual(read_len, write_buf.len);
             try testing.expectEqualSlices(u8, &write_buf, read_buf[0..read_len]);
         }

So @mitchellh any ideas why we'd skip the read() system altogether on Darwin if there's no configured thread pool?

mitchellh commented 1 year ago

Oh interesting. Actually yes, kind of. The File abstractions require a threadpool right now. Without a threadpool, we set the result to EPERM. EPERM is "-1", but we are forgetting to negate the value. So, read len is 1. πŸ˜„

penberg commented 1 year ago

@mitchellh Ok, awesome, I can just turn on the thread pool now in my silly app. I hope there's a way to somehow error out if someone attempts to use files without the thread pool.

mitchellh commented 1 year ago

Oh this is totally busted. I'm using os.errno() which on darwin reads the errno thread local variable which is unset in this case because I'm manually constructing an errno. 🀑 Ugh. Fixing this now.

mitchellh commented 1 year ago

Fix is up in #57. Yes, you can detect this because the error.PermissionDenied error will be yielded to the completion. We use EPERM but open to other ideas, we have to use a standard error.

Jarred-Sumner commented 1 year ago

I suggest not using std.os for this, as it’s use of unreachable and not directly exposing errno will cause reliability issues in production