ocaml-multicore / eio

Effects-based direct-style IO for multicore OCaml
Other
528 stars 67 forks source link

Eio.Path.load causes "file descriptor used after calling close!" error #740

Closed jonsterling closed 3 weeks ago

jonsterling commented 3 weeks ago

I was experimenting with some of my code to test the performance impact of reading some text from a channel, vs. using Eio.Path.load, and I noticed something very strange. When I changed my code to use Eio.Path.load, I consistently get the following exception:

Fatal error: exception Invalid_argument("open_beneath: file descriptor used after calling close!")

Now, it strikes me that this exception should not be possible if all I am doing is calling load, which should surely be managing its own resources. Is this expectation correct, or is there a reason why this might be happening?

EDIT: I notice this also happens when I use with_open_in, which I guess makes sense because load is a wrapper of this. But I am puzzled about how it could be happening nonetheless.

talex5 commented 3 weeks ago

Is this expectation correct, or is there a reason why this might be happening?

The path argument to load is a pair of a base directory and a path relative to that. If the base directory was opened (e.g. using open_dir) and later closed then the path cannot be used.

jonsterling commented 3 weeks ago

@talex5 Thanks, that is extremely helpful, I did not realise that the path itself is an ephemeral resource; I had wrongly assumed it was just pure data that can be used whenever (e.g. to acquire an actual resource, like a flow).

I think that your guess is correct, as in my code I have one routine that scans a directory to get a bunch of paths to work with, and then another routine that does something with the accumulated paths. Naturally, the directory resource is closed before the second routine runs. Do you have any recommendations about how to restructure this code to work properly with Eio's model?

talex5 commented 3 weeks ago

Note: open_dir creates a restricted sandbox environment for the path. If you don't need that, you can skip using open_dir.

Assuming you do want that, you can either pass a switch to the scanning routine, so that the opened directories remain open after it returns, or you could delay calling open_dir until you want to use the paths (so the scanner returns unconfined paths; this may also reduce the number of open FDs).

jonsterling commented 3 weeks ago

Oh! That's really helpful, thanks! For some reason I thought I needed to call open_dir. I appreciate your help!

kentookura commented 2 weeks ago

@talex5 Thanks, that is extremely helpful, I did not realise that the path itself is an ephemeral resource; I had wrongly assumed it was just pure data that can be used whenever (e.g. to acquire an actual resource, like a flow).

@jonsterling @talex5 Could it then be that the usage of this function is causing some issues?

https://git.sr.ht/~jonsterling/ocaml-forester/tree/main/item/bin/forester/main.ml#L8

On windows, I am hitting a exception Unix.Unix_error(Unix.ENOENT, "openat", "")

https://git.sr.ht/~jonsterling/ocaml-forester/tree/main/item/bin/forester/main.ml#L48 https://git.sr.ht/~jonsterling/ocaml-forester/tree/main/item/bin/forester/Process.ml#L8

In my work then, instead of https://git.sr.ht/~jonsterling/ocaml-forester/tree/main/item/lib/frontend/Parse.ml#L143

I have

let parse_file fp =
  let content = Eio.Path.load fp in
  parse_string content

which results in the above error.

talex5 commented 2 weeks ago

@kentookura I'm adding an example of reading files to examples/fs in #745. That's working on Windows in CI; does it work for you?

kentookura commented 2 weeks ago

Yes, the example works, thanks.