Open aantron opened 3 years ago
First of all many thanks for putting this roadmap together @aantron !
You do not mention thread-safety of libuv as a benefit when multicore OCaml is just around the corner. Won't libuv largely help in preparing Lwt to work in multicore paradigm?
@Lupus that's right. I couldn't remember all the usual benefits :) the libuv API makes it easy to run multiple event loops, one per system thread. It's not quite thread-safety — that's still the responsibility of the libuv API user (whether a person or a higher-level library). You still have to create the multiple event loops and submit work from each thread to the right loop. libuv offers a cross-platform thread-local storage API that should ease various kinds of integration arrangements. The binding Luv has an issue about registering GC roots in libuv's TLS if the binding's API is changed to accept OCaml values.
One Lwt "instance" per real thread sounds ideal to me :blush: I bet libuv offers rich API for communicating across threads, so one can easily arrange individual Lwt loops to communicate and do the work at multiple cores, while within each core it would be familiar look and feel of non-preemptive concurrency. Interesting edge cases include non-blocking wait of a real mutex or condition, probably some real thread will have to be sacrificed to block on it, and notify some Lwt loop(s) about completion.
> I bet libuv offers rich API for communicating across threads

`uv_async_t` for delivering a message to another loop (potentially in another thread). The OCaml version is `Luv.Async`. It may be time to start to flesh out the OCaml docs a bit more :)
> non-blocking wait of a real mutex or condition

AFAIK, yes. libuv provides cross-platform wrappers of real mutexes and conditions (libuv, Luv), but they are sort of separated from the whole "loop"-based API, and are just standalone wrappers. So there is no direct way to do a non-blocking wait on them using libuv.
At least Windows has `WaitForMultipleObjects` that AFAIK allows this :)
Hi @aantron, this may be a non-issue on account of a misunderstanding on my part, but how does the vendoring in of `libuv` play out in the following scenario:

- `Lwt` in my program, which uses `libuv` (vendored version 1) via `Luv`
- `libuv` (non-vendored version 2) under its hood

At link time, I have multiple definitions of `libuv` symbols. I imagine this might require some careful crafting of the link-line to ensure that the correct symbols are used in the correct place(s)? In a largish executable, sometimes you do not know what dependencies your dependency manager has brought in - so the careful crafting may be fragile?
Also, similar to this question in the Python world, would it be a) possible, and b) beneficial, in any case, to be able to share the `libuv` loop between libraries, e.g. consider the potential for a `libuv` based adapter for hiredis - does the decision to vendor `libuv` impact that? How would you share event loops from two different source versions of `libuv`?
As I understand it, code shouldn't be linked against two versions of libuv or two instances of libuv at all.
Ideally, the whole project could be configured to use the vendored libuv. The archive and headers are installed in the opam switch/esy sandbox.
Alternatively, we could provide a way for Luv to be built against an external libuv.
The current integration of Lwt with libuv uses libuv's default loop, so it is already shared and visible to anything else that wants to use it (and everything does, by default). Eventually, we will create additional loops for additional threads.
> How would you share event loops from two different source versions of libuv?

This seems highly questionable at first. Can you link to or describe an instance where this is done? The issue is that libuv isn't a "leaf" library, but more like a "framework," since it takes over driving the application and its I/O. Unless the loops are running in different threads, I don't see how (practically) one would be able to use two different loops at all, or would want to, etc.
For `ocaml-hiredis`, you would simply have the adapter depend on `luv` (or just `lwt` like it already does, if/when `lwt` itself depends on `luv`, though you may still want to make the dependency explicit). Then, you could write any C code against the vendored libuv, use the vendored headers, and gain access to the same loops used anywhere else in the final linked program (as long as you could get a reference to them). You can trivially get access to the default loop through libuv's APIs, whether in C, or in OCaml through the bindings exposed by Luv. In this case, the decision to vendor libuv should simplify using it.
The other discussion you linked to doesn't seem to be about having multiple libuvs linked in, but about sharing one libuv between multiple libraries. That's trivially possible with the vendored libuv as described above.
You may be interested in Luv's depending on headers test, which is a trivial OCaml program meant to be installed in opam (or esy) in the usual way, which has some C stubs that access the vendored libuv.
If binding to a bigger C library, you would ideally configure it to find the vendored headers and archive in the opam switch, and build it (essentially the same as Luv does with libuv).
> As I understand it, code shouldn't be linked against two versions of libuv or two instances of libuv at all.
> Ideally, the whole project could be configured to use the vendored libuv. The archive and headers are installed in the opam switch/esy sandbox. Alternatively, we could provide a way for Luv to be built against an external libuv.

Thanks @aantron. I think we might need a way for `Luv` to be built against an external `libuv` at my organization (we don't use `opam`). Our package management is based on `dpkg` and there will be only one version of `libuv` used by all packages (OCaml or otherwise) in a distribution - and we can't dictate which one. Before discussing potential solutions though, let me try packaging `Luv` for our package manager, in its current state and get back to you on what, if any, issues I face.

> If binding to a bigger C library, you would ideally configure it to find the vendored headers and archive in the opam switch, and build it (essentially the same as Luv does with libuv).

I don't think I can get the C library build configuration to care about the existence of `opam`, or even `OCaml` - it's distributed as a pre-built artifact (`.deb`).

> This seems highly questionable at first. Can you link to or describe an instance where this is done? The issue is that libuv isn't a "leaf" library, but more like a "framework," since it takes over driving the application and its I/O. Unless the loops are running in different threads, I don't see how (practically) one would be able to use two different loops at all, or would want to, etc.

I cannot link to an instance where this is done; I was speculating about the potential for it - but thank you, the "leaf" vs "framework" distinction makes sense to me now. Also makes sense that `ocaml-hiredis` can take a `Luv` adapter.
This is not entirely related, but would it be possible to take this opportunity to introduce a `lwt_unix` package? Since you're going to introduce a dependency on luv, it would be nice if the jsoo users of lwt that don't need an event loop will be able to use it.

> It will no longer be possible to swap the `Lwt_engine` (in a reasonable way).

If `Lwt_engine` is removed, what's going to happen to libraries like `lwt_glib`? Not that I use gtk, but it seems like we'd lose the ability to make Lwt work with custom event loops.
> This is not entirely related, but would it be possible to take this opportunity to introduce a `lwt_unix` package? Since you're going to introduce a dependency on luv, it would be nice if the jsoo users of lwt that don't need an event loop will be able to use it.

I'd like to second that, we're using lwt with jsoo right now to maintain a codebase that targets both native and js targets.
> This is not entirely related, but would it be possible to take this opportunity to introduce a `lwt_unix` package?

Yes, this seems like a good opportunity to do so.
> If `Lwt_engine` is removed, what's going to happen to libraries like `lwt_glib`? Not that I use gtk, but it seems like we'd lose the ability to make Lwt work with custom event loops.

Indeed. There may be other ways to integrate custom event loops with libuv, but `lwt_glib` in its current form probably won't work. I understood from @avsm that 0install, the main known user of `lwt_glib`, will adapt to the change somehow (cc @talex5).
> I understood from @avsm that 0install, the main known user of `lwt_glib`, will adapt to the change somehow (cc @talex5).

First I've heard of it. Though when `lwt_glib` got split off from lwt, Debian took the opportunity to drop the package completely, so 0install is now having to vendor it there, which is a bit painful. Maybe we should just stop using Lwt in the GUI and go back to callbacks?
Now that @ulrikstrid has begun the libuv conversion (#328) in #811, I'd like to outline an overall plan for how to finish the whole process sanely :)
The technical steps are:
1. ✔️ Create a new `Lwt_engine` based on libuv (#811).

   This allows quickly replacing the `lwt.unix` main loop by libuv's main loop, by telling libuv to poll `lwt.unix`'s fds.

   This is not how libuv's API is intended to be used (and it does not use the vast majority of the libuv API). However, it allows us to switch to libuv without touching the corresponding vast majority of the code in `Lwt_unix`. We essentially connect two plugin APIs that were meant to be used together: Lwt already allows replacing its polling engine, and libuv offers a polling engine.

   This polling of `lwt.unix`'s fds by libuv becomes a fallback implementation of everything in `lwt.unix`, allowing us to do further work piecemeal, yet still have a working library throughout.

   The libuv-based `Lwt_engine` initially won't support Windows.

2. Reimplement system calls by forwarding directly to libuv.

   Like `lwt.unix`, libuv offers an asynchronous version of a large part of the Unix system call API. For example, see the filesystem operations available.

   So, we will continue by changing e.g. `Lwt_unix.openfile` to call `uv_fs_open`.

   This will bypass the `Lwt_engine` and Lwt's thread pool, and directly use libuv's `uv_loop_t` (its "engine") and libuv's thread pool. It will also allow us to delete large amounts of C code from Lwt.

   We will have to do this work one system call category at a time. libuv exposes different fd-like types in each category, so `Lwt_unix.file_descr` will have to internally become a sum type of these libuv types.

   To get a smooth transition, we will have to write many tests to discover API quirks. There is already a `lwt.unix` testing issue open for this purpose (#539), and it offers one categorization of the `lwt.unix` API. libuv's API is categorized in its documentation table of contents.

3. Replace the Lwt thread pool by the libuv thread pool.

   After (2), there should be few system calls left implemented in Lwt that use Lwt's own thread pool. We can then reimplement it over libuv's thread pool with little stress.

   We may opt to use libuv's thread pool only internally, and leave the current Lwt thread pool implementation around to satisfy existing users.

4. Port to Windows.

   The reasons for doing this last are:

   - The highest-quality Windows code in libuv is in its specific APIs (point 2) rather than in the polling engine (1). Since direct calls to libuv (2) will gradually replace explicit polling (1), it seems wasteful to adapt existing Lwt code to libuv polling quirks on Windows after (1), only to then delete and replace the code in (2). It's likely that after (2), many categories of system calls will work on Windows immediately, due to libuv's own Windows support.

     In other terms, we would like to bypass relying on libuv's Windows polling because we are not sure about the quality of the implementation (both due to libuv and due to Windows; I think Windows is focused on other styles of API than polling).

   - Likewise, the libuv thread pool in (3) is portable to both Unix and Windows. It seems best not to first adapt the `lwt.unix` thread pool to quirks of polling on Windows, if we only intend to replace it by libuv's own portable thread pool in (3) anyway.

   - libuv has projects to improve categories of system calls on Windows (at least pipes). If we port to Windows later, these may already have matured in libuv.
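Step (2) says that `Lwt_unix.file_descr` will have to become a sum type over libuv's per-category handle types, while the polling from step (1) stays available as a fallback for anything not yet ported. Below is a minimal sketch of that shape in plain OCaml, with no dependency on Luv. The `uv_file`, `uv_tcp`, and `uv_pipe` types, the `Polled` constructor, and the `dispatch` and `describe` functions are all hypothetical names chosen for illustration; they are not Lwt's or Luv's actual definitions.

```ocaml
(* Hypothetical stand-ins for libuv's per-category handle types. In a real
   port these would be supplied by a binding such as Luv; here each one
   just wraps an integer id so that the sketch is self-contained. *)
type uv_file = Uv_file of int
type uv_tcp = Uv_tcp of int
type uv_pipe = Uv_pipe of int

(* Assumed shape of the internal descriptor type: one constructor per
   libuv category that has been ported, plus a raw fd that the
   libuv-based engine merely polls (the fallback from step 1). *)
type file_descr =
  | File of uv_file
  | Tcp of uv_tcp
  | Pipe of uv_pipe
  | Polled of int

(* The code path an operation on a descriptor would take. *)
type path = Direct_libuv | Engine_fallback

let dispatch = function
  | File _ | Tcp _ | Pipe _ -> Direct_libuv
  | Polled _ -> Engine_fallback

(* Human-readable description of how a descriptor would be handled. *)
let describe descr =
  match descr with
  | File (Uv_file id) -> Printf.sprintf "file handle %d: direct libuv call" id
  | Tcp (Uv_tcp id) -> Printf.sprintf "TCP handle %d: direct libuv call" id
  | Pipe (Uv_pipe id) -> Printf.sprintf "pipe handle %d: direct libuv call" id
  | Polled raw -> Printf.sprintf "fd %d: polled by the libuv engine" raw

let () =
  List.iter
    (fun descr -> print_endline (describe descr))
    [ File (Uv_file 1); Tcp (Uv_tcp 2); Polled 7 ]
```

Under this arrangement, porting a system call category would amount to moving its descriptors from `Polled` to a dedicated constructor, and the compiler's exhaustiveness check on `match` would flag any operation not yet updated for the new constructor.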
Organizationally, I propose to do the work in a branch starting from (2). Once we begin converting to direct calls in (2), `lwt.unix` will have a rigid dependency on libuv. It will no longer be possible to swap the `Lwt_engine` (in a reasonable way). `lwt.unix` also won't support Windows for a while. So this will be a breaking change, and it will also take some time to finish and stabilize.
On the other hand, working in a branch allows for easy pinning and cherry-picking.
We can prefix issues or PRs having to do with the branch with [libuv] or [luv] (for the name of the binding).
And last, to summarize some of the benefits:
`conf-libev`, or install libev system-wide.
Edited 28 October 2020.