aantron / hyper

OCaml Web client, composable with Dream [unannounced]
MIT License

MirageOS compatibility #1

Open dinosaure opened 2 years ago

dinosaure commented 2 years ago

Again 🙂. However, for the client part of the HTTP protocol, the MirageOS situation is slightly more complex. A connect function that allocates a resource (such as a socket) is an anti-pattern for MirageOS for one reason: the connect function inherently depends on the implementation. From the MirageOS perspective, an application should not depend on a specific implementation; we mostly want to work with interfaces rather than their implementations (which depend on the target chosen by the user).

This is the reason conduit and/or mimic exist. They hide the implementation-specific connect function behind Conduit.resolve/Mimic.resolve. Then, a ceremony is needed to tell conduit/mimic which implementation you want to use to establish this kind of connection.

In the case of conduit, you express that you want TCP 80 or TLS (TCP 443), for instance, regardless of the implementation that is then used to give you a Mirage_flow.S: it can be lwt_ssl + lwt.unix, ocaml-tls + lwt.unix or ocaml-tls + mirage-tcpip (or something else). However, we had a lot of trouble extending conduit to other protocols such as Git.

mimic aims to replace conduit with something extensible, depending on what the user wants. That is why there is a Git client with mimic and an HTTP client (with paf-cohttp) with mimic. For more details about mimic, please read the documentation: https://dinosaure.github.io/mimic/mimic/index.html

More concretely, MirageOS compatibility expects at least one thing: an interface (like the one hyper provides) that lets the user pass a ctx (the conduit context or the mimic context). A high-level explanation of this context is available here: https://github.com/mmaker/ocaml-letsencrypt/blob/c07348604bb94eaab9522fe87e455bb729e5d1d8/src/hTTP_client.ml#L2-L24. Then, if we are able to initiate a connection via mimic: 1) we can provide a default mechanism (as git-unix does, for instance) that instantiates a Mirage_flow.S, which is then used by the runtime loop (for h2 or http/af); 2) we can let mirage provide its own context, depending on the implementation chosen by the user.
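To make the idea of "passing a ctx" concrete, here is a minimal, dependency-free sketch. It is not mimic's or conduit's API: the types flow and ctx, the values empty and loopback, and the get function are all hypothetical, and direct-style calls stand in for Lwt promises. It only shows that the library code never names a connect function; the application decides how a flow is obtained.

```ocaml
(* Toy model of context passing. NOT the mimic or conduit API:
   every name below is hypothetical and I/O is direct-style. *)

(* A flow is whatever the context hands back: something to send on
   and receive from. *)
type flow = { send : string -> unit; recv : unit -> string }

(* The context carries the knowledge of how to reach a peer. *)
type ctx = { resolve : host:string -> port:int -> flow option }

(* An empty context knows no way to connect at all. *)
let empty = { resolve = (fun ~host:_ ~port:_ -> None) }

(* Library side: only the context is used, never a concrete connect. *)
let get ~ctx ~host ~port path =
  match ctx.resolve ~host ~port with
  | None -> Error "no way to connect in this context"
  | Some flow ->
    flow.send ("GET " ^ path);
    Ok (flow.recv ())

(* Application side: an in-memory context that echoes what it was sent. *)
let loopback =
  { resolve =
      (fun ~host ~port ->
        let pending = ref "" in
        Some
          { send = (fun s -> pending := s);
            recv = (fun () -> Printf.sprintf "%s:%d got %s" host port !pending)
          }) }
```

With this shape, a Unix package could ship a default context built on sockets, while a unikernel would build its own context from the stack it was configured with.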

aantron commented 2 years ago

I'm not fully sure what is necessary here. However, Hyper does not directly rely on any implementation. As you can see from ?server arguments to all the functions

https://github.com/aantron/hyper/blob/d4f41825bd25b0da48cb75f127257204b46ce76a/src/hyper.mli#L20-L24

...the actual implementation can be anything. The default one does HTTP over Unix sockets. However, already within this repo, in the test cases, there is an example of directly passing requests to a server function (since servers are request -> response promise):

https://github.com/aantron/hyper/blob/d4f41825bd25b0da48cb75f127257204b46ce76a/test/expect/http/nohttp/nohttp.ml#L8-L13

This test suite does not do any Unix I/O at all while running requests; it's all in-memory and in-process.
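The shape described here (a client function with an optional ?server argument, where a server is just a function from request to response) can be sketched as follows. This is an illustration under assumptions, not Hyper's actual signatures: the request and response records, network_server, in_memory and this get are hypothetical, and the response is returned directly rather than as a promise to keep the sketch dependency-free.

```ocaml
(* Hypothetical sketch of a client with a pluggable server.
   Not Hyper's real types; promises are dropped for simplicity. *)

type request = { meth : string; target : string }
type response = { status : int; body : string }

(* A server is only a function from request to response. *)
type server = request -> response

(* Stand-in for the default, network-backed server. In a real library
   this is the only part that would need sockets. *)
let network_server : server =
 fun _ -> failwith "no network in this sketch"

(* The client never touches the network itself; it delegates to
   whichever server it is given. *)
let get ?(server = network_server) target =
  let response = server { meth = "GET"; target } in
  response.body

(* An in-memory, in-process server, as in the test case mentioned above. *)
let in_memory : server =
 fun request -> { status = 200; body = "Hello from " ^ request.target }
```

Calling get ~server:in_memory "/hello" runs entirely in-process; only the default value of ?server would pull in a transport.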

For Mirage compatibility, we would, I guess, need to port the protocol adapters to run in Mirage, and then possibly add some optional arguments or a way of customizing ~server, or an equivalent to that.

dinosaure commented 2 years ago

> I'm not fully sure what is necessary here. However, Hyper does not directly rely on any implementation.

It's not strictly about the API but about linking 🙂. The final question is: can you provide an implementation and an API that do not require the unix module? Currently, hyper relies on the unix module both at link time and by the design of its implementation. conduit or mimic makes it possible to implement an HTTP client (or a Git client) that does not depend on the unix module at link time.

aantron commented 2 years ago

Hyper does not rely on linking, it just happens to use it. Otherwise, yes.

dinosaure commented 2 years ago

Yes, but this is the main issue for any project targeting MirageOS: you must not depend on unix. From an OS perspective, we want to statically link everything without POSIX symbols or a libc. Because of this, none of our dependencies should depend on unix.cmxa: in the end we do not link with the host's C library and we do not rely on the POSIX API. More concretely, until you remove the unix dependency (direct or transitive), you are not compatible with MirageOS.

Now, the problem becomes more difficult for hyper because we usually want to implement a few functions like get, post, etc. From a MirageOS perspective, however, a question remains: how do we implement these functions without {Lwt_u,U}nix.connect?

For this problem, the usual solution is to use a functor whose argument should be close to:

module type SOCKET = sig
  type socket
  type ipaddr (* address type, left abstract here *)

  val read : socket -> bytes Lwt.t
  val write : socket -> bytes -> unit Lwt.t
  val connect : ipaddr -> socket Lwt.t
end

Then, we just need to specialize it for hyper-unix (for example), and functoria/mirage will specialize it with POSIX syscalls or mirage-tcpip (depending on the target chosen by the user). However, as I said, the connect function is an anti-pattern for MirageOS. Even if we can find some solution, you must think about an ultimate connect function that can be provided by several implementations.
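As a rough sketch of what "specializing" such a functor looks like, the following applies a client core to an in-memory socket. It rests on assumptions: Make_client, Memory_socket and Client are hypothetical names, the request line is a toy, and direct-style I/O replaces the Lwt.t results so that the example needs nothing beyond the standard library.

```ocaml
(* Dependency-free sketch: direct-style I/O stands in for Lwt promises. *)
module type SOCKET = sig
  type socket
  type ipaddr

  val connect : ipaddr -> socket
  val write : socket -> bytes -> unit
  val read : socket -> bytes
end

(* The client core only sees the SOCKET interface, never Unix. *)
module Make_client (S : SOCKET) = struct
  let get addr path =
    let sock = S.connect addr in
    S.write sock
      (Bytes.of_string (Printf.sprintf "GET %s HTTP/1.1\r\n\r\n" path));
    Bytes.to_string (S.read sock)
end

(* Hypothetical in-memory implementation. A Unix package would provide
   one backed by sockets, a unikernel one backed by its TCP/IP stack. *)
module Memory_socket = struct
  type ipaddr = string
  type socket = { peer : ipaddr; mutable last_request : string }

  let connect peer = { peer; last_request = "" }
  let write sock data = sock.last_request <- Bytes.to_string data

  (* Echo the peer and the trimmed request back to the caller. *)
  let read sock =
    Bytes.of_string
      (Printf.sprintf "%s answered: %s" sock.peer
         (String.trim sock.last_request))
end

(* Specialization: the same core, linked against one implementation. *)
module Client = Make_client (Memory_socket)
```

Note that connect is still part of the signature here, which is exactly the point raised above: every implementation has to agree on one ipaddr-shaped way of connecting.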

From our experience with conduit, we know that it's a no-go. Even if conduit is able to provide a connect function that is free of any syscalls, we know that the library is not really extensible, for many reasons (the first one is historical: conduit is really old). The second solution is mimic 🙂.

Maybe the best thing for us is to propose something first and see how it fits the project. But I'm not going to hide from you that the change is likely to be significant.

aantron commented 2 years ago

Hyper is already modularized so that all the code containing the connect function is in a lower-level module; indeed, it is called connect.ml at the moment. As already mentioned, Hyper can already run without connect.ml by calling into Dream servers directly. From here, it's only a matter of separating the modules again into packages, and composing hyper from hyper-pure and Unix details, while hyper-mirage will compose hyper-pure with Mirage details.