mirage / mirage-qubes

Mirage support for writing QubesOS AppVM unikernels
BSD 2-Clause "Simplified" License
62 stars 11 forks source link

Remove internal Cstruct.t #65

Closed palainp closed 4 months ago

palainp commented 5 months ago

Dear devs, This PR wants to remove most of the Cstruct.t uses (the remaining is kept for ocaml-vchan and probably it can be difficult to change the API as it uses io-pages with Xen, but there is some ppx_cstruct that couldn't be changed anyway). So as this was one of the first _un-ppxcstruct-ion for me, there is probably a lot of improvement to be done (I have a lot of Bytes.t / String conversion, and Bytes.blit that might be unwanted copies...) + I still have some missing Cstruct.hexdump-like that would be great to print out (those are limited to gUI.ml but qubes-mirage-firewall, that I use for testing the PR, doesn't use gUI.ml at all). Any review / comments are welcome :) Best,

palainp commented 5 months ago

Indeed a test suite would be great but I'm not sure to be able to write that :(

So far the only tests I've done are based on qubes-mirage-firewall usage and I didn't saw drop in performance (here the interface usage is quite limited and qubes-mirage-firewall bottleneck is more with Xen buffers for net packets than communication from/to using rExec, gUI, od dB). I however saw some bits of improvment because we now don't allocate some Cstruct.t, like:

[2024-03-12 16:51:22] 2024-03-12T15:51:22-00:00: [DEBUG] [qubes.db] "/connected-ips" = "10.137.0.13"
[2024-03-12 16:51:22] malloc(): for 0 @0, new memory usage = 5526936
[2024-03-12 16:51:22] malloc(): for 4 @0xac8fa0, new memory usage = 5526960
[2024-03-12 16:51:22] malloc(): for 8 @0xac8fc0, new memory usage = 5526984
[2024-03-12 16:51:22] malloc(): for 12 @0xac8fe0, new memory usage = 5527008
[2024-03-12 16:51:23] 2024-03-12T15:51:23-00:00: [DEBUG] [qubes.rexec] remote end wants to use protocol version 3, continuing with version 3

becomes:

[2024-03-12 16:47:05] 2024-03-12T15:47:05-00:00: [DEBUG] [qubes.db] "/connected-ips" = "10.137.0.13"
[2024-03-12 16:47:05] malloc(): for 12 @0xabc330, new memory usage = 5540816
[2024-03-12 16:47:05] 2024-03-12T15:47:05-00:00: [DEBUG] [qubes.rexec] remote end wants to use protocol version 3, continuing with version 3

Here I suspect 0B, 4B and 8B to be some msg_header or similar (0B is more probably one of the mutable buffer). And the remaining 12B is, to me, either another ppx_cstruct generated structure, probably elsewhere, or the result of the last Cstruct creation [Cstruct.of_bytes c].

About the cstruct I've kept it for the ocaml-vchan usage and this seems unavoidable as it relies on io-pages which need to be adresses aligned, and thus cannot be an Ocaml item.

palainp commented 5 months ago

And thanks for the comments :)

And a last note is to me, even if there is no memory nor bandwidth improvement, it would help regarding the GC job as Bytes.t seem to be better collected than malloced memory. So I'm only aiming to avoid a drop in performance :)

hannesm commented 5 months ago

Hmm, maybe ohex should in pp if the data is < 16 bytes not emit any newlines, and no addresses and no byte representation?

palainp commented 5 months ago

So with the last commit I now have only Strings and less copies. I'm quite not sure about the int32_le_string_creation (in lib/formats.ml):

let of_int32_le i =
  assert(i < Int32.max_int);
  let i = Int32.to_int i in
  String.init 4 (fun p -> Char.chr ((i lsr (p*8)) land 0xff))

There is probably something better to create strings as 4B from a int32?

About ohex.pp I would love if there is something like an optional parameter (default to not activated) to only prints the bytes values (I like the current separation between 2B packs, but without the addresses, nor the end of lines).

palainp commented 5 months ago

Oh :( String.get_int32_le only exists since 4.13. I'm not sure what's the best move bump lower to 4.13 (this is what I'd do but so far it was 4.06, the step is to big?) / stick with the Bytes / any intermediate solution?

A second point is now I have in lib/formats.ml a lot of getter/setter commented out, maybe it can be ok to remove all those lines?

reynir commented 5 months ago

Oh :( String.get_int32_le only exists since 4.13. I'm not sure what's the best move bump lower to 4.13 (this is what I'd do but so far it was 4.06, the step is to big?) / stick with the Bytes / any intermediate solution?

You can use Bytes.get_int32_le with Bytes.unsafe_of_string as an intermediate solution.

palainp commented 5 months ago

Thank you @reynir , I applied your suggestion in https://github.com/mirage/mirage-qubes/pull/65/commits/930796de415a13fe25039033bf6461e51cec41d2 :)

palainp commented 5 months ago

So far I think it's ready for review. EDIT: The original code used uint32_t with ppx_cstruct and I swapped to Int32.t everywhere. It seems that the manipulated values are fine with that modification (0<x<2^31).

The remaining thing I worried about it the last Cstruct.t bits (interface with ocaml-vchan) and particulary the Cstruct.t creation in lib/msg_chan.ml:66 (let cs = List.map Cstruct.of_string buffers in).

Using small strings in Cstruct.of_string still leads to small Cstruct.t creations. I haven't checked whether it's possible to concat all the strings together and use Vchan_xen.write instead of writev, but I wonder if that sounds plausible to change the ocaml-vchan API to accept Strings and deal with Cstruct.t there.

hannesm commented 4 months ago

I appreciate this work, and would like to spend some time reviewing it. About the ocaml-vchan "move to string" line of work -- this is best done as a separate step since it is quite involved. I suggest to review & merge this first (and cut a release), and once we're cstruct-free in mirage-flow etc., we can adapt the remaining occurences here.

hannesm commented 4 months ago

I took the liberty to cleanup whitespaces, rename String.t to string, and remove Lwt.fail_with usage.

IMHO fine to merge & release.

palainp commented 4 months ago

Thank you for your review and the modifications!