Closed ArtificialAmateur closed 2 years ago
Can you try to run in your compilation VM objdump -D qubes_firewall.xen
and then go down to28bded
?
My faulty address is in the fmt_fp
function (you'll have to scroll up for a while). If it's the same for you, it could mean that floating printf is buggy (fmt_fp
is terrific ! you can see it at https://github.com/mirage/ocaml-freestanding/blob/421bc48d1fd6f5ff04a578cfdd50b2bd1e132dc0/nolibc/vfprintf.c#L205).
I have tried removing floating prints from the unikernel and it seems to work fine (it starts up fine), can you try the same?
$ git diff
diff --git a/memory_pressure.ml b/memory_pressure.ml
index cecf4a9..ea18674 100644
--- a/memory_pressure.ml
+++ b/memory_pressure.ml
@@ -16,10 +16,10 @@ let meminfo stats =
let { OS.Memory.free_words; heap_words; _ } = stats in
let mem_total = heap_words * wordsize_in_bytes in
let mem_free = free_words * wordsize_in_bytes in
- Log.info (fun f -> f "Writing meminfo: free %a / %a (%.2f %%)"
+ (*Log.info (fun f -> f "Writing meminfo: free %a / %a (%.2f %%)"
Fmt.bi_byte_size mem_free
Fmt.bi_byte_size mem_total
- (fraction_free stats *. 100.0));
+ (fraction_free stats *. 100.0));*)
Printf.sprintf "MemTotal: %d kB\n\
MemFree: %d kB\n\
Buffers: 0 kB\n\
diff --git a/unikernel.ml b/unikernel.ml
index 0621e42..2fae65e 100644
--- a/unikernel.ml
+++ b/unikernel.ml
@@ -56,7 +56,7 @@ module Main (R : Mirage_random.S)(Clock : Mirage_clock.MCLOCK) = struct
let time_in_ns = Clock.elapsed_ns () - start_time in
Int64.to_float time_in_ns /. 1e9
in
- Log.info (fun f -> f "QubesDB and qrexec agents connected in %.3f s" startup_time);
+ (*Log.info (fun f -> f "QubesDB and qrexec agents connected in %.3f s" startup_time);*)
(* Watch for shutdown requests from Qubes *)
let shutdown_rq =
OS.Lifecycle.await_shutdown_request () >>= fun (`Poweroff | `Reboot) ->
I can confirm both the error and the fix on Qubes 4.1 with qubes-mirage-firewall (current master@ef2419bf6f49c646b0a5eb3346fc7ec9555fa04f). Good catch @palainp !
@ra-- unfortunately the hardest part of the job will be figuring out why the floating point printf fails now and to fixing the ocaml-freestanding side. I'm currently still on 4.0.4 (EDIT: not 3.x) Qubes OS and can't test how to fix that now :(
Update: So I installed Qubes 4.1 and I now have a really strange observation here. If I try to produce a "minimal bug unikernel", starting with mirage-skeleton/hello_world I can have these results:
$ cat unikernel.ml
open Lwt.Infix
module Hello (Time : Mirage_time.S) = struct
let start _time =
let rec loop = function
| 0 -> Lwt.return_unit
| n ->
let stats = OS.Memory.quick_stat () in
let { OS.Memory.free_words; heap_words; _ } = stats in
let value = free_words in
Logs.info (fun f -> f "printing A %a" Fmt.byte_size value);
Logs.info (fun f -> f "printing B %a" Fmt.bi_byte_size value);
Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
loop (n-1)
in
loop 4
end
And in the dom0 console logs:
[2022-02-23 12:34:30] Logfile Opened
[2022-02-23 12:34:31] Solo5: Xen console: port 0x2, ring @0x00000000FEFFF000
[2022-02-23 12:34:31] | ___|
[2022-02-23 12:34:31] __| _ \ | _ \ __ \
[2022-02-23 12:34:31] \__ \ ( | | ( | ) |
[2022-02-23 12:34:31] ____/\___/ _|\___/____/
[2022-02-23 12:34:31] Solo5: Bindings version v0.6.9
[2022-02-23 12:34:31] Solo5: Memory map: 64 MB addressable:
[2022-02-23 12:34:31] Solo5: reserved @ (0x0 - 0xfffff)
[2022-02-23 12:34:31] Solo5: text @ (0x100000 - 0x1f6fff)
[2022-02-23 12:34:31] Solo5: rodata @ (0x1f7000 - 0x231fff)
[2022-02-23 12:34:31] Solo5: data @ (0x232000 - 0x2fbfff)
[2022-02-23 12:34:31] Solo5: heap >= 0x2fc000 < stack < 0x4000000
[2022-02-23 12:34:31] 2022-02-23 17:34:31 -00:00: INF [application] printing A 7.28MB
[2022-02-23 12:34:31] Solo5: trap: type=#MF ec=0x0 rip=0x19b356 rsp=0x3ffdc00 rflags=0x10216
[2022-02-23 12:34:31] Solo5: ABORT: cpu_x86_64.c:181: Fatal trap
So the fatal trap is also hidden into the binary pretty print formatter (which do manipulation with floats)... And what is also troubling me is that:
value
argument by ~any random~ (EDIT: some carefully chosen) constant integer with the Qubes 4.1 xen it runs fine!I have to report that the last build on https://builds.robur.coop/job/qubes-firewall/ is still suffering from this issue. If I don't build it myself, should I just wait for a few days?
Thanks for your reporting !
The PR fixing this issue has been integrated in version 0.8.0 of ocaml-solo5 which is only usable with mirage 4. As mirage 4 has just been released a few days ago, I guess it will take a bit before it can be used in automatic build systems.
Good news, since the most recent build today (https://builds.robur.coop/job/qubes-firewall/build/2b271b28-0cb5-4566-9350-18d6d7c29119/ and later) this issue should be fixed in the binary builds as well @njk2002
Will there be a new release soon? Would love to use it on Qubes 4.1!
Yes, we'll soon have releases, esp. now that lots of outstanding issues have been cleared (PRs and merges pending). Expect a new release within the next two weeks :)
Building from source from the current master HEAD gives the following error when deployed and ran:
Reopening issue #128.