containers / libkrun

A dynamic library providing Virtualization-based process isolation capabilities
Apache License 2.0

[RFC] Add X11 cross domain channel type. #213

Status: Open — opened by WhatAmISupposedToPutHere 2 months ago

WhatAmISupposedToPutHere commented 2 months ago

This adds the X11 cross-domain channel and the ability to share futexes between the VM and the host. Since it requires DAX to work, the commits are currently on top of @asahilina's branch that enables DAX, but it should be possible to rebase them on top of whatever solution ends up being used for it.

slp commented 2 months ago

This is an interesting approach, indeed. Please consider rebasing it on top of #212 and providing some instructions on how to test it.

WhatAmISupposedToPutHere commented 2 months ago

I have attempted to rebase it on the specified branch; however, if I try to enable DAX with it, the VM crashes with the following trace:

RUST_BACKTRACE=full krun bash
thread '<unnamed>' panicked at src/devices/src/virtio/fs/worker.rs:161:18:
called `Result::unwrap()` on an `Err` value: QueueReader(FindMemoryRegion)
stack backtrace:
   0:     0xffff7f8462f8 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h677fb81bc4f23286
   1:     0xffff7f6e9ac0 - core::fmt::write::h2b46371187d5b982
   2:     0xffff7f826db8 - std::io::Write::write_fmt::h250845b38108f265
   3:     0xffff7f84a420 - std::sys_common::backtrace::print::hf58378670760a884
   4:     0xffff7f849c60 - std::panicking::default_hook::{{closure}}::h60e1dc99a4ee0343
   5:     0xffff7f84afa0 - std::panicking::rust_panic_with_hook::h0985b2ea17fc89a2
   6:     0xffff7f84a75c - std::panicking::begin_panic_handler::{{closure}}::h4c693bee5468b5eb
   7:     0xffff7f84a6c8 - std::sys_common::backtrace::__rust_end_short_backtrace::hea1747cffd850820
   8:     0xffff7f84a6bc - rust_begin_unwind
   9:     0xffff7f6b1160 - core::panicking::panic_fmt::hd485fe77a335e582
  10:     0xffff7f6b1480 - core::result::unwrap_failed::h3e893060b0a81569
  11:     0xffff7f6fe30c - devices::virtio::fs::worker::FsWorker::handle_event::h809805dafe3be340
  12:     0xffff7f71bba8 - std::sys_common::backtrace::__rust_begin_short_backtrace::he4a566c00aeb29c0
  13:     0xffff7f7766b4 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h7f5b073326d5a5ad
  14:     0xffff7f84ba70 - std::sys::pal::unix::thread::Thread::new::thread_start::h023e30dcbe44c4e1
  15:     0xffff7f4e69d8 - start_thread
  16:     0xffff7f551dcc - thread_start
  17:                0x0 - <unknown>
slp commented 2 months ago

> I have attempted to rebase it on the specified branch, however if i try to enable dax with it, the vm crashes with the following trace: […]

This means the descriptor address sent by the guest points to a location that's not registered in GuestMemory. I can't reproduce that here; could you please give me some info about your test system (architecture, OS, and amount of RAM)?
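Illustratively, the failure mode can be modeled with a toy region lookup. The names below (`GuestRegion`, `find_region`, the addresses) are hypothetical stand-ins for libkrun's actual guest-memory types, not its real API:

```rust
// Toy model of guest-physical memory regions, illustrating why a
// descriptor address inside an unregistered DAX window fails the lookup.
#[derive(Debug)]
struct GuestRegion {
    start: u64, // guest-physical base address
    len: u64,   // region length in bytes
}

// Hypothetical stand-in for a GuestMemory region lookup: returns the
// region containing `addr`, or None if no registered region does.
fn find_region(regions: &[GuestRegion], addr: u64) -> Option<&GuestRegion> {
    regions.iter().find(|r| addr >= r.start && addr < r.start + r.len)
}

fn main() {
    // Guest RAM registered with GuestMemory (addresses are made up)...
    let regions = vec![GuestRegion { start: 0x8000_0000, len: 32u64 << 30 }];
    // ...but the DAX window lives above it and was not registered.
    let dax_window_addr = 0x8000_0000u64 + (32u64 << 30) + 0x1000;

    // An address inside registered RAM resolves fine.
    assert!(find_region(&regions, 0x8000_0000).is_some());
    // An address in the unregistered window yields None; unwrapping the
    // resulting error is what surfaces as QueueReader(FindMemoryRegion).
    assert!(find_region(&regions, dax_window_addr).is_none());
    println!("lookup behaves as described");
}
```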

WhatAmISupposedToPutHere commented 2 months ago

arm64 (M1 Pro), Linux, 32 GB

slp commented 2 months ago

I suspect virtio-fs will sometimes be asked to operate on the window from the host side, which is something I hadn't anticipated. I'll verify this and, if that's the case, I'll update #212 to cover the SHM regions for virtio-fs under the GuestMemory umbrella.

slp commented 2 months ago

@WhatAmISupposedToPutHere I'm still unable to reproduce the issue here. I could implement the changes to cover the range in GuestMemory anyway, but I'd like to be able to test it properly first. Could you please launch krun with RUST_LOG=devices::virtio::fs=debug?

asahilina commented 2 months ago

The M1/M2 Pro/Max series have physical RAM mapped at a higher address and a larger physical and virtual address space size (including for VMs) than baseline M1/M2 chips (42 bits vs 36 bits I think?). @slp if you are testing on the baseline chips then that's probably the difference...

WhatAmISupposedToPutHere commented 2 months ago

Link to the log: https://pastebin.com/XMGbr1cG

slp commented 2 months ago

Thanks a lot for the logs. With that info, I was able to reproduce the issue here. As suspected, virtio-fs is being asked to read directly from a mapped region. I'll update #212 to cover those regions under GuestMemory.

slp commented 2 months ago

@WhatAmISupposedToPutHere Could you please try again with the latest version of #212 ?

WhatAmISupposedToPutHere commented 2 months ago

I can confirm that it works with the latest version of dax patches.

The test setup right now is a bit involved, but here are the steps:

  1. Compile and install this branch (duh).
  2. Install krun from this branch: https://github.com/WhatAmISupposedToPutHere/krun/tree/frankenbuild
  3. Get https://github.com/WhatAmISupposedToPutHere/x112virtgpu, build it, and make the binary setuid root.
  4. Enter krun (e.g. with krun bash).
  5. Start x112virtgpu in the background.
  6. Unset WAYLAND_DISPLAY and export DISPLAY=:0.
  7. Run glxgears or whatever X11 app you want to test.
  8. Additionally, you can build preload.so from the x112virtgpu repo and LD_PRELOAD it. It is not strictly required, but it makes things run better.
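Assuming the layout above (daemon built as x112virtgpu in the current directory — a path assumed here, not specified upstream), steps 5–8 inside the microVM might look like:

```shell
# Inside the microVM started with: krun bash

# 5. Start the daemon in the background (path assumed):
#    ./x112virtgpu &

# 6. Point X11 clients at the forwarded display:
unset WAYLAND_DISPLAY
export DISPLAY=:0

# 7. Run any X11 app:
#    glxgears

# 8. Optional, improves compatibility for some apps:
#    LD_PRELOAD=./preload.so glxgears
```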
slp commented 2 months ago

Thanks for the instructions. Got it working here, successfully running some apps like glxgears and Xonotic. It seems to struggle with others like xeyes, Firefox or Steam. Seems promising nevertheless. Should it be possible to support roughly every X11 app, or are there intrinsic limitations to this approach?

WhatAmISupposedToPutHere commented 2 months ago

In my testing, Steam actually worked fine, aside from popup windows showing as black on first draw. Firefox also seems to be fine (I am sending this comment from a copy of Firefox running under this thing). xeyes does in fact crash and will need further investigation. LD_PRELOAD generally increases compatibility (e.g. Team Fortress 2 and Firefox require it). The thing is pretty WIP for now, and bugs should be expected in the x112virtgpu daemon, but this diff should work okay.

This approach should support nearly every X11 app, except those that hard-require MIT-SHM or XVideo and have no fallbacks. Supporting those protocols would require adding new Linux syscalls on both the host and the guest OS.

EDIT: you are testing it on arm64, right? x112virtgpu only supports that architecture for now (it does code injection into the client).

asahilina commented 3 days ago

This can be closed in favor of #232