PCI passthrough not working for HVM domains

esheltone commented 8 years ago

There have been multiple reports that PCI passthrough does not work for HVM domains using the Qubes software:

https://groups.google.com/d/msg/qubes-users/cmPRMOkxkdA/gIV68O0-CQAJ (reporting passthrough not working via libvirt, but that passthrough still could be done using Xen xl)
https://groups.google.com/d/msg/qubes-users/ExMvykCyYiY/M3nHxweRFAAJ (confirmation by Marek that passthrough was not working on R3)
https://groups.google.com/d/msg/qubes-users/ppKj_YWqr94/l2gHv6uJAgAJ

This issue appears to have started with use of the HAL in Qubes R3. PCI passthrough continues to work fine for PV-based Qubes VMs, such as sys-net.

Marek guessed that it could be a qemu issue (see second linked post). However, in the first linked post, PCI passthrough was done to an HVM domain via 'xl' using "device_model_version = 'qemu-xen-traditional'", so this may rule out qemu as the culprit.

marmarek commented 8 years ago

@rootkovska @mfc I've assigned priority "major", but maybe it deserves something higher?

rootkovska commented 8 years ago

... or lower, rather? (as we currently don't support HVM-based net/usb VMs, so this affects very few users)

esheltone commented 8 years ago

I would say the affected users fall into two main categories: (1) users trying to get GPU passthrough working for their Windows HVMs, to be able to do real 3D graphics applications, etc., and (2) users wanting to pass through a USB controller, so they can use a webcam or other devices with Windows. I would like to be able to do things like pass through a storage adapter and network adapter directly to a FreeBSD HVM, but that is admittedly an even more specialized use case affecting almost no one else.

I think it would be good to clearly document that passthrough is not available for HVM domains and perhaps also remove the capability from Qubes Manager for HVM domains until this issue is resolved.

esheltone commented 8 years ago

Another use case: wanting to be able to have sound output in Windows (currently unavailable by any means): https://groups.google.com/forum/#!topic/qubes-users/BBWF9wguP-0

ideologysec commented 8 years ago

It would be great to have PCI passthrough on a dual-GPU laptop, for gaming in Windows without sacrificing the isolation that Qubes provides. (Audio via device passthrough is nice, but what would really be great is a working Qubes Windows Tools audio driver, for better interaction with the rest of the system rather than only with specialized applications.)

wphowell commented 8 years ago

There is actually a 3rd category: disk controllers. It isn't possible to rip disks without the raw device being available to the VM.

marmarek commented 8 years ago

Some tests reveal that the problem is in the stubdomain - the very same config (xl create directly) with device_model_version = 'qemu-xen-traditional' does work, but with device_model_stubdomain_override=1 it does not. Next thing to check: unpatched libxl, to rule out our patches breaking it.

marmarek commented 8 years ago

Using libxl (xen packages 4.6.1) from Fedora 24 it does work, even with stubdomain...

tirrorex commented 8 years ago

Though Fedora 24 wasn't due until April? When you say it is working, do you mean flawlessly with every device, just like KVM?

marmarek commented 8 years ago

Though Fedora 24 wasn't due until April?

Yes, I've used packages from rawhide - to have the same version (F23 has Xen 4.5).

When you say it is working, do you mean flawlessly with every device, just like KVM?

I've just tried one sample device and it is properly discovered in the VM. With our libxl/stubdom packages it doesn't show at all.

marmarek commented 8 years ago

Made an automated test for this issue (annotated with "expected failure" for now). It should ease debugging (for example bisection). Also, I'm unable to reproduce the success with an unmodified xen-4.6.0 toolstack (+qemu) compiled manually. Maybe the previous success was due to some Fedora patch. Or it is some race condition (Fedora was running from a USB stick, while Qubes runs from a fast SSD). Or something totally different...

esheltone commented 8 years ago

The race condition idea seems unlikely, since the problem we are chasing is seen across all systems on Qubes.

There is a surprising number of patches in Fedora for Xen. However, at a glance, the only patches that seem to touch on code that would relate to this kind of problem are the patches for XSAs 154, 164, and 170 - assuming one of these patches is responsible.

jaspertron commented 8 years ago

@marmarek, is this an accurate summary of your testing?

| | stubdomain | 'qemu-xen-traditional' |
| --- | --- | --- |
| Xen 4.6.0 with Qubes patches | broken | working |
| Xen 4.6.0 (unmodified) | broken | working |
| Xen 4.6.1 with Fedora patches | working | working |

Also, how did you create the xl config file for testing? I tried doing virsh -c xen:/// domxml-to-native xen-xl /etc/libvirt/libxl/my-test-hvm.xml but that gives me

error: Disconnected from xen:/// due to I/O error
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor

marmarek commented 8 years ago

I would rather say "Fedora 24" instead of "Xen 4.6.1 with Fedora patches" - this may be broken by some other package than Xen (some library used or so). Additionally, I wasn't able to reproduce the success when running Qubes but launching domain using Fedora 24 binaries (from chroot). Which is another hint it isn't about just toolstack/qemu.

As for config file - something like this. And indeed it seems to be crashing libvirtd... It looks like the bug is triggered by lack of <graphics type='vnc'/> entry (which is intentional on Qubes). Anyway adding it produced some config file. Then, to enable stubdomain you need to add device_model_stubdomain_override=1. And probably set vnc = 0 ;)

I guess the whole problem may have something to do with disabled qemu in dom0 (in addition to stubdomain). This isn't fully consistent with test results, but there may be some other factors.
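
For anyone reproducing this, here is a minimal sketch of an xl config that combines the options mentioned in this thread. The function name, domain name, memory size and PCI address are placeholder assumptions, and disk and network lines are left out, so this is not the config used in the tests above.

```shell
# Print a minimal HVM config for `xl create`, using the options discussed above.
# Name, memory and the PCI address (BDF) are placeholders, not values from the tests.
print_pcihvm_cfg() {
    bdf=$1    # PCI device to pass through, e.g. 01:00.1
    cat <<EOF
name = "pcihvm"
builder = "hvm"
memory = 1024
device_model_version = "qemu-xen-traditional"
device_model_stubdomain_override = 1
vnc = 0
pci = [ '$bdf' ]
EOF
}
```

Something like `print_pcihvm_cfg 01:00.1 > pcihvm.xl` followed by `sudo xl create pcihvm.xl` would then start the test domain with qemu in a stubdomain; removing the `device_model_stubdomain_override` line gives the configuration reported as working.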

marmarek commented 8 years ago

Found configs from those tests: https://gist.github.com/marmarek/794305496557cc679fced21e252e05b4 May contain some later changes though...

jaspertron commented 8 years ago

It looks like the bug is triggered by lack of <graphics type='vnc'/> entry (which is intentional on Qubes). Anyway adding it produced some config file.

Thanks, that did the trick. I'm getting the same results as you; it only works without a stubdomain.

Additionally, I wasn't able to reproduce the success when running Qubes but launching domain using Fedora 24 binaries (from chroot). Which is another hint it isn't about just toolstack/qemu.

Could the xen-pciback kernel module be to blame? Does Qubes make any modifications to it?

marmarek commented 8 years ago

Could the xen-pciback kernel module be to blame? Does Qubes make any modifications to it?

No, we don't have any modifications there.

It was working in Qubes R2, but there are a lot of differences:

Xen version (was 4.1) - all the parts: hypervisor, qemu, toolstack
Kernel version (was 3.12, probably irrelevant)
Libvirt usage vs xl directly (this should be excluded by above tests)

Next thing I'd check is qemu in stubdomain - simply get stubdomain binary from R2 and try it on R3.x. It is in /usr/lib/xen/boot/ioemu-stubdom.gz, which is shipped in xen-hvm rpm.

jaspertron commented 8 years ago

Next thing I'd check is qemu in stubdomain - simply get stubdomain binary from R2 and try it on R3.x. It is in /usr/lib/xen/boot/ioemu-stubdom.gz, which is shipped in xen-hvm rpm.

Ok, I replaced /usr/lib/xen/boot/ioemu-stubdom.gz with the ioemu-stubdom.gz from Qubes-R2-x86_64-DVD.iso. Unfortunately it doesn't want to start:

[user@dom0 ~]$ sudo xl create pcihvm.xl 
Parsing config from pcihvm.xl
libxl: error: libxl_dm.c:1671:stubdom_xswait_cb: Stubdom 13 for 12 startup: startup timed out
libxl: error: libxl_create.c:1339:domcreate_devmodel_started: device model did not start: -9
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block remove [10199] exited with error status 1
libxl: error: libxl_device.c:1084:device_hotplug_child_death_cb: script: /etc/xen/scripts/block failed; error detected.
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block remove [10197] exited with error status 1
libxl: error: libxl_device.c:1084:device_hotplug_child_death_cb: script: /etc/xen/scripts/block failed; error detected.
libxl: error: libxl.c:1606:libxl__destroy_domid: non-existant domain 12
libxl: error: libxl.c:1564:domain_destroy_callback: unable to destroy guest with domid 12
libxl: error: libxl.c:1491:domain_destroy_cb: destruction of domain 12 failed

Here's some of the output from /var/log/xen/qemu-dm-pcihvm.log:

---snip---
Register xen platform.
Done register platform.
xs_watch(/local/domain/12/log-throttling, /local/domain/12/log-throttling)
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is rw state.
qubes_gui/init: 660
qubes_gui/init: 669
qubes_gui/init: 672
qubes_gui/init: 681
xs_daemon_open -> 9, 0x1609f8
evtchn_open() -> 10
xc_evtchn_bind_unbound_port(0) = 0
xs_write(device/vchan/6000/ring-ref): EACCES
close(10)
libvchan_server_init: 
close(0)
GPF rip: 0xfc514, error_code=0
Thread: main
RIP: e030:[<00000000000fc514>] 
RSP: e02b:00000000005ef8a8  EFLAGS: 00010202
RAX: 2f302f6e69616d6f RBX: 0000002002c087c0 RCX: 0000000000001055
RDX: 000000000000000a RSI: 00000000005ef798 RDI: 2f302f6e69616d6f
RBP: 00000000005ef8a8 R08: 000000000000000a R09: 0000000000576000
R10: 000000000000104b R11: 0000000000000ffa R12: 0000000000000000
R13: 00000000001635f0 R14: 0000000000000000 R15: 0000000000163558
base is 0x5ef8a8 caller is 0xe2a78
base is 0x5ef908 caller is 0xe2635
base is 0x5ef918 caller is 0xddc61
base is 0x5ef938 caller is 0x1047ed
base is 0x5ef958 caller is 0xfad7d
base is 0x5ef968 caller is 0xf4510
base is 0x5ef998 caller is 0xf45ac
base is 0x5ef9b8 caller is 0xf6379
base is 0x5efa08 caller is 0xf4b2e
base is 0x5efa18 caller is 0xf4474
base is 0x5efa38 caller is 0x24790
base is 0x5efa58 caller is 0x24002
base is 0x5efa78 caller is 0x8ea6
base is 0x5efe08 caller is 0xd7129
base is 0x5effe8 caller is 0x33da

5ef890: a8 f8 5e 00 00 00 00 00 2b e0 00 00 00 00 00 00
5ef8a0: 01 00 00 00 00 00 00 00 08 f9 5e 00 00 00 00 00
5ef8b0: 78 2a 0e 00 00 00 00 00 01 00 00 00 00 00 00 00
5ef8c0: 1a 00 00 00 00 00 00 00 f8 f8 5e 00 00 00 00 00

5ef890: a8 f8 5e 00 00 00 00 00 2b e0 00 00 00 00 00 00
5ef8a0: 01 00 00 00 00 00 00 00 08 f9 5e 00 00 00 00 00
5ef8b0: 78 2a 0e 00 00 00 00 00 01 00 00 00 00 00 00 00
5ef8c0: 1a 00 00 00 00 00 00 00 f8 f8 5e 00 00 00 00 00

fc500: ca 48 85 f2 74 ea eb 0c 0f 1f 84 00 00 00 00 00
fc510: 48 83 c0 01 80 38 00 75 f7 48 29 f8 5d c3 66 90
fc520: 55 48 89 f8 48 89 f9 a8 07 48 89 e5 75 56 48 8b
fc530: 0f 49 ba ff fe fe fe fe fe fe fe 49 89 c8 4c 01

Any ideas?

marmarek commented 8 years ago

vchan library is different in R2 than in R3.x... You can try to fake the old one, execute:

xenstore-write /local/domain/`xl domid pcihvm-dm`/device/vchan ''
xenstore-chmod /local/domain/`xl domid pcihvm-dm`/device/vchan n`xl domid pcihvm-dm`

But you need to be very fast with this, as you need to do it before the stubdom reaches this GUI initialization. Maybe xl create -p will help, but AFAIR it only keeps the target domain paused, not the stubdomain.
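
Since the timing matters here, a small helper that polls `xl domid` can let the xenstore writes follow as soon as the stubdomain exists. This is only a sketch: the function name, the 30-second default and the one-second interval are assumptions, and polling narrows the race rather than removing it.

```shell
# Poll `xl domid` until the named domain exists or the timeout (seconds) expires.
# Prints the domain ID on success; returns non-zero on timeout.
wait_for_domid() {
    name=$1
    timeout=${2:-30}
    while [ "$timeout" -gt 0 ]; do
        if id=$(xl domid "$name" 2>/dev/null); then
            echo "$id"
            return 0
        fi
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}

# Possible use, right after starting `xl create` in another terminal:
#   stubdomid=$(wait_for_domid pcihvm-dm) &&
#     xenstore-write "/local/domain/$stubdomid/device/vchan" '' &&
#     xenstore-chmod "/local/domain/$stubdomid/device/vchan" "n$stubdomid"
```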

jaspertron commented 8 years ago

You can try to fake the old one, execute:

xenstore-write /local/domain/`xl domid pcihvm-dm`/device/vchan ''
xenstore-chmod /local/domain/`xl domid pcihvm-dm`/device/vchan n`xl domid pcihvm-dm`

That seems to help it go a little further before running into a different error:

---snip---
Register xen platform.
Done register platform.
xs_watch(/local/domain/42/log-throttling, /local/domain/42/log-throttling)
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is rw state.
qubes_gui/init: 660
qubes_gui/init: 669
qubes_gui/init: 672
qubes_gui/init: 681
xs_daemon_open -> 7, 0x1606d8
evtchn_open() -> 8
xc_evtchn_bind_unbound_port(0) = 0
qubes gui initialized
resize to 640x480@32, 2560 required
xs_write(/local/domain/0/device-model/42/state): EACCES
error recording dm 
xs_read_watch() -> /local/domain/42/log-throttling /local/domain/42/log-throttling
xs_read(/local/domain/42/log-throttling): ENOENT
xs_read(/local/domain/42/log-throttling): read error
qemu: ignoring not-understood drive `/local/domain/42/log-throttling'
medium change watch on `/local/domain/42/log-throttling' - unknown device, ignored
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
resize to 720x400@32, 2880 required
xs_read_watch() -> /local/domain/42/cpu vcpu-set
vcpu-set: watch node error.
[xenstore_process_vcpu_set_event]: /local/domain/42/cpu has no CPU!
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read_watch() -> /local/domain/0/device-model/42/command dm-command
xs_read(/local/domain/0/device-model/42/command): EACCES
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read_watch() -> /local/domain/0/device-model/42/logdirty/cmd logdirty
xs_read(/local/domain/0/device-model/42/logdirty/cmd): EACCES
Log-dirty: no command yet.
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read_watch() -> /local/domain/10/backend/vbd/43/51760/params xvdd
Using xvdd for guest's hdd
medium change watch on `xvdd' (index: 0): /home/user/Downloads/Fedora-Live-Workstation-x86_64-23-10.iso
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0

marmarek commented 8 years ago

Are you sure about path? Is xen-blkback module loaded in domain holding this image?

jaspertron commented 8 years ago

Are you sure about path? Is xen-blkback module loaded in domain holding this image?

Yes to both, and it works with the original (R3.1) stubdomain binary:

---snip---
Register xen platform.
Done register platform.
xs_watch(/local/domain/21/log-throttling, /local/domain/21/log-throttling)
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is rw state.
qubes_gui/init: 657
qubes_gui/init: 666
qubes_gui/init: 669
qubes_gui/init: 678
evtchn_open() -> 8
xc_evtchn_bind_unbound_port(0) = 0
xs_daemon_open -> 9, 0x15cec8
qubes_gui/init[708]: version sent, waiting for xorg conf
qubes gui initialized
resize to 640x480@32, 2560 required
xs_read_watch() -> /local/domain/21/log-throttling /local/domain/21/log-throttling
xs_read(/local/domain/21/log-throttling): ENOENT
xs_read(/local/domain/21/log-throttling): read error
qemu: ignoring not-understood drive `/local/domain/21/log-throttling'
medium change watch on `/local/domain/21/log-throttling' - unknown device, ignored
resize to 720x400@32, 2880 required
xs_read_watch() -> /local/domain/21/cpu vcpu-set
vcpu-set: watch node error.
[xenstore_process_vcpu_set_event]: /local/domain/21/cpu has no CPU!
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read_watch() -> device-model/21/command dm-command
xs_read(device-model/21/command): ENOENT
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read_watch() -> device-model/21/logdirty/cmd logdirty
xs_read(device-model/21/logdirty/cmd): ENOENT
Log-dirty: no command yet.
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read_watch() -> /local/domain/6/backend/vbd/22/51760/params xvdd
Using xvdd for guest's hdd
medium change watch on `xvdd' (index: 0): /home/user/Downloads/Fedora-Live-Workstation-x86_64-23-10.iso
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
vga s->lfb_addr = f0000000 s->lfb_end = f1000000 
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is rw state.
platform_fixed_ioport: changed ro/rw state of ROM memory area. now is ro state.
mapping vram to f0000000 - f1000000
resize to 640x480@32, 2560 required
resize to 720x400@32, 2880 required
Unknown PV product 3 loaded in guest
PV driver build 1
vga s->lfb_addr = f0000000 s->lfb_end = f1000000 
vga s->lfb_addr = f0000000 s->lfb_end = f1000000 
vga s->lfb_addr = f0000000 s->lfb_end = f1000000 
vga s->lfb_addr = f0000000 s->lfb_end = f1000000 
vga s->lfb_addr = f0000000 s->lfb_end = f1000000 
vga s->lfb_addr = f0000000 s->lfb_end = f1000000 
vga s->lfb_addr = f0000000 s->lfb_end = f1000000 
resize to 1024x768@32, 4096 required
xs_daemon_open -> 10, 0x15cee8
qubes_gui/init[719]: got xorg conf, creating window
qubes_gui/init: 726
dumping mfns: n=768, w=1024, h=768, bpp=32
configure msg, x/y 1682 18 (was 0 0), w/h 236 1041

marmarek commented 8 years ago

Hmm, where is the actual error? I see I/O request not ready in both the working and the non-working setups... Maybe the real problem is:

xs_write(/local/domain/0/device-model/42/state): EACCES
error recording dm

? This looks strange, as the entry should be inside the stubdomain directory. Maybe this has changed since Xen 4.1... You can emulate the proper behavior with:

xenstore-write /local/domain/`xl domid pcihvm-dm`/device-model/`xl domid pcihvm`/state running

(wait a second or two before doing this, to make sure the stubdomain has really started, or simply watch its log)

jaspertron commented 8 years ago

You can emulate proper behavior with:

xenstore-write /local/domain/`xl domid pcihvm-dm`/device-model/`xl domid pcihvm`/state running

That did the trick! The vm gets created without any errors.

However, it won't let me connect to the GUI agent. When I do qubes-guid -d `xl domid pcihvm-dm` -N pcihvm, it times out.

marmarek commented 8 years ago

As the vchan library is different, it will not work. You may try xl console pcihvm, if that system exposes a login prompt on the Xen console. If not, try booting something that will allow you to access it over the network. You'll probably need to force a static IP address, or launch a DHCP server in the directly connected ProxyVM/NetVM.

marmarek commented 8 years ago

Also, you may take a look at automated test for this: https://github.com/QubesOS/qubes-core-admin/blob/master/tests/hardware.py https://github.com/QubesOS/qubes-core-admin/blob/master/tests/__init__.py#L511-L580

It creates a small initramfs (using dracut) whose only purpose is to dump lspci output to private.img. Then it sets the VM up to really boot from there (installs grub etc).

You may even try to launch this test, using:

python -m qubes.tests.run hardware/TC_00_HVM/test_000_pci_passthrough_presence

Documentation: https://www.qubes-os.org/doc/automated-tests/ And a warning repeated here: Integration tests are written with the assumption that they are run on dedicated hardware. Do not run these tests on a machine where you have important data; you can lose it. In particular, all VMs with names starting with test- are removed.

jaspertron commented 8 years ago

You may even try to launch this test, using:
python -m qubes.tests.run hardware/TC_00_HVM/test_000_pci_passthrough_presence

When I run this test I get an error after about 30 seconds:

ERROR (libvirtError: internal error: libxenlight failed to create new domain 'test-inst-vm1')

======================================================================
ERROR: hardware/TC_00_HVM/test_000_pci_passthrough_presence
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/qubes/tests/hardware.py", line 59, in test_000_pci_passthrough_presence
    self.vm.start()
  File "/usr/lib64/python2.7/site-packages/qubes/modules/01QubesHVm.py", line 326, in start
    return super(QubesHVm, self).start(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/qubes/modules/000QubesVm.py", line 1901, in start
    self.libvirt_domain.createWithFlags(libvirt.VIR_DOMAIN_START_PAUSED)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1059, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirtError: internal error: libxenlight failed to create new domain 'test-inst-vm1'

----------------------------------------------------------------------
Ran 1 test in 31.973s

FAILED (errors=1)

And here's what gets written to /var/log/libvirt/libxl/libxl-driver.log:

2016-06-02 01:17:19 CDT xc: error: panic: xc_dom_core.c:207: failed to open file: No such file or directory: Internal error
2016-06-02 01:17:19 CDT libxl: error: libxl_dom.c:644:libxl__build_pv: xc_dom_kernel_file failed: No such file or directory
2016-06-02 01:17:19 CDT libxl: error: libxl_dm.c:1639:stubdom_pvqemu_cb: error connecting nics devices: No such file or directory
2016-06-02 01:17:19 CDT libxl: error: libxl_create.c:1339:domcreate_devmodel_started: device model did not start: -3
2016-06-02 01:17:19 CDT libxl: error: libxl_dm.c:2031:libxl__destroy_device_model: xs_rm failed for /local/domain/0/device-model/5
2016-06-02 01:17:19 CDT libxl: error: libxl_dm.c:1983:kill_device_model: unable to find device model pid in /local/domain/5/image/device-model-pid
2016-06-02 01:17:19 CDT libxl: error: libxl.c:1643:libxl__destroy_domid: libxl__destroy_device_model failed for 5

marmarek commented 8 years ago

xc_dom_kernel_file failed: No such file or directory? In HVM? Anyway, you can add --do-not-clean option to leave all the files present and try to start it manually. Or even get only root.img from there and use it in your previous test VM. Script there is supposed to write lspci output directly to private.img (without any filesystem).

jaspertron commented 8 years ago

Oops, ignore that last error. I had /usr/lib/xen/boot/ioemu-stubdom.gz symlinked to a non-existent file.

Using root.img and private.img from that Qubes test with my previous test VM:

[user@dom0 ~]$ sudo xl create pcihvm.xl 
Parsing config from pcihvm.xl
libxl: error: libxl_pci.c:1041:libxl__device_pci_reset: The kernel doesn't support reset from sysfs for PCI device 0000:01:00.0
--- waits here for about 60 seconds ---
libxl: error: libxl_exec.c:227:libxl__xenstore_child_wait_deprecated: Device Model not ready
libxl: error: libxl_pci.c:879:qemu_pci_add_xenstore: qemu refused to add device: 0000:01:00.0,msitranslate=0,power_mgmt=0
libxl: error: libxl_create.c:1422:domcreate_attach_pci: libxl_device_pci_add failed: -3
libxl: error: libxl_pci.c:1041:libxl__device_pci_reset: The kernel doesn't support reset from sysfs for PCI device 0000:01:00.0
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block remove [31290] exited with error status 1
libxl: error: libxl_device.c:1084:device_hotplug_child_death_cb: script: /etc/xen/scripts/block failed; error detected.
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: /etc/xen/scripts/block remove [31288] exited with error status 1
libxl: error: libxl_device.c:1084:device_hotplug_child_death_cb: script: /etc/xen/scripts/block failed; error detected.
libxl: error: libxl.c:1606:libxl__destroy_domid: non-existant domain 33
libxl: error: libxl.c:1564:domain_destroy_callback: unable to destroy guest with domid 33
libxl: error: libxl.c:1491:domain_destroy_cb: destruction of domain 33 failed

/var/log/xen/qemu-dm-pcihvm.log:

---snip---
qubes gui initialized
resize to 640x480@32, 2560 required
pcifront_watches: waiting for backend to get into the right state /local/domain/0/backend/pci/34/0
xs_write(/local/domain/0/device-model/33/state): EACCES
error recording dm 
xs_read_watch() -> /local/domain/33/log-throttling /local/domain/33/log-throttling
xs_read(/local/domain/33/log-throttling): ENOENT
xs_read(/local/domain/33/log-throttling): read error
qemu: ignoring not-understood drive `/local/domain/33/log-throttling'
medium change watch on `/local/domain/33/log-throttling' - unknown device, ignored
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
resize to 720x400@32, 2880 required
xs_read_watch() -> /local/domain/33/cpu vcpu-set
vcpu-set: watch node error.
[xenstore_process_vcpu_set_event]: /local/domain/33/cpu has no CPU!
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read_watch() -> /local/domain/0/device-model/33/command dm-command
******************* PCIFRONT for device/pci/0 **********

xs_read(/local/domain/0/device-model/33/command): EACCES
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
xs_read_watch() -> /local/domain/0/device-model/33/logdirty/cmd logdirty
xs_read(/local/domain/0/device-model/33/logdirty/cmd): EACCES
Log-dirty: no command yet.
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
I/O request not ready: 0, ptr: 0, port: 0, data: 0, count: 0, size: 0
backend at /local/domain/0/backend/pci/34/0
**************************
pcifront_watches: waiting for backend events /local/domain/0/backend/pci/34/0/state
--- waits here for about 60 seconds ---
pcifront_watches: backend state changed: /local/domain/0/backend/pci/34/0/state 7
pcifront_watches: writing device/pci/0/state 7
pcifront_watches: backend state changed: /local/domain/0/backend/pci/34/0/state 8
pcifront_watches: writing device/pci/0/state 4
pcifront_watches: changing state to 4
pcifront_watches: backend state changed: /local/domain/0/backend/pci/34/0/state 4

marmarek commented 8 years ago

Have you written those xenstore entries that were needed before? Also check xenstore for entries like command or dm-command - if there are any, it means the stubdomain didn't handle such a command (the entry should be removed as soon as the command is handled) - probably just another place where the xenstore path has changed.
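
One way to look for such leftover entries is to list the device-model directory in both places it has shown up in this thread: under /local/domain/0 and under the stubdomain. A rough sketch, with hypothetical function names:

```shell
# Print the xenstore device-model directories for a guest, one per line:
# first under dom0 (/local/domain/0), then under the stubdomain.
dm_paths() {
    hvmid=$1
    stubdomid=$2
    echo "/local/domain/0/device-model/$hvmid"
    echo "/local/domain/$stubdomid/device-model/$hvmid"
}

# List what is currently stored under each directory. A "command" entry that
# stays around suggests the stubdomain never picked it up.
show_dm_entries() {
    dm_paths "$1" "$2" | while read -r path; do
        echo "== $path"
        xenstore-ls "$path" 2>/dev/null || echo "(not present or not readable)"
    done
}
```

It could be called in dom0 as `show_dm_entries "$(xl domid pcihvm)" "$(xl domid pcihvm-dm)"` while the domain is starting.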

jaspertron commented 8 years ago

Also check xenstore for entries like command or dm-command

Yes, it seems the commands are getting written to /local/domain/$stubdomid/... instead of /local/domain/0/.... After I move them, it gives me this:

---snip---
backend at /local/domain/0/backend/pci/55/0
**************************
pcifront_watches: waiting for backend events /local/domain/0/backend/pci/55/0/state
xs_read_watch() -> /local/domain/0/device-model/54/command dm-command
xs_read(/local/domain/0/device-model/54/command): EACCES
xs_read_watch() -> /local/domain/0/device-model/54/command dm-command
dm-command: hot insert pass-through pci dev 
register_real_device: Assigning real physical device 01:00.0 ...
pt_libpci_fixup: Error: Can't open /sys/bus/pci/devices/0000:01:00.0/resource: I/O error
register_real_device: Disable MSI translation via per device option
register_real_device: Disable power management
warning: pt_iomul not supported in stubdom 01:00.0
pt_register_regions: IO region registered (size=0x10000000 base_addr=0xe0000000)
pt_register_regions: IO region registered (size=0x00800000 base_addr=0xf0000000)
pt_register_regions: IO region registered (size=0x00000100 base_addr=0x0000e000)
pt_register_regions: IO region registered (size=0x00040000 base_addr=0xf7d00000)
ERROR: PCI region size must be pow2 type=0x8, size=0xf7d40000
close(0)
GPF rip: 0xfc514, error_code=0
Thread: main
---snip---

jaspertron commented 8 years ago

Hmm... Using a different PCI device, I get different results:

---snip---
backend at /local/domain/0/backend/pci/57/0
**************************
pcifront_watches: waiting for backend events /local/domain/0/backend/pci/57/0/state
xs_read_watch() -> /local/domain/0/device-model/56/command dm-command
xs_read(/local/domain/0/device-model/56/command): EACCES
xs_read_watch() -> /local/domain/0/device-model/56/command dm-command
dm-command: hot insert pass-through pci dev 
register_real_device: Assigning real physical device 01:00.1 ...
pt_libpci_fixup: Error: Can't open /sys/bus/pci/devices/0000:01:00.1/resource: I/O error
register_real_device: Disable MSI translation via per device option
register_real_device: Disable power management
warning: pt_iomul not supported in stubdom 01:00.1
pt_register_regions: IO region registered (size=0x00004000 base_addr=0xf7d60000)
pci_intx: intx=2
register_real_device: Error: Binding of interrupt failed! rc=-1
register_real_device: Real physical device 01:00.1 registered successfuly!
IRQ type = INTx
xs_read_watch() -> /local/domain/0/device-model/56/command dm-command
xs_read(/local/domain/0/device-model/56/command): EACCES

marmarek commented 8 years ago

But in the end you've got Real physical device 01:00.1 registered successfuly!. Does it mean the device really appeared in the VM?

jaspertron commented 8 years ago

Well, xl create still showed the same errors, and the VM didn't print anything to private.img.

jaspertron commented 8 years ago

After fiddling with xenstore entries a bit more (xenstore-watch was very helpful here), it finally works. It successfully writes the attached pci device information to private.img.

I suppose one of the stubdomain patches is causing this bug, then?

marmarek commented 8 years ago

Just to make sure: does this pci devices list contain the device you've attached to the VM?

It can be caused by a change in the stubdomain code between xen-4.1 and 4.6. Since it is a fork of the actual qemu, without heavy development, there shouldn't be many changes. Indeed, between tags xen-4.1.6.1 and xen-4.6.0 there are "only" 93 commits in the qemu repository, and 119 in mini-os. This can be bisected. I've created the automated test exactly for this purpose, but first I needed to know what to bisect :)
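
For the bisection itself, `git bisect run` expects the step script to exit with 0 for a good commit, 125 for a commit that cannot be tested, and another code from 1 to 127 for a bad one. A sketch of that mapping follows; the function name is hypothetical, and the actual build and test commands are left out because they depend on the setup.

```shell
# Map the outcome of one bisection step to the exit code `git bisect run`
# expects: 0 = good, 1 = bad, 125 = skip this commit.
#   $1 - exit status of the build (non-zero: cannot be tested, so skip)
#   $2 - exit status of the passthrough test (0: device visible in the guest)
bisect_status() {
    if [ "$1" -ne 0 ]; then
        echo 125
    elif [ "$2" -eq 0 ]; then
        echo 0
    else
        echo 1
    fi
}
```

A step script would build the stubdomain, run the passthrough test, and finish with `exit "$(bisect_status "$build_rc" "$test_rc")"`. The bisection would then be started with `git bisect start xen-4.6.0 xen-4.1.6.1` (bad revision first, good revision second) and driven with `git bisect run` on that script.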

So let's summarize:

Xen 4.6.0 with its original stubdomain and no qemu in dom0, on Qubes 3.1 - does not work
Xen 4.6.0 with stubdomain from Xen 4.1 (Qubes 2), no qemu in dom0, on Qubes 3.1 - does work

Is that correct?

jaspertron commented 8 years ago

Just to make sure: does this pci devices list contain the device you've attached to the VM?

Yes :)

[user@dom0 ~]$ sudo lspci -n -s 01:00.1
01:00.1 0403: 1002:aac8

[user@dom0 ~]$ cat /var/lib/qubes/appvms/pcihvm/private.img
00:00.0 0600: 8086:1237 (rev 02)
00:01.0 0601: 8086:7000
00:01.1 0101: 8086:7010
00:01.2 0c03: 8086:7020 (rev 01)
00:01.3 0680: 8086:7113 (rev 01)
00:02.0 0300: 1234:1111
00:03.0 ff80: 5853:0001 (rev 01)
00:04.0 0403: 1002:aac8
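
The guest sees the device at a different slot (00:04.0 instead of 01:00.1), so the comparison has to go by vendor:device ID. A small sketch of that check, with a hypothetical function name:

```shell
# Succeed if the vendor:device ID of a dom0 PCI device also appears in the
# `lspci -n` listing dumped by the guest.
#   $1 - output of `lspci -n -s <BDF>` in dom0, e.g. "01:00.1 0403: 1002:aac8"
#   $2 - file holding the guest's listing (here: private.img)
guest_sees_device() {
    id=$(printf '%s\n' "$1" | awk 'NR == 1 { print $3 }')
    [ -n "$id" ] || return 2
    # -a: private.img is a raw image, so treat any binary padding as text.
    grep -a -q -F -- " $id" "$2"
}
```

For example: `guest_sees_device "$(sudo lspci -n -s 01:00.1)" /var/lib/qubes/appvms/pcihvm/private.img && echo passed-through`.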

So let's summarize:

Xen 4.6.0 with its original stubdomain and no qemu in dom0, on Qubes 3.1 - does not work

Xen 4.6.0 with stubdomain from Xen 4.1 (Qubes 2), no qemu in dom0, on Qubes 3.1 - does work

Is that correct?

Yes, except I'm not sure about "no qemu in dom0". I thought that qemu runs either in dom0 or in a stubdomain. Does it make sense for it to be running in both?

I have device_model_stubdomain_override = 1 in the xl config file. And test results change depending on which ioemu-stubdom.gz I use.

But after checking just now, I see qemu-dm running in dom0 (whenever pcihvm is running):

[user@dom0 ~]$ ps -ef | grep -i qemu
root     12483     1  0 18:12 ?        00:00:00 /usr/lib/xen/bin/qemu-dm -d 9 -domain-name pcihvm-dm -vnc none -nographic -M xenpv

marmarek commented 8 years ago

Yes, except I'm not sure about "no qemu in dom0". I thought that qemu runs either in dom0 or in a stubdomain. Does it make sense for it to be running in both?

Yes, by default that is the case - for exposing VNC, for example. Which in practice undermines most of the security benefits of having qemu isolated in a stubdomain...

We have patched libxl to disable this specifically, but it is only disabled when you really don't use any of the features requiring qemu in dom0. Make sure you do not have any of these enabled:

VNC
serial console
qdisk backend (vhd, qcow2 etc)

I have device_model_stubdomain_override = 1 in the xl config file. And test results change depending on which ioemu-stubdom.gz I use.

So, let's hope qemu in dom0 doesn't matter here... But of course it would be better if you could check without it. If changing the settings above in the domain config doesn't work, you could try a nasty hack: replace its binary with a simple script that just writes the needed entries into xenstore (if any).

jaspertron commented 8 years ago

I had serial="none" in the xl config file. When I comment that line out, qemu-dm doesn't get run in dom0. :)

marmarek commented 8 years ago

Does it change test results?

Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing?

jaspertron commented 8 years ago

Does it change test results?

No, it still works. I used a fresh private.img to be sure.

jaspertron commented 8 years ago

In case this test (or similar) needs to be run in the future, here's the script I used to fix the xenstore entries to work with the old stubdomain binary.

#!/bin/bash
set -e

domname='pcihvm'

# Wait until both the HVM domain and its stubdomain ("<name>-dm") exist.
until
  hvmid=$(xl domid "$domname" 2>/dev/null)
  stubdomid=$(xl domid "$domname-dm" 2>/dev/null)
do
  sleep 1
  echo -n .
done

echo "creating xenstore entries for stubdomid $stubdomid and hvmid $hvmid"
set -x
xenstore-write /local/domain/$stubdomid/device/vchan ''
xenstore-chmod /local/domain/$stubdomid/device/vchan n$stubdomid
xenstore-write /local/domain/$stubdomid/device-model/$hvmid/state running
xenstore-write /local/domain/0/device-model/$hvmid/state dummy
xenstore-chmod /local/domain/0/device-model/$hvmid/state w$stubdomid
set +x

# Move a xenstore value from SOURCE to DEST and make DEST readable by the
# stubdomain. Nothing happens if SOURCE cannot be read or is empty.
function xenstore-mv() {
  local SOURCE=$1
  local DEST=$2

  xenstore-read "$SOURCE" \
    | xargs -I{} sh -evc "\
xenstore-rm $SOURCE
xenstore-write $DEST {}
xenstore-chmod $DEST r$stubdomid"
}

pstub=/local/domain/$stubdomid/device-model/$hvmid
pdom0=/local/domain/0/device-model/$hvmid

# First pass: move parameter and command from the stubdomain path to the dom0 path.
sleep 3
xenstore-mv {$pstub,$pdom0}/parameter
xenstore-mv {$pstub,$pdom0}/command

# Second pass: move parameter and state from the dom0 path to the stubdomain path.
sleep 9
xenstore-mv {$pdom0,$pstub}/parameter
xenstore-mv {$pdom0,$pstub}/state

marmarek commented 8 years ago

Major progress: it turns out to be broken by this commit: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=c428c9f162895cb3473fab26d23ffbf41a6f293d;hp=dcccaf806a92eabb95929a67c344ac1e9ead6257

So it is not qemu or mini-os (at least not directly). The problem is that a stubdomain isn't allowed to call xc_domain_getinfo, so the whole xc_domain_memory_mapping call fails, which prevents the PCI device from working. Reverting this patch fixes the problem, but it may have unwanted consequences for non-stubdomain use cases (at least if someone wants to use the iomem domain config option, which isn't easy in Qubes anyway because of libvirt usage).

So in the Qubes case it would probably be safe to simply revert this commit. For the more generic use case, a better solution needs to be developed.

WetwareLabs commented 8 years ago

Hi,

I can confirm that by reverting the commit above (actually just removing the call to xc_domain_getinfo in xc_domain.c), I can create a Windows 8.1 VM with device_model_version = 'qemu-xen' and pass through a USB controller and a Radeon 6950. So I guess this fixes the previous passthrough problem with upstream qemu. I'm now using Qubes 3.1 with Xen 4.6.1.

However, I'm not sure whether qemu is still running in dom0 or in a stubdomain. I've set "vnc=0", commented out "serial=pty" and changed the disk backend to phy (that's the backend that VMs created with VM Manager use, correct?). ps still shows this:

10346 ?        SLsl   1:00 /usr/lib/xen/bin/qemu-system-i386 -xen-domid 26 -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-26,server,nowait -no-shutdown -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-26,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -name win8 -vnc none -display none -k en-us -device cirrus-vga,vgamem_mb=8 -boot order=d -usb -usbdevice tablet -smp 2,maxcpus=2 -device rtl8139,id=nic0,netdev=net0,mac=00:16:3e:5e:6d:11 -netdev type=tap,id=net0,ifname=vif26.0-emu,script=no,downscript=no -machine xenfv -m 4088 -drive file=/mnt/vms2/win8.1.img,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/mnt/vms2/win-user-inst.img,if=ide,index=1,media=disk,format=raw,cache=writeback -drive file=/mnt/vms2/win8.1.iso,if=ide,index=2,readonly=on,media=cdrom,format=raw,cache=writeback,id=ide-5632

Maybe the disk backend is still not configured correctly? This is my current VM configuration:

builder = "hvm"

# Guest name
name = "win8"

# Enable Microsoft Hyper-V compatible paravirtualisation /
# enlightenment interfaces. Turning this on can improve Windows guest
# performance and is therefore recommended
viridian = 1

# Initial memory allocation (MB)
#memory = 1024
memory = 4096

# Number of VCPUS
vcpus = 2

# PAE is only required on 32-bit
pae = 1
hpet = 1
acpi = 1
apic = 1
on_xend_stop = 'shutdown'
on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'

# Network devices
# A list of 'vifspec' entries as described in
# docs/misc/xl-network-configuration.markdown
vif = [ 'mac=00:16:3e:5e:6d:11,ip=10.137.2.50,script=vif-route-qubes,backend=sys-firewall' ]

##
# Disk Devices
# A list of `diskspec' entries as described in
# docs/misc/xl-disk-configuration.txt
# disk = [ '/dev/vg/guest-volume,raw,xvda,rw' ]
##
disk = [ '/mnt/vms2/win8.1.img,raw,hda,rw,backendtype=phy' ,'/mnt/vms2/win-user-inst.img,raw,hdb,rw,backendtype=phy' , '/mnt/vms2/win8.1.iso,raw,hdc,devtype=cdrom,r,backendtype=phy' ]
#device_model_version = 'qemu-xen-traditional'
device_model_version = 'qemu-xen'
#boot = 'dc'
boot = 'd'

##
# Guest VGA console configuration, either SDL or VNC
#
#sdl = 1
#opengl = 1
##
sdl = 0
vnc = 0
vncunused = 1           # <= VNC display setup searches for free TCP port
vnclisten = '127.0.0.1'
keymap = 'en-us'
stdvga = 0

# 'videoram=16' allows resolutions up-to 1280x1024 -- and if 'stdvga=1' + 'videoram=16' then u can do up-to 2048x1536 @ 32bpp
#videoram = 16

#serial = 'pty'
tsc_mode = 'default'
usb = 1
usbdevice = 'tablet'

# it's not necessary to have the passed-thru GPU as primary within DomU, so:
gfx_passthru = 0
pci = [ '03:00.0', '03:00.1', '00:1a.0,rdm_policy=relaxed' ]

localtime = 1
xen_platform_pci = 1
pci_power_mgmt = 1
pci_msitranslate = 0
monitor = 1
hpet = 1

WetwareLabs commented 8 years ago

Okay, so it seems that stubdom is available only with qemu-xen-traditional, right? (Forcing "device_model_stubdomain_override=1" shows an error message about this requirement.)

I tested qemu-xen-traditional with the stubdomain override, but I cannot start the VM with PCI devices passed through: libxl_create.c:1422:domcreate_attach_pci: libxl_device_pci_add failed: -16, even with the above-mentioned commit reverted.

After peppering libxl with debug messages, I can see it fails somewhere here: libxl__device_pci_add_xenstore(...) --> libxl__get_domain_configuration(...). I'll dig deeper into this later, unless you already have some idea what's wrong.

marmarek commented 8 years ago

Check kernel messages - it may be some error returned by xen-pciback module.

marmarek commented 8 years ago

Okay, so it seems that stubdom is available only on qemu-xen-traditional, right?

Yes.

WetwareLabs commented 8 years ago

A little progress: I can pass through a single USB controller, but the GPU fails when libxl tries to add its second function, the audio device (03:00.1):

libxl: debug: libxl_pci.c:1135:libxl__device_pci_add: wetware --- add device 0000:03:00.0
libxl: debug: libxl_pci.c:1192:libxl__device_pci_add: wetware -- add to stubdom #28
libxl: debug: libxl_pci.c:921:do_pci_add: wetware do_pci_add PCI device 0000:03:00.0
libxl: debug: libxl_pci.c:958:do_pci_add: wetware #2
libxl: debug: libxl_pci.c:994:do_pci_add: wetware #3
libxl: debug: libxl_pci.c:1018:do_pci_add: wetware #4
libxl: debug: libxl_pci.c:1044:do_pci_add: wetware #5
libxl: debug: libxl_pci.c:135:libxl__device_pci_add_xenstore: wetware #1
libxl: debug: libxl_pci.c:144:libxl__device_pci_add_xenstore: wetware #2: num_devs: (null)
libxl: debug: libxl_pci.c:147:libxl__device_pci_add_xenstore: wetware create_pci_backed
libxl: debug: libxl_pci.c:95:libxl__create_pci_backend: Creating pci backend
libxl: debug: libxl_pci.c:1049:do_pci_add: wetware #6: 0
libxl: debug: libxl_pci.c:1202:libxl__device_pci_add: wetware #2
libxl: debug: libxl_pci.c:1222:libxl__device_pci_add: wetware -- pfunc_mask 1, vfunc_mask 1, orig_vdev 0
libxl: debug: libxl_pci.c:921:do_pci_add: wetware do_pci_add PCI device 0000:03:00.0
libxl: debug: libxl_pci.c:958:do_pci_add: wetware #2
libxl: debug: libxl_pci.c:994:do_pci_add: wetware #3
libxl: debug: libxl_pci.c:1018:do_pci_add: wetware #4
libxl: debug: libxl_pci.c:1044:do_pci_add: wetware #5
libxl: debug: libxl_pci.c:1049:do_pci_add: wetware #6: 0
libxl: debug: libxl_pci.c:1135:libxl__device_pci_add: wetware --- add device 0000:03:00.1
libxl: debug: libxl_pci.c:1192:libxl__device_pci_add: wetware -- add to stubdom #28
libxl: debug: libxl_pci.c:921:do_pci_add: wetware do_pci_add PCI device 0000:03:00.1
libxl: debug: libxl_pci.c:958:do_pci_add: wetware #2
libxl: debug: libxl_pci.c:994:do_pci_add: wetware #3
libxl: debug: libxl_pci.c:1018:do_pci_add: wetware #4
libxl: debug: libxl_pci.c:1044:do_pci_add: wetware #5
libxl: debug: libxl_pci.c:135:libxl__device_pci_add_xenstore: wetware #1
libxl: debug: libxl_pci.c:144:libxl__device_pci_add_xenstore: wetware #2: num_devs: 1
libxl: debug: libxl_pci.c:166:libxl__device_pci_add_xenstore: Adding new pci device to xenstore
libxl: debug: libxl_dom.c:1886:libxl__userdata_path: wetware
libxl: debug: libxl.c:673:libxl_domain_info: wetware, domid 28
libxl: debug: libxl.c:673:libxl_domain_info: wetware, domid 28
libxl: debug: libxl_pci.c:182:libxl__device_pci_add_xenstore: wetware pci_add_xenstore #5: domid 28
libxl: debug: libxl_internal.c:486:libxl__get_domain_configuration: wetware get_domain_conf #1
libxl: debug: libxl_dom.c:2041:libxl__userdata_retrieve: wetware
libxl: debug: libxl_dom.c:1886:libxl__userdata_path: wetware
libxl: debug: libxl.c:673:libxl_domain_info: wetware, domid 28
libxl: debug: libxl_dom.c:2043:libxl__userdata_retrieve: userdata path /var/lib/xen/userdata-d.28.9d60ea8f-65dd-43db-ab71-c2151cefdea2.libxl-json
libxl: error: libxl_internal.c:498:libxl__get_domain_configuration: wetware error: json config empty
libxl: error: libxl_pci.c:185:libxl__device_pci_add_xenstore: wetware pci_add_xenstore get_domain_conf failed
libxl: debug: libxl_pci.c:1049:do_pci_add: wetware #6: -16
libxl: error: libxl_pci.c:1197:libxl__device_pci_add: do_pci_add failed -16
libxl: error: libxl_create.c:1422:domcreate_attach_pci: libxl_device_pci_add failed: -16

So here a new device is added to xenstore only for the second PCI device (the second function). This results in a failure in libxl__get_domain_configuration (the json file seems to be there but is empty?). Any idea where the file is created and why it wouldn't be filled with info?
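
To pull just the failures out of a long libxl log like the one above, a small filter helps. This is a sketch only; the function name `libxl_errors` is hypothetical, and it assumes every error line has the `libxl: error: file:line:function: message` shape seen in this log.

```shell
# Hypothetical helper: read a libxl log on stdin and print only the error
# lines, reduced to "function: message". Assumes the line format
# "libxl: error: <file>:<line>:<function>: <message>" seen in this thread.
libxl_errors() {
    sed -n 's/^libxl: error: [^:]*:[0-9]*:\([^:]*\): */\1: /p'
}
```

Applied to the log above, this leaves four lines, starting with the failure in libxl__get_domain_configuration and ending with domcreate_attach_pci.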

The dmesg logs don't make much sense to me:

gpu pt with stubdom:
[ 4945.859947] xen-blkback: ring-ref 1789, event-channel 5, protocol 1 (x86_64-abi)
[ 4945.881926] xen-blkback: ring-ref 1788, event-channel 6, protocol 1 (x86_64-abi)
[ 4946.964655] xen_pciback: vpci: 0000:03:00.0: assign to virtual slot 0
[ 4946.965334] pciback 0000:03:00.0: registering for 28
[ 4946.977849] pciback 0000:03:00.1: is in use!
[ 4948.019597] pciback 0000:03:00.0: is in use!

usb controller pt in stubdom:
[ 5188.335601] xen-blkback: ring-ref 1789, event-channel 5, protocol 1 (x86_64-abi)
[ 5188.348248] xen-blkback: ring-ref 1788, event-channel 6, protocol 1 (x86_64-abi)
[ 5188.543378] xen_pciback: vpci: 0000:00:1a.0: assign to virtual slot 0
[ 5188.544168] pciback 0000:00:1a.0: registering for 30
[ 5188.564539] xen-pciback pci-29-0: 22 Couldn't locate PCI device (0000:00:1a.0)! perhaps already in-use?
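
One way to make these kernel messages easier to scan is to list only the devices that pciback reports as busy. This is a minimal sketch, assuming the message format shown above; `pciback_in_use` is a hypothetical name, and the input would come from `dmesg` in dom0.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the unique
# PCI addresses for which pciback logged "is in use!".
pciback_in_use() {
    sed -n 's/^.*pciback \([0-9a-f:.]*\): is in use!.*$/\1/p' | sort -u
}
```

For the GPU log above, this would print 0000:03:00.0 and 0000:03:00.1, which at least shows that both functions were reported busy while the domain was being built.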

marmarek commented 8 years ago

Interesting that it's calling get_domain_conf during domain startup at all. What happens when you assign only one function?

WetwareLabs commented 8 years ago

Correction: the json file is not there (error == ENOENT) and the error code was just misinterpreted.

When I assign only the GPU function, the VM starts (but there is no video output, so it fails somewhere farther down). At this point I can't use VNC, so I can't test this any further. The VM also starts when assigning only the USB controller or only the audio device, so this is not just about a device having two functions: passing a second device always seems to fail.

The branch point is in libxl__device_pci_add_xenstore. When num_devs == NULL, libxl__create_pci_backend is called and returns successfully. But when one or more devices are already assigned, the function skips this and continues by calling libxl__get_domain_configuration, which ultimately fails.
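
Since the failure comes down to whether the userdata JSON file exists at all, it may help to tell the "missing" and "empty" cases apart explicitly. The following is a minimal sketch; `userdata_state` is a hypothetical helper, and the path to check is whatever libxl__userdata_retrieve prints in the debug log.

```shell
# Hypothetical helper: classify a libxl userdata file as missing, empty or
# present, to separate the ENOENT case from an empty-file case.
userdata_state() {
    if [ ! -e "$1" ]; then
        echo missing      # no such file (the ENOENT case)
    elif [ ! -s "$1" ]; then
        echo empty        # file exists but has zero length
    else
        echo present      # file exists and has content
    fi
}
```

Running it against the userdata path from the log while the domain is being created would show which of the two situations libxl actually hits.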

QubesOS / qubes-issues

PCI passthrough not working for HVM domains #1659