nutanix / libvfio-user

framework for emulating devices in userspace
BSD 3-Clause "New" or "Revised" License
162 stars 51 forks source link

Fails to build on Fedora i686 and ppc64le #707

Closed etrunko closed 1 year ago

etrunko commented 2 years ago

I'm working on the qemu-7.1.0 package for Fedora, and got these errors building vfio-user on those arches:

i686: https://kojipkgs.fedoraproject.org//work/tasks/3290/91503290/build.log ppc64le: https://kojipkgs.fedoraproject.org//work/tasks/3293/91503293/build.log

For the other supported arches, the build passes: https://koji.fedoraproject.org/koji/taskinfo?taskID=91503268

tmakatos commented 2 years ago

~@etrunko for i686, can you check whether https://github.com/nutanix/libvfio-user/pull/709 fixes it? I haven't had a chance to actually build it for i686 yet so not sure it will work.~ Scratch that, I might not have fixed all the errors.

tmakatos commented 1 year ago

@etrunko could you check https://github.com/nutanix/libvfio-user/pull/709 ?

jlevon commented 1 year ago

@etrunko also is there any way we could run libvfio-user CI (as part of qemu build?) on your infra? github don't provide anything we can use to make sure this doesn't regress

etrunko commented 1 year ago

@etrunko could you check #709 ?

I'll report it back as soon as I can, thanks

etrunko commented 1 year ago

@etrunko also is there any way we could run libvfio-user CI (as part of qemu build?) on your infra? github don't provide anything we can use to make sure this doesn't regress

Ideally you should provide this library as a standalone fedora package, so we can make thing easier for both vfio-user and qemu, as you'll be able to decouple the release from qemu and also test things early. It can be also included as part of the virtmaint-sig copr, where you can run builds easier before actually building the package officially for the distro.

If you start with a spec file we will be happy to help with the reviews to get the package included in the next Fedora.

etrunko commented 1 year ago

Looks like we are good now with code errors, but the i686 build still fails on the tests related to libvfio-user.

https://koji.fedoraproject.org/koji/taskinfo?taskID=92194687

In the log you can see some warnings:

[2526/10466] gcc -m32 -Ilibcommon.fa.p -I../common-user/host/i386 -I../linux-user/include/host/i386 -I../linux-user/include -Isubprojects/libvfio-user/include -I../subprojects/libvfio-user/include -I/usr/include/capstone -I/usr/include/pixman-1 -I/usr/include/libpng16 -I/usr/include/spice-server -I/usr/include/spice-1 -I/usr/include/cacard -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/sysprof-4 -I/usr/include/nss3 -I/usr/include/nspr4 -I/usr/include/PCSC -I/usr/include/p11-kit-1 -I/usr/include/json-c -I/usr/include/SDL2 -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/gio-unix-2.0 -I/usr/include/slirp -fdiagnostics-color=auto -Wall -Winvalid-pch -std=gnu11 -O2 -g -isystem /builddir/build/BUILD/qemu-7.1.0/linux-headers -isystem linux-headers -iquote . -iquote /builddir/build/BUILD/qemu-7.1.0 -iquote /builddir/build/BUILD/qemu-7.1.0/include -iquote /builddir/build/BUILD/qemu-7.1.0/tcg/i386 -pthread -DSTAP_SDT_V2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 -Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -O2 -fexceptions -g -grecord-gcc-switches -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse -mstackrealign -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fPIE -D_REENTRANT -Wno-undef -DSTRUCT_IOVEC_DEFINED -MD -MQ libcommon.fa.p/hw_remote_vfio-user-obj.c.o -MF libcommon.fa.p/hw_remote_vfio-user-obj.c.o.d -o libcommon.fa.p/hw_remote_vfio-user-obj.c.o -c ../hw/remote/vfio-user-obj.c
../hw/remote/vfio-user-obj.c: In function 'dma_register':
../hw/remote/vfio-user-obj.c:308:28: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
  308 |                            (uint64_t)info->vaddr);
      |                            ^
../hw/remote/vfio-user-obj.c:317:47: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
  317 |     memory_region_add_subregion(dma_as->root, (hwaddr)iov->iov_base, subregion);
      |                                               ^
../hw/remote/vfio-user-obj.c:319:28: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
  319 |     trace_vfu_dma_register((uint64_t)iov->iov_base, iov->iov_len);
      |                            ^
../hw/remote/vfio-user-obj.c: In function 'dma_unregister':
../hw/remote/vfio-user-obj.c:340:30: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
  340 |     trace_vfu_dma_unregister((uint64_t)info->iova.iov_base);
      |                              ^

And in the end the test failing:

640/662 libvfio-user:functional / test-client-server                              FAIL            2.28s   exit status 1
――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
stdout:
client: devinfo: flags 0x3, num_regions 10, num_irqs 5
client: get_device_region_info: region_info[0] offset 0 flags 0x3 size 4 cap_sz 0 #FDs 0
client: get_device_region_info: region_info[1] offset 0 flags 0xf size 12288 cap_sz 48 #FDs 2
client: get_region_vfio_caps: Sparse cap nr_mmap_areas 2
client: get_region_vfio_caps: area 0 offset 0 size 4096
client: get_region_vfio_caps: area 1 offset 0x2000 size 4096
client: get_device_region_info: region_info[2] offset 0 flags 0 size 0 cap_sz 0 #FDs 0
client: get_device_region_info: region_info[3] offset 0 flags 0 size 0 cap_sz 0 #FDs 0
client: get_device_region_info: region_info[4] offset 0 flags 0 size 0 cap_sz 0 #FDs 0
client: get_device_region_info: region_info[5] offset 0 flags 0 size 0 cap_sz 0 #FDs 0
client: get_device_region_info: region_info[6] offset 0 flags 0 size 0 cap_sz 0 #FDs 0
client: get_device_region_info: region_info[7] offset 0 flags 0 size 256 cap_sz 0 #FDs 0
client: get_device_region_info: region_info[8] offset 0 flags 0 size 0 cap_sz 0 #FDs 0
client: get_device_region_info: region_info[9] offset 0x3000 flags 0xf size 20480 cap_sz 48 #FDs 1
client: migration region
client: get_region_vfio_caps: Sparse cap nr_mmap_areas 1
client: get_region_vfio_caps: area 0 offset 0x1000 size 16384
stderr:
server[115386]: bad region index 3735928559
server[115386]: msg0xf00f: cmd 9 failed: Invalid argument
client: failed to read from region -559038737 0-0x3: Invalid argument
server[115386]: devinfo flags 0x3, num_regions 10, num_irqs 5
server[115386]: region_info[0] offset 0 flags 0x3 size 4 argsz 32
server[115386]: region_info[1] offset 0 flags 0x7 size 12288 argsz 80
server[115386]: dev_get_caps: area 0 [(nil), 0x1000)
server[115386]: dev_get_caps: area 1 [0x2000, 0x3000)
server[115386]: region_info[1] offset 0 flags 0xf size 12288 argsz 80
server[115386]: region_info[2] offset 0 flags 0 size 0 argsz 32
server[115386]: region_info[3] offset 0 flags 0 size 0 argsz 32
server[115386]: region_info[4] offset 0 flags 0 size 0 argsz 32
server[115386]: region_info[5] offset 0 flags 0 size 0 argsz 32
server[115386]: region_info[6] offset 0 flags 0 size 0 argsz 32
server[115386]: region_info[7] offset 0 flags 0 size 256 argsz 32
server[115386]: region_info[8] offset 0 flags 0 size 0 argsz 32
server[115386]: region_info[9] offset 0x3000 flags 0x7 size 20480 argsz 80
server[115386]: dev_get_caps: area 0 [0x1000, 0x5000)
server[115386]: region_info[9] offset 0x3000 flags 0xf size 20480 argsz 80
server[115386]: region7: read 64 bytes at 0
server[115386]: device reset by client
server[115386]: device reset callback
server[115386]: adding DMA region [0, 0x1000) offset=0 flags=0x3
server[115386]: DMA region size 4096 > max 0
server[115386]: failed to add DMA region [0, 0x1000) offset=0 flags=0x3: No space left on device
server[115386]: msg0x1234: cmd 2 failed: No space left on device
client: failed to map DMA regions: No space left on device
server[115386]: failed to receive request header: No message of desired type
tmakatos commented 1 year ago

@etrunko also is there any way we could run libvfio-user CI (as part of qemu build?) on your infra? github don't provide anything we can use to make sure this doesn't regress

Ideally you should provide this library as a standalone fedora package, so we can make thing easier for both vfio-user and qemu, as you'll be able to decouple the release from qemu and also test things early. It can be also included as part of the virtmaint-sig copr, where you can run builds easier before actually building the package officially for the distro.

If you start with a spec file we will be happy to help with the reviews to get the package included in the next Fedora.

I don't see anything wrong with that, but would like to hear @jlevon 's view on this.

tmakatos commented 1 year ago

@etrunko can you try again with https://github.com/nutanix/libvfio-user/pull/709?

etrunko commented 1 year ago

@etrunko can you try again with #709?

Is it possible to squash some commits? Otherwise I'll need to add each of the 13 patches manually to qemu.spec.

etrunko commented 1 year ago

So, I did squash all patches into only one, and build now passes.

https://koji.fedoraproject.org/koji/taskinfo?taskID=92273849

tmakatos commented 1 year ago

Great, once this PR gets reviewed all commits will be squashed and you'll only have to add one patch to the spec file.

jlevon commented 1 year ago

If you start with a spec file we will be happy to help with the reviews to get the package included in the next Fedora.

We cannot package this library like this as there's not yet a stable ABI. Spent way too much time of my life on packages like this that don't care about compatibility / dependency hell etc, so don't want to repeat that mistake here.

etrunko commented 1 year ago

We cannot package this library like this as there's not yet a stable ABI. Spent way too much time of my life on packages like this that don't care about compatibility / dependency hell etc, so don't want to repeat that mistake here.

You have a good point, but you can still maintain the spec file and make use of the fedora infrastructure, like building for different architectures, via copr, without having it officially included in the distribution.

etrunko commented 1 year ago

Closing as #709 has been merged.

jlevon commented 1 year ago

you can still maintain the spec file and make use of the fedora infrastructure, like building for different architectures, via copr, without having it officially included in the distribution.

So if we figure out a spec file, what would be the next steps? Thanks.

etrunko commented 1 year ago

I gave it another try to enable vfio-user in qemu-7.2.0, but it still fails with i686 and ppc64le. Please check the build logs:

i686: https://kojipkgs.fedoraproject.org//work/tasks/3655/95563655/build.log ppc64le: https://kojipkgs.fedoraproject.org//work/tasks/3658/95563658/build.log

Is this a new issue or is it just that the subproject has not been updated in qemu?

etrunko commented 1 year ago

Is this a new issue or is it just that the subproject has not been updated in qemu?

Yeah, it looks like the qemu subproject has not been updated between 7.1 and 7.2. But anyway, I noticed the subproject URL points to a different git repo:

https://gitlab.com/qemu-project/qemu/-/blob/master/.gitmodules#L66

I created an issue in qemu updating the mirror.