mer-hybris / libgbinder

GLib-style interface to binder
BSD 3-Clause "New" or "Revised" License

Tests fail to run on s390x #108

Closed. aleasto closed this issue 1 year ago

aleasto commented 1 year ago

Realistically there's no current use for binder on s390x, but it's still unclear why these tests fail, especially since they don't actually use the binder kernel module:

make[1]: Entering directory '/builddir/build/BUILD/libgbinder-1.1.29/unit/unit_bridge'
# random seed: R02Sa0674ede83f3e5a74ea2c8a4e5243023
1..2
# Start of bridge tests
ok 1 /bridge/null
Bail out! ERROR:../common/test_main.c:52:test_timeout_expired: assertion failed: (!"TIMEOUT")
make[1]: Leaving directory '/builddir/build/BUILD/libgbinder-1.1.29/unit/unit_bridge'
**
ERROR:../common/test_main.c:52:test_timeout_expired: assertion failed: (!"TIMEOUT")
make[1]: *** [../common/Makefile:165: test] Aborted (core dumped)
make: *** [Makefile:5: test] Error 2

Full log (I don't know how long it will stay up): https://kojipkgs.fedoraproject.org//work/tasks/5078/94705078/build.log

monich commented 1 year ago

What's s390x? If you have one at your disposal, could you please provide the output of build/debug/unit_bridge -v for the failing test?

aleasto commented 1 year ago

It's a processor architecture by IBM. I don't have one, but I have the power of podman+qemu-user :stuck_out_tongue: The error above, though, comes from the Fedora build servers, which have physical s390x mainframes.

This is my verbose log. https://gist.githubusercontent.com/aleasto/49ece20446b0a38cad9078cdd7400b89/raw/1168be1d425e45d06d443c03f0a4eca0eb038fd6/gistfile1.txt

If you want to try it yourself, this is my setup: docker run -it library/fedora:37@sha256:a7fcd22632ccd8e88e19940725fc3a84ac5e948b37498e41ae879c8a60bc6136 and then, in the container, dnf install -y libglibutil libglibutil-devel git gcc make flex bison

aleasto commented 1 year ago

I've tried a virtual machine on qemu-system-s390x and libgbinder doesn't work on the real binder driver either: [ 9445.366942] binder_linux: 42450:42450 transaction failed 29189/-22, size 0-0 line 2987

The first thing that comes to mind is that s390x is big-endian, but that's just a guess. I've tried ppc64le and it works there.

sharkcz commented 1 year ago

If you need access to the Fedora developer s390x machine, please let me know.

monich commented 1 year ago

> The first thing that comes to mind is that s390x is big-endian, but that's just a guess. I've tried ppc64le and it works there.

All the real life usage so far has been little-endian. I've been trying to keep in mind potential byte order issues but obviously, without real testing on a big-endian platform, something might have slipped through. Shouldn't be a huge problem. Hopefully.
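To illustrate the kind of byte order issue that can slip through without big-endian testing, here is a minimal sketch in C. It is not the actual libgbinder code and not the fix from this thread; the function names (host_is_big_endian, low32_by_memcpy, low32_by_cast, read_le32) are hypothetical and exist only for this example. It shows a copy that behaves differently on s390x than on x86_64 or ppc64le, next to two portable alternatives.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host (such as s390x), 0 on a little-endian one. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/*
 * Fragile: copies only the first 4 bytes of a 64-bit value. Those are the
 * low-order bytes on a little-endian host but the high-order bytes on a
 * big-endian one, so the result depends on the architecture.
 */
static uint32_t low32_by_memcpy(uint64_t value)
{
    uint32_t out;

    memcpy(&out, &value, sizeof(out));
    return out;
}

/* Portable: arithmetic on the value does not depend on memory layout. */
static uint32_t low32_by_cast(uint64_t value)
{
    return (uint32_t)(value & 0xffffffffu);
}

/* Portable: decodes a little-endian 32-bit field byte by byte. */
static uint32_t read_le32(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint32_t)buf[0] |
           ((uint32_t)buf[1] << 8) |
           ((uint32_t)buf[2] << 16) |
           ((uint32_t)buf[3] << 24);
}
```

Code like low32_by_memcpy passes every test on little-endian machines, which is why this class of bug tends to surface only once the tests run on a platform like s390x.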

monich commented 1 year ago

Could you please try #110 and see if that fixes the problem with unit tests? With this new binder simulation I'm no longer able to crash unit tests, even by running them thousands of times in a loop (current master does eventually crash).

aleasto commented 1 year ago

Still times out: https://gist.githubusercontent.com/aleasto/aed9313f924fadcf167110f652749eab/raw/a71c6a0c7cfcc580a9ed6f289802fb83063d04d8/gistfile1.txt

monich commented 1 year ago

Damn. Please take another log with GUTIL_LOG_TID=1 build/debug/unit_bridge -v. That would give a better idea what's going on between the threads involved.

aleasto commented 1 year ago

https://gist.githubusercontent.com/aleasto/da422491e55b3629dea9e0fcf6aca2a2/raw/b761dcd10ac37c3cc825f7e2b44f8f37f088a25f/gistfile1.txt

sharkcz commented 1 year ago

@monich, access granted, please check your mailbox (and also your spam folder, because Gmail doesn't like me :-))

monich commented 1 year ago

Please try #110 again, it works for me now on that s390x thing. There were indeed a few issues with the byte order.

aleasto commented 1 year ago

Thank you. The update is on its way to Fedora.