kaniini / libucontext

ucontext implementation featuring glibc-compatible ABI

"Aieie, swapcontext() failed ..." on arm_cortex-a7_neon-vfpv4 using libmariadb #55

Closed VolkerChristian closed 1 month ago

VolkerChristian commented 1 year ago

Hello,

I am currently working on integrating libucontext (v1.2, not HEAD) into openwrt (v23.05) and compiling the libmariadb c-connector against it. As openwrt uses musl, libucontext is necessary to make the asynchronous API of the c-connector work. I need this async API in a project (SNodeC) which I want to bring to openwrt.

So far I have been successful in creating a package of libucontext (https://github.com/VolkerChristian/owrt-packages/tree/libucontext/libs/libucontext) for openwrt (v23.05-rc3) and also in patching the libmariadb package in openwrt to link against libucontext_posix and libucontext (https://github.com/VolkerChristian/owrt-packages/tree/libmariadb-ucontext/libs/libmariadb) for all supported targets of openwrt. At least the packages compile and link successfully on all architectures, thus I assume that I have chosen the correct libucontext architecture for the individual openwrt architectures.

As I only have access to routers with the architectures aarch64 and arm_cortex-a7_neon-vfpv4, runtime tests were only done on these architectures.

On aarch64 everything is fine and the async API of libmariadb is working as expected.

But on arm_cortex-a7_neon-vfpv4 I get the error message Aieie, swapcontext() failed: -1231365824 (errno=2)

The underlying code in libmariadb, located in the ma_context.c file, that generates this message is

int
my_context_continue(struct my_context *c)
{
  int err;

  if (!c->active)
    return 0;

  err= swapcontext(&c->base_context, &c->spawned_context);
  if (err)
  {
    fprintf(stderr, "Aieie, swapcontext() failed: %d (errno=%d)\n",
            err, errno);
    return -1;
  }

  return c->active;
}

In addition, I want to note that the calls to sigprocmask in libucontext_posix succeed (checked with strace), so these calls are not the root cause of this error.

Unfortunately, I'm not an assembly programmer and therefore don't really know what's going wrong here.

Can someone please give me a hint on how to fix this error? Or is additional information needed?

Thank you very much in advance and best regards Volker

VolkerChristian commented 1 year ago

Yesterday I got two additional routers for testing: an arm_cortex-a15_neon-vfpv4 (Netgear-R7800) and a mipsel_24kc (TP-Link Archer A7-v5) based router.

On the mipsel_24kc based router the async API of libmariadb is working fine. On the arm_cortex-a15_neon-vfpv4 based router libmariadb fails with the same error as on the arm_cortex-a7_neon-vfpv4 router.

I also tried to compile HEAD of libucontext (which includes the "Add ARM hard-float support" patch, #40), but it did not compile. The error is:

arch/arm/getcontext.S: Assembler messages:
arch/arm/getcontext.S:49: Error: junk at end of line, first unrecognized character is `l'

In case it helps: the openwrt cross-compile command (with the include paths stripped) is

arm-openwrt-linux-muslgnueabi-gcc -fPIC -DPIC -Os -pipe -fno-caller-saves -fno-plt -fhonour-copts -mfloat-abi=hard -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro  -Iinclude -Iarch/arm -Iarch/common -DEXPORT_UNPREFIXED -Wa,--noexecstack -c -o arch/arm/getcontext.o arch/arm/getcontext.S

for getcontext.S and also for setcontext.S and swapcontext.S. The gcc-version is 12.3.0.

Any help would be appreciated!

Best regards Volker

VolkerChristian commented 1 year ago

I have finally identified this issue as a bug in the code of swapcontext.S. According to the POSIX specification, getcontext() and swapcontext() must return zero on success (a successful setcontext() does not return), but r0 (the return value register) is not cleared. The pull request fixes this behaviour.

I will leave this issue open until either the pull request has been merged or someone has a better solution for this bug.