cloudius-systems / osv

OSv, a new operating system for the cloud.
osv.io
Other
4.12k stars 605 forks source link

Broken named semaphores support #1258

Closed wkozaczuk closed 1 year ago

wkozaczuk commented 1 year ago

Recently added support of the named semaphores support is broken or the relevant part of the tst-semaphore.cc has a bug.

First of all, when running the tst-semaphore.cc, one gets the following error:

Thread 5: Incremented 10th
Thread 7: Incremented 10th
Thread 1: Incremented 10th
/tests/tst-semaphore.so: failed looking up symbol sem_open

[backtrace]
0x00000000402c2680 <elf::object::symbol(unsigned int, bool)+1520>
0x00000000402c350f <elf::object::resolve_pltgot(unsigned int)+127>
0x00000000402c3833 <elf_resolve_pltgot+51>
0x00000000402f9622 <???+1076860450>
0x0000200000200e8f <???+2100879>
0x0000600000bf9c0f <???+12557327>

This is actually caused by the fact the signature of the sem_open is different in musl header than it is in the sem.cc (as a matter of fact the last 2 arguments are only required with O_CREAT).

Applying this patch fixes this issue:

diff --git a/libc/sem.cc b/libc/sem.cc
index 590f5ae3..0b83f7c4 100644
--- a/libc/sem.cc
+++ b/libc/sem.cc
@@ -110,7 +110,7 @@ static std::unordered_map<std::string, indirect_semaphore> named_semaphores;
 static mutex named_semaphores_mutex;

 OSV_LIBC_API
-sem_t *sem_open(const char *name, int oflag, mode_t mode, unsigned int value)
+sem_t *sem_open(const char *name, int oflag, ...)
 {
     SCOPE_LOCK(named_semaphores_mutex);
     auto iter = named_semaphores.find(std::string(name));
@@ -127,6 +127,11 @@ sem_t *sem_open(const char *name, int oflag, mode_t mode, unsigned int value)
     }
     else if (oflag & O_CREAT) {
         //creating new semaphore
+        va_list ap;
+        va_start(ap, oflag);
+        va_arg(ap, mode_t);
+        unsigned value = va_arg(ap, unsigned);
+        va_end(ap);
         if (value > SEM_VALUE_MAX) {
             errno = EINVAL;
             return SEM_FAILED;

But then we get another crash:

Thread 1: Incremented 10th
Thread 9: Incremented 10th
Thread 10: Incremented 10th
Thread 3: Incremented 10th
Thread 8: Incremented 9th
Thread 8: Incremented 10th
Thread 4: Incremented 10th
Thread 6: Incremented 10th
Thread 7: Incremented 10th
Thread 5: Incremented 10th
Aborted

[backtrace]
0x0000000040216ca5 <???+1075932325>
0x00000000403d205c <osv::handle_mmap_fault(unsigned long, int, exception_frame*)+28>
0x00000000402aae9b <mmu::vm_fault(unsigned long, exception_frame*)+315>
0x00000000402faa2b <page_fault+139>
0x00000000402f99b6 <???+1076861366>
0x00001000000012c7 <???+4807>
0x909d3eebf6bfcd61 <???+-155202207>
0x0000000000000000 <???+0>
0x00000000403d50df <???+1077760223>
0xef0f661974f83947 <???+1962424647>
qemu failed.

(gdb) bt
#0  0x0000000040301362 in processor::cli_hlt () at arch/x64/processor.hh:247
#1  arch::halt_no_interrupts () at arch/x64/arch.hh:61
#2  osv::halt () at arch/x64/power.cc:29
#3  0x00000000402370f0 in abort (fmt=fmt@entry=0x405c3b13 "Aborted\n") at runtime.cc:145
#4  0x0000000040202988 in abort () at runtime.cc:106
#5  0x0000000040216ca6 in osv::generate_signal (siginfo=..., ef=ef@entry=0x400000f5a078) at libc/signal.cc:130
#6  0x00000000403d205d in osv::handle_mmap_fault (addr=addr@entry=35184380485632, sig=sig@entry=11, ef=ef@entry=0x400000f5a078) at libc/signal.cc:145
#7  0x00000000402aae9c in mmu::vm_sigsegv (ef=0x400000f5a078, addr=35184380485632) at core/mmu.cc:1428
#8  mmu::vm_fault (addr=addr@entry=35184380485632, ef=ef@entry=0x400000f5a078) at core/mmu.cc:1456
#9  0x00000000402faa2c in page_fault (ef=0x400000f5a078) at arch/x64/mmu.cc:42
#10 <signal handler called>
#11 0x0000200000802000 in ?? ()
#12 0x00000000403d49c6 in std::default_delete<posix_semaphore>::operator() (__ptr=<optimized out>, this=<optimized out>) at /usr/include/c++/12/bits/unique_ptr.h:89
#13 std::unique_ptr<posix_semaphore, std::default_delete<posix_semaphore> >::~unique_ptr (this=0x600000f2c668, __in_chrg=<optimized out>) at /usr/include/c++/12/bits/unique_ptr.h:396
#14 sem_destroy (s=0x600000f2c668) at libc/sem.cc:66
#15 sem_close (sem=sem@entry=0x600000f2c668) at libc/sem.cc:188
#16 0x00001000000012c8 in main () at /home/wkozaczuk/projects/osv-master/tests/tst-semaphore.c:75
#17 0x0000000040397e6d in osv::application::run_main (this=0x600000bf9c10) at core/app.cc:426
#18 0x0000000040398059 in operator() (app=<optimized out>, __closure=0x0) at core/app.cc:236
#19 _FUN () at core/app.cc:238
#20 0x00000000403cdde6 in operator() (__closure=0x600000d76500) at libc/pthread.cc:116
#21 std::__invoke_impl<void, pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::<lambda()>&> (__f=...)
    at /usr/include/c++/12/bits/invoke.h:61
#22 std::__invoke_r<void, pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::<lambda()>&> (__fn=...)
    at /usr/include/c++/12/bits/invoke.h:154
#23 std::_Function_handler<void(), pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/12/bits/std_function.h:290
#24 0x00000000403684d8 in sched::thread::main (this=0x400000f55040) at core/sched.cc:1409
#25 sched::thread_main_c (t=0x400000f55040) at arch/x64/arch-switch.hh:350
#26 0x00000000402fa933 in thread_main () at arch/x64/entry.S:116
(gdb) quit

On top of fixing this, we also need to add all new sem_* functions to the exported_symbols/osv_libpthread.so.0.symbols file.

nyh commented 1 year ago

Thanks @wkozaczuk sorry for not being more diligent during my review of #1232. I'll commit your patch above to fix the function signature, and see if I can figure out the remaining bug.

@StonewallJohnson, for future reference to run the OSv test suite, you can run "make check". You can also run a single test like the one that Waldek noticed here is failing:

scripts/build image=tests
scripts/run.py -e tests/tst-semaphore.so
nyh commented 1 year ago

On top of fixing this, we also need to add all new sem_* functions to the exported_symbols/osv_libpthread.so.0.symbols file.

I don't know yet how to do that :-( Can you please prepare a patch to do that? Yesterday I also sent a patch to add strfromf128 et al. - is this something we'll also need to some exported symbol file?

nyh commented 1 year ago

I cut the test to only

    sem_t *named_sem3 = sem_open("other", O_EXCL | O_CREAT | O_SYNC, 0777, 1);
    assert(named_sem3 != SEM_FAILED);
    assert(sem_unlink("other") == 0);
    assert(sem_close(named_sem3) == 0);

And just that crashes. Trying to debug it...

nyh commented 1 year ago

I think I know where the bug is:

name_semaphores holds unique_ptr<posix_semaphore> objects. This means that as soon as we unlink a named seamphore, and it's removed from the hash table, the unique_ptr is destroyed... It shouldn't. named_semaphores needs to hold pointers to unique_ptr, not the unique_ptr themselves.

I'll prepare a patch.