nanovms / nanos

A kernel designed to run one and only one application in a virtualized environment
https://nanos.org
Apache License 2.0

Missing `FUTEX_WAKE_BITSET` implementation #1988

Closed ls-1801 closed 8 months ago

ls-1801 commented 8 months ago

While stress testing my application as a Nanos unikernel, I noticed crashes caused by a missing FUTEX_WAKE_BITSET implementation.

  1. Can a stack trace be printed that tells me where it was called? It should be coming from the folly library. However, the crash does not occur when a debugger is attached.
  2. Is it possible to implement FUTEX_WAKE_BITSET, or is the mask functionality not supported at all? (I couldn't find it, although FUTEX_WAIT_BITSET does exist.)

francescolavra commented 8 months ago

  1. You would need to make a custom change to the Nanos kernel and rebuild it. Specifically, add a call to dump_context(current->context); in the futex() function at https://github.com/nanovms/nanos/blob/master/src/unix/futex.c#L347. This prints a frame trace and stack trace of the user program at the point where it invokes the FUTEX_WAKE_BITSET operation. If you also add the "ingest_program_symbols" flag to your Ops configuration file (as in { "Debugflags":["ingest_program_symbols"] }, see https://docs.ops.city/ops/configuration), the trace will include the program's function names, unless the executable has been stripped of its ELF symbol names. However, if the futex operation came from a dynamically linked library, you won't see the function names from that library.
  2. The bit mask functionality for futex wait and wake operations is currently not implemented in the kernel, but it is definitely possible to add it in the future.
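
For context, the bit mask rule discussed in point 2 can be sketched in a few lines of C. This follows the behavior described in the Linux futex(2) man page, not anything in the Nanos source: a waiter is eligible for a wake-up only if its wait mask and the waker's mask share at least one set bit. The names `MATCH_ANY` and `bitset_wake_matches` are made up for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constant; same value as Linux's FUTEX_BITSET_MATCH_ANY. */
#define MATCH_ANY 0xffffffffu

/*
 * Hypothetical helper: returns true if a waiter that blocked with
 * `waiter_mask` is eligible to be woken by a wake call using `wake_mask`.
 * Per futex(2), the two masks must have at least one bit in common.
 */
bool bitset_wake_matches(uint32_t waiter_mask, uint32_t wake_mask)
{
    return (waiter_mask & wake_mask) != 0;
}
```

With an all-ones mask such as `MATCH_ANY` on either side, every waiter with a nonzero mask matches, which is why the plain wait and wake operations behave like the bitset variants with all bits set.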

ls-1801 commented 8 months ago

Thank you, that was very helpful. I was able to remove the calls to FUTEX_WAKE_BITSET.
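
For readers who hit the same crash, one possible shape of such a replacement is sketched below. It is not taken from this thread or from folly. It assumes the caller only ever used the match-any mask, in which case, per the Linux futex(2) man page, FUTEX_WAKE_BITSET is equivalent to plain FUTEX_WAKE. The helper name `futex_wake_plain` is hypothetical, and whether a given call site can be swapped this way has to be checked case by case.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: wake up to `count` threads waiting on `addr`
 * using plain FUTEX_WAKE instead of FUTEX_WAKE_BITSET.
 * Assumption: the original call passed FUTEX_BITSET_MATCH_ANY, so no
 * selective wake-up by mask is lost.
 * Returns the number of waiters woken, or -1 with errno set on error.
 */
long futex_wake_plain(uint32_t *addr, int count)
{
    return syscall(SYS_futex, addr, FUTEX_WAKE, count, NULL, NULL, 0);
}
```

If the original code relied on distinct masks to wake only a subset of waiters, this substitution changes behavior: it can wake threads that the masked call would have skipped.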