cloudius-systems / osv

OSv, a new operating system for the cloud.
osv.io

implement clone/3, set_robust_list and set_tid_address syscalls #1270

Closed wkozaczuk closed 11 months ago

wkozaczuk commented 11 months ago

This PR implements clone, clone3, set_robust_list, and set_tid_address syscalls needed to support running multi-threaded static executables on OSv.

The bulk of this patch is the implementation of clone and its clone3 variant. More specifically, sys_clone() implements only a tiny subset of what the Linux manual describes, namely the handling of CLONE_THREAD, which is what glibc uses to implement pthread_create().
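To make the CLONE_THREAD case concrete, here is a small sketch, assuming Linux kernel headers are available. It builds the flag set that glibc's NPTL passes to clone when pthread_create() is called (commonly seen as 0x3d0f00 in strace output) and tests it with is_thread_clone(), a hypothetical helper invented for this illustration; it is not taken from OSv's clone.cc.

```c
#include <assert.h>
#include <linux/sched.h>   /* CLONE_* flag definitions (Linux-only header) */
#include <stdbool.h>

/* Flag set glibc's NPTL uses when pthread_create() calls clone. */
static unsigned long pthread_clone_flags(void)
{
    unsigned long flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND |
                          CLONE_THREAD | CLONE_SYSVSEM | CLONE_SETTLS |
                          CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID;
    assert(flags & CLONE_THREAD);
    return flags;
}

/* Hypothetical helper: true when the caller asks for a new thread in the
 * same thread group, the only case the PR description says is handled. */
static bool is_thread_clone(unsigned long flags)
{
    return (flags & CLONE_THREAD) != 0;
}
```

A call such as is_thread_clone(pthread_clone_flags()) returns true, while a fork-style flag set without CLONE_THREAD would be rejected by such a check.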

In essence, sys_clone() creates a new thread and sets the application TCB if one is present. When the new thread starts, it executes assembly code that restores most of the registers and jumps to the instruction the parent thread would execute next after calling clone. In effect, a thread calling the clone syscall "clones" itself: the new child thread resumes at the instruction right after the syscall instruction, whose address is held in the RCX register. All the registers restored in the child thread are copied from the parent thread's syscall stack frame. Detailed comments explaining the implementation of clone() are interleaved with the code of sys_clone() in clone.cc.
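The register hand-off described above can be modeled in plain C. This is only a conceptual sketch, not OSv's implementation: struct syscall_frame, struct child_context, and make_child_context() are hypothetical names, and the register set is deliberately reduced. It assumes the usual Linux clone convention that the child sees a return value of 0 in RAX and runs on the stack supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of what is saved when a thread enters a syscall. */
struct syscall_frame {
    uint64_t rax;                          /* syscall number on entry */
    uint64_t rcx;                          /* address of the instruction after SYSCALL */
    uint64_t rsp;                          /* parent's user stack pointer */
    uint64_t rbx, rbp, r12, r13, r14, r15; /* a few registers to carry over */
};

/* Hypothetical description of where and how the child starts running. */
struct child_context {
    uint64_t rip;                /* first instruction the child executes */
    struct syscall_frame regs;   /* register values restored before the jump */
};

static struct child_context make_child_context(const struct syscall_frame *parent,
                                               uint64_t child_stack)
{
    assert(parent != NULL);
    struct child_context child;
    child.regs = *parent;          /* copy registers from the parent's frame */
    child.regs.rax = 0;            /* clone returns 0 in the child */
    child.regs.rsp = child_stack;  /* child runs on its own stack */
    child.rip = parent->rcx;       /* resume right after the syscall instruction */
    return child;
}
```

In this model the only differences between parent and child after the call are the return value and the stack pointer, which is what lets glibc's thread start-up code continue in the child as if it had simply returned from the syscall.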

This patch also implements two other related syscalls, set_robust_list and set_tid_address, which are mostly described at https://www.kernel.org/doc/Documentation/robust-futexes.txt.
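For readers unfamiliar with these two syscalls, the sketch below shows how they are invoked directly on Linux; a C library normally makes these calls itself during thread start-up. The names demo_set_tid_address(), demo_set_robust_list(), tid_slot, and robust_head are made up for this example. Because both calls replace what the C library registered for the calling thread, this is only suitable for a throwaway test program's main thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>   /* struct robust_list_head */
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* The kernel writes 0 here and wakes futex waiters when the thread exits. */
static int tid_slot;

/* Head of this thread's list of held robust futexes (empty in this demo). */
static struct robust_list_head robust_head;

/* Registers tid_slot as the clear-on-exit address; returns the caller's thread ID. */
static long demo_set_tid_address(void)
{
    long tid = syscall(SYS_set_tid_address, &tid_slot);
    assert(tid > 0);
    return tid;
}

/* Registers an empty robust list for the calling thread; returns 0 on success. */
static long demo_set_robust_list(void)
{
    robust_head.list.next = &robust_head.list;  /* empty list points to itself */
    robust_head.futex_offset = 0;
    robust_head.list_op_pending = NULL;
    return syscall(SYS_set_robust_list, &robust_head, sizeof(robust_head));
}
```

The value returned by demo_set_tid_address() matches the thread ID reported by the gettid syscall, which is how a threading library can learn a new thread's ID and later detect its exit through the registered address.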

With this patch, the following simple example, compiled as a static executable, runs fine on OSv:

#include <pthread.h>
#include <stdio.h>

void* secondary(void *ignore)
{
    printf("secondary thread\n");
    return NULL;
}

int main(void) {
    pthread_t threads[10];
    for (int i = 0; i < 10; i++)
       pthread_create(&threads[i], NULL, secondary, NULL);

    printf("Created 10 threads\n");

    for (int i = 0; i < 10; i++)
       pthread_join(threads[i], NULL);
    printf("Joined 10 threads\n");
    return 0;
}

Depends on PR #1269

Fixes #1139