This is a controversial proposal and we may never do it, but I raise it here because it would help us better solve one old problem and another new one.
The old problem is support for the so-called "local exec" thread-local mode, where the executable assumes that it is the first loaded object and therefore that the offset of its TLS block (relative to fs:0) is known at compile time. The kernel makes the same assumption, and as a result the TLS of the executable and that of the kernel overlap. The issue is described here and has been solved by having the kernel leave a "reservation" for the app at the beginning of the TLS block. This is not ideal, as the default reservation may not be enough (see here) and then needs to be adjusted by passing the build parameter app_local_exec_tls_size.
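To make the "offset known at compile time" point concrete, here is a minimal sketch for a Linux x86_64 userspace executable (not OSv code). The variable app_counter and both helper names are hypothetical. It relies on the x86_64 TLS ABI convention that the word at fs:0 holds the thread pointer itself, and it has to be linked into an executable rather than a shared object.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical app variable, forced into the local-exec TLS model. */
static __thread int app_counter __attribute__((tls_model("local-exec"))) = 5;

/* On x86_64 the first word of the thread control block (fs:0)
 * contains the thread pointer, i.e. the FS base itself. */
static inline uintptr_t tls_thread_pointer(void)
{
    uintptr_t tp;
    __asm__ volatile("movq %%fs:0, %0" : "=r"(tp));
    return tp;
}

/* Distance from the thread pointer to the variable. With local-exec
 * the linker bakes this constant into every access, which is why the
 * executable must own the TLS area right next to the thread pointer. */
static inline intptr_t local_exec_offset(void)
{
    return (intptr_t)((uintptr_t)&app_counter - tls_thread_pointer());
}
```

On x86_64 the executable's TLS block sits just below the thread pointer, so local_exec_offset() returns a small negative constant. That is the same region the kernel's own local-exec variables would otherwise claim.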
The new problem has to do with supporting statically linked executables and dynamically linked ones launched by ld-linux. Such apps use TLS, including the local-exec kind, and OSv has no real control over the bootstrapping mechanism (where the memory is allocated, how much, etc.). In the x86_64 case the app calls the arch_prctl syscall to request that the FS register be set to a specific value. OSv then has to implement extra logic to juggle the kernel and app values of the FS register on every syscall, interrupt, and page fault switch. This is all doable, as I did it in my experimental branch, but it is pretty painful and adds extra cost.
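For reference, this is roughly what the arch_prctl interaction looks like from the app side on Linux x86_64. The sketch only queries the FS base with ARCH_GET_FS, because changing it under a running C library would break its TLS. A static executable's startup code issues the matching ARCH_SET_FS call. The helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <asm/prctl.h>    /* ARCH_GET_FS */
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel which FS base it currently holds for this thread. */
static uintptr_t query_fs_base(void)
{
    unsigned long base = 0;
    long rc = syscall(SYS_arch_prctl, ARCH_GET_FS, &base);
    assert(rc == 0);
    (void)rc;
    return (uintptr_t)base;
}

/* Read the self-pointer stored at fs:0 by the C library's TLS setup. */
static uintptr_t read_fs_self_pointer(void)
{
    uintptr_t tp;
    __asm__ volatile("movq %%fs:0, %0" : "=r"(tp));
    return tp;
}
```

Both helpers return the same address, which illustrates that the app and not the kernel decides where FS points. A kernel that keeps its own data behind FS therefore has to save and restore the register on every entry and exit.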
So what if we make the kernel not use the FS register at all? But how? We could use the GS register (like the Linux kernel does), but then what about all the __thread variables that the kernel scheduler and many other parts depend on, and whose reads and writes the compiler handles automagically?
- Use assembly with gs:xxx code?
- Is there a compiler option to make the __thread modifier generate code using GS instead of FS?
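As an illustration of the first option, here is a minimal sketch of GS-relative access written for Linux x86_64 userspace, where the GS base is normally unused and can be set safely with ARCH_SET_GS. The struct kernel_tls_block, its fields, and the helper names are all hypothetical and do not reflect OSv's actual layout.

```c
#define _GNU_SOURCE
#include <asm/prctl.h>    /* ARCH_SET_GS */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical stand-in for a block of kernel thread-local state. */
struct kernel_tls_block {
    uint64_t current_thread_id;
    uint64_t preempt_counter;
};

static struct kernel_tls_block demo_block = { 42, 7 };

/* Point the GS base at the given block; returns 0 on success. */
static int install_gs_base(void *base)
{
    return (int)syscall(SYS_arch_prctl, ARCH_SET_GS, (unsigned long)base);
}

/* Load a 64-bit value from gs:offset. */
static inline uint64_t gs_read64(uintptr_t offset)
{
    uint64_t value;
    __asm__ volatile("movq %%gs:(%1), %0"
                     : "=r"(value) : "r"(offset) : "memory");
    return value;
}

/* Store a 64-bit value to gs:offset. */
static inline void gs_write64(uintptr_t offset, uint64_t value)
{
    __asm__ volatile("movq %1, %%gs:(%0)"
                     : : "r"(offset), "r"(value) : "memory");
}
```

After install_gs_base(&demo_block), a call such as gs_read64(offsetof(struct kernel_tls_block, preempt_counter)) reads the field without touching FS. The cost of this route is that every __thread access would need to become an explicit accessor like these. GCC and Clang also provide the __seg_gs named address space qualifier, which might reduce the amount of hand-written assembly, but this sketch does not establish whether it can replace __thread.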
Here are all the thread-local variables the kernel seems to be using (based on readelf -W -s build/release/loader-stripped.elf | grep TLS):