std.crypto.random isn't guaranteed to work on freestanding/UEFI targets because of TLS

drew-gpf commented 3 years ago

Essentially, std.crypto.random relies on threadlocal/TLS which may or may not be implemented depending on what platform the code is running on. The --single-threaded switch kind of fixes this by just turning threadlocal declarations into globalvars with the cost of, for example, std.Mutex not working; in other words, the current assumption is that any non-single-threaded environment is guaranteed to support TLS, which is incorrect. Most (all?) OS kernels, for example, won't bother implementing TLS for kernel code; it also affects UEFI targets, albeit extremely rarely, as they can use the MP protocol or manually bootstrap processors for simple multithreading.

An example of why this is problematic can be found with this x86_64-freestanding application:

const std = @import("std");
const rand = std.crypto.random;

export fn _start() void {
    _ = rand.int(u32);
}

pub fn cryptoRandomSeed(buffer: []u8) void {
    for (buffer) |*val| { // placeholder entropy generation
        val.* = 0;
    }
}

This compiles just fine and seems correct but litters the code with fs:[0] references which could mean absolutely anything depending on what context this is being executed on. With my hobby OS, for example, it could either dereference NULL or extract and start dereferencing pointers from an extremely critical application-controlled register context structure - leading to unpredictable behavior or some kind of exploit.

A simple and desirable fix would be to add an option similar to --single-threaded that just treats threadlocal vars as global, where --single-threaded would then imply that hypothetical switch. This however introduces race conditions if the caller doesn't explicitly lock/synchronize function calls.

The switch could also be used to throw an error during compilation if threadlocal is used at all, which would prevent accidental race conditions with the downside being that the code can't be used without modification.

Another option is to change std.crypto.random - and anything that relies on threadlocal in the future - to have a non-TLS variant where either globalvars with synchronization or a caller-defined mechanism for storing data would be used (although ideally something like this would just store data locally which could work for anyone). This way, the threadlocal-error-switch would also seem more justified; no ill-formed programs would then be able to accidentally exist.

Suggestions or criticism is appreciated. This isn't too important but it seemed like a bummer that std.crypto.random couldn't actually be used in a freestanding environment, especially given that it inherently should just work(TM).

Aransentin commented 3 years ago

Notably old ARM CPUs don't support TLS either, which can be kind of a bummer since Zig is otherwise extremely good for embedded work.

andrewrk commented 3 years ago

Hmm what about exposing the ability to override cryptoRandom directly rather than cryptoRandomSeed? This way the application/OS would be in complete control of the code path any time std.crypto.random was called.

It would even be able to use @compileError to outlaw use of that API entirely.

N00byEdge commented 3 years ago

In my humble opinion, using TLS on any freestanding target should be a hard error, and that we should wait for the address space pointer attributes if someone wants to use fs:[] references. Not sure what could be done with the code that uses it though.

daurnimator commented 3 years ago

Isn't the solution to implement TLS on freestanding targets as part of implementing compiler_rt for that target?

drew-gpf commented 3 years ago

Hmm what about exposing the ability to override cryptoRandom directly rather than cryptoRandomSeed? This way the application/OS would be in complete control of the code path any time std.crypto.random was called.

This could work. I was also thinking it could be made easier to implement by having a CryptoRandom struct that stores the relevant information within itself as to allow all state to be handled by the application itself e.g. the struct and its data could be automatically stored on the stack, heap, etc. This is preferable to giving it an allocator as it also allows you to shove it into a per-CPU structure which would act as an efficient form of manual TLS. The application-defined std.crypto.random implementation could then just use that CryptoRandom struct for identical functionality.

It would even be able to use @compileError to outlaw use of that API entirely.

I disagree with this being the way of handling the TLS issue as it can still lead to silent fs:[offs] in other threadlocal code or if the application dev is just not aware of this. I think the better solution is to emit a compiler error if threadlocal is used at all which automatically handles this case and every other possible case. I also think however that TLS usage should be behind a switch with a name like --freestanding-uefi-whatever-tls-supported where freestanding/UEFI/etc targets by default don't allow TLS to be used unless the implementation pinky-swears that it's implemented.

A way to do this which would also work with C code is to force TLS emulation to be used which will cause link errors since the application would then need to define TLS emulation functions; accomplishing essentially the same thing. IIRC this is done by MSVC for UEFI targets. There would still ideally be a switch for these targets to say that they don't need TLS emulation.

Isn't the solution to implement TLS on freestanding targets as part of implementing compiler_rt for that target?

I'm not sure what you mean exactly but you just can't handle TLS in this context. Say Zig were to be used for an OS device driver (Windows NT, Linux, etc); you can't freely write to the FS register base to facilitate TLS - it's used by applications and would have to be set after every single context switch for every single Zig driver - and those platforms don't support it in kernel code. If TLS emulation is forced, you still wouldn't have a reliable way of automatically determining thread/processor/whatever ID (let alone the logical errors which come with this - how would interrupt routines support this?), and so on, and so on. It's just a bad idea to implement this automatically for these targets.

I hope the attitude towards this never becomes "any freestanding target built with Zig is actually an OS kernel and so if you want to use certain pieces of code without silent failure you have to implement automatic TLS support" because 99% of the time compiler-supported TLS is either impossible or impractical to implement.

daurnimator commented 3 years ago

I'm not sure what you mean exactly but you just can't handle TLS in this context. Say Zig were to be used for an OS device driver (Windows NT, Linux, etc); you can't freely write to the FS register base to facilitate TLS

How to read/write TLS is defined per target; for platforms that don't support TLS natively (such as the examples you gave) then we should be falling back to 'emutls'. Currently the implementation in std/special/compiler_rt/emutls.zig requires libc, and may be openbsd specific. But I'm proposing that as a general fallback (and hence for unknown targets), emutls is used; with a missing implementation resulting in a compile error, or if that's not possible, a panic.

drew-gpf commented 3 years ago

How to read/write TLS is defined per target; for platforms that don't support TLS natively (such as the examples you gave) then we should be falling back to 'emutls'. Currently the implementation in std/special/compiler_rt/emutls.zig requires libc, and may be openbsd specific. But I'm proposing that as a general fallback (and hence for unknown targets), emutls is used; with a missing implementation resulting in a compile error, or if that's not possible, a panic.

Sorry if I seemed rude. This actually sounds like the ideal way to do it. I think libc/whatever should be a requirement to prevent accidental misuse and that missing implementation (which should happen by default for these targets) could be overridden by the freestanding application by emulating the functionality it uses libc for.

drew-gpf commented 3 years ago

Actually the OpenBSD comment was right since it would work weird on UEFI which uses PE and not ELF, so maybe a variant for PE files could be added too.

Khitiara commented 5 months ago

semi-related: #19943 can address for most freestanding since most freestanding platforms will override the getrandom

ziglang / zig

std.crypto.random isn't guaranteed to work on freestanding/UEFI targets because of TLS #9686