thinking about alternatives to sodium_init

oconnor663 commented 4 years ago

As I get more experience working with libsodium, I find myself wishing that sodium_init was implicit and automatic rather than explicit. The biggest issue it causes for me is that it leaks through any abstraction created on top. For example, if I want to build a hash table library using crypto_shorthash, my library now needs to expose and document a global init function that wraps sodium_init. Another issue I've run into is that, counterintuitively, most of libsodium seems to work just fine if you don't call sodium_init. That makes me worry that lots of callers simply ignore the requirement, which I think exposes them to obscure race conditions on non-Posix platforms that they're very unlikely to uncover in testing.

Have you considered checking for initialization implicitly in every API function? (Or at least, at every point where we reach for a global data structure.) I assume the best way to do this would be with some global atomic flag, possible wrapped up in a "once" sync primitive, with the cost of checking it being a single read-acquire in the common case. I think that's cheap enough that it would disappear into the noise compared to e.g. computing a single block of ChaCha20. Is that right in your experience? Thinking about other problems: Does libsodium need to support platforms that haven't implemented C11 standard atomics? Would it be an interrupt-safety issue to take a lock implicitly? What else might I be missing?

jedisct1 commented 4 years ago

Calling sodium_init() is required.

Some functions will not panic if the library hasn't been initialized, but this is only for ABI compatibility with NaCl.

Calling a function when an application starts or when a library is loaded should never be a problem.

Among other things, sodium_init() initializes the PRG. On platforms without kernel support, that means opening /dev/urandom, and this is something we want to do before a chroot() call, not on the first use.

On Linux, getrandom() may also block, possibility for a long time. Again, this is not something we may want to happen at unexpected places. An initialization function is the best place for this. Same for CPU features detection. If hardware support for AES is not available, this is something that applications should discover when they start, not when they try to use it.

With gcc/clang/icc, your library can use a library constructor to initialize itself and its dependencies. You can leverage this to call sodium_init().

oconnor663 commented 4 years ago

With gcc/clang/icc, your library can use a library constructor to initialize itself and its dependencies. You can leverage this to call sodium_init().

That's a good idea, I'll give it a shot. Thanks for the quick response.

oconnor663 commented 4 years ago

Still thinking about this: The considerations you mentioned are a strong argument for supporting explicit initialization, but I don't know if that's the same thing as requiring explicit initialization. Could it be the best of both worlds to allow sodium_init to work as it does today, but to also implicitly call sodium_init the first time we access the global PRNG, architecture-specific function pointers, or other shared state? I think the biggest benefit would be a cleaner story for depending on libosdium in library code, but making libsodium thread-safe in all cases also seems like a safety win.

I wonder how many applications in the wild today are getting this wrong. Do you think conjuring up some data or examples might be useful?

jedisct1 commented 4 years ago

Libraries that don't have an initialization function can still call sodium_init() before every function call.

There are no thread safety issues except on platforms that don't support locks nor atomics. sodium_init() can be called multiple times and from multiple threads.

jedisct1 / libsodium

thinking about alternatives to sodium_init #973