randombit / botan

Cryptography Toolkit
https://botan.randombit.net
BSD 2-Clause "Simplified" License
2.6k stars 570 forks source link

Trouble with static initializers for Algo_Registry #465

Closed steipete closed 8 years ago

steipete commented 8 years ago

Hi there - first of all, thanks for this wonderful library and the opportunity to ditch OpenSSL. We're using Botan in our project (PSPDFKit) and it mostly works great. One design problem is Botan's use of static initializers for Algorithms.

Botan's static constructor runs very early:

(lldb) bt
* thread #1: tid = 0x2d67f9, 0x000000010d15b104 PSPDFKitTests`Botan::Algo_Registry<Botan::BlockCipher>::global_registry() + 4 at botan_all_internal.h:70, queue = 'com.apple.main-thread', stop reason = breakpoint 6.2
    frame #0: 0x000000010d15b104 PSPDFKitTests`Botan::Algo_Registry<Botan::BlockCipher>::global_registry() + 4 at botan_all_internal.h:70
    frame #1: 0x000000010d168244 PSPDFKitTests`Botan::Algo_Registry<Botan::BlockCipher>::Add::Add(this=0x000000010e0c16d8, basename="AES-128", fn=Botan::Algo_Registry<Botan::BlockCipher>::maker_fn @ 0x00007fff5e299470, provider="base", pref='d')>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned char) + 52 at botan_all_internal.h:123
    frame #2: 0x000000010d107bd2 PSPDFKitTests`Botan::Algo_Registry<Botan::BlockCipher>::Add::Add(this=0x000000010e0c16d8, basename="AES-128", fn=<unavailable>, provider="base", pref='d')>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned char) + 50 at botan_all_internal.h:122
  * frame #3: 0x000000010d25e51d PSPDFKitTests`::__cxx_global_var_init.145() + 397 at botan_all.cpp:4695
    frame #4: 0x000000010d261523 PSPDFKitTests`_GLOBAL__sub_I_botan_all.cpp + 19 at botan_all.cpp:0
    frame #5: 0x000000010198e9cf dyld_sim`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 265
    frame #6: 0x000000010198eb48 dyld_sim`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
    frame #7: 0x000000010198a8a8 dyld_sim`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 342
    frame #8: 0x0000000101989bb8 dyld_sim`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 138
    frame #9: 0x0000000101989c4d dyld_sim`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 75
    frame #10: 0x000000010198232e dyld_sim`dyld::runInitializers(ImageLoader*) + 87
    frame #11: 0x0000000101986888 dyld_sim`dlopen + 900
    frame #12: 0x000000010480b7ec libdyld.dylib`dlopen + 59
    frame #13: 0x0000000104296ad9 CoreFoundation`_CFBundleDlfcnLoadBundle + 153
    frame #14: 0x00000001042962a9 CoreFoundation`_CFBundleLoadExecutableAndReturnError + 329
    frame #15: 0x00000001032c54da Foundation`-[NSBundle loadAndReturnError:] + 557
    frame #16: 0x00000001019ebfbe IDEBundleInjection`__XCBundleInjection + 1005
    frame #17: 0x000000010198e9cf dyld_sim`ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 265
    frame #18: 0x000000010198eb48 dyld_sim`ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) + 40
    frame #19: 0x000000010198a8a8 dyld_sim`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 342
    frame #20: 0x0000000101989bb8 dyld_sim`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 138
    frame #21: 0x0000000101989c4d dyld_sim`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 75
    frame #22: 0x000000010197fd31 dyld_sim`dyld::initializeMainExecutable() + 125
    frame #23: 0x0000000101983097 dyld_sim`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 3211
    frame #24: 0x000000010197f251 dyld_sim`start_sim + 136
    frame #25: 0x00007fff6610f724 dyld`dyld::useSimulatorDyld(int, macho_header const*, char const*, int, char const**, char const**, char const**, unsigned long*) + 1737
    frame #26: 0x00007fff6610e05c dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 208
    frame #27: 0x00007fff6610a276 dyld`dyldbootstrap::start(macho_header const*, int, char const**, long, macho_header const*, unsigned long*) + 512
    frame #28: 0x00007fff6610a036 dyld`_dyld_start + 54

However, we have code that uses __attribute__((constructor)) which runs even earlier, and we're running into issues where the algorithms can't yet be found in the registry.

Is this something you would consider changing? I'm not an expert in C++ yet so I'm not sure what I'd recommend. From doing some research, it seems that static initializers are really not a good idea: http://meowni.ca/posts/static-initializers/

It also seems that relying on ordering is a path for trouble: https://isocpp.org/wiki/faq/ctors#static-init-order

In this document they also mention a few potential alternatives that might work better. I wonder if we're the only ones that have issues with the current approach? I searched the existing tickets and the 2.0 roadmap but couldn't find anything related.

Let me know if I can help.

randombit commented 8 years ago

The static initializer design came in 1.11.14 to ditch a previous system which relied on application initialization (that's the currently empty LibraryInitializer in init.h). Before this Botan had a global state object with an RNG, runtime settings, and code implementing the string->T* operations, but I wanted to remove the global state, and initializing just maps of string->function<T*()> as Algo_Registry does seemed safe enough. But, it is not. The first version broke static linking (#52) and the current fix (#279) has some serious deficiencies, for example when using the static library, the OpenSSL version of the block ciphers occur in a file that is not otherwise considered needed by the linker, so the registrations will not be pulled in and the ciphers will be unavailable to applications at runtime, even if they were in libbotan.a. And as you find static constructors happen in any order. I'm pretty sure what Botan itself is doing at static init can be run in any order safely, as all it is doing is creating std::map entries, but interspersed with application code calling the library is another story.

So, the current situation is not good imo, and I have in the past given thought to changing this before 2.0. I think part of the solution will be adding an (idempotent) initialization function register_all_algorithms which runs everything that is currently running as a static constructor. This could be called from LibraryInitializers constructor also; all applications written against a version before 1.11.14 create such an object already. If the constructor is idempotent and the destructor is a no-op that avoids the primary problem I was avoiding by removing the global state, which was the problem of nested initialization (in an application which both uses botan directly and uses a library which uses botan, who initializes? who deinitializes?). In retrospect, using refcounting on the state and just initializing once would probably have worked fine. So it goes.

I'm not completely sure what the fix for this looks like. Brute force seems almost temping - have register_all_algorithms call the new register_all_hash, register_all_block, etc and those functions call what is currently static constructed in hash.cpp, block_cipher.cpp, etc. But that leaves behind OpenSSL again; perhaps a register_all_openssl? It feels somewhat ugly but is the best I have been able to come up with (one reason I haven't changed it so far; I'm hoping to see a cleaner solution)

Another option that came to me (not fully fleshed out, and it's late here) - move the (already global/shared/static) Algo_Registry<T> to T itself, say static const Algo_Registry<BlockCipher>& BlockCipher::registry(). BlockCipher::create becomes return registry().make(arg). Block ciphers can be registered with BlockCipher::registry().add(...). On first use of the block cipher registry, all of the known algorithm implementations (already listed in block_cipher.cpp) could be registered. This still doesn't help OpenSSL however.

One remaining problem is that PK operations (the objects that actually do signing, decrypt, etc given a key) are also looked up through Algo_Factory as this allows multiple implementations (see for example rsa.cpp and openssl_rsa.cpp). I'm not sure what the best approach for those is, as almost of the PK_Ops classes are (and should be) internal; I'd rather not expose them in headers just so a registration function can pull them in.

tl;dr I don't like it either, just not quite sure what the fix is

randombit commented 8 years ago

On further thought I think the fix here is pretty straightforward.

Right now the whole reason for all of this static initialization is to set up strings mapping to std::functions which construct a particular type (say BlockCipher). And this allows adding new cipher, or new implementation of an existing cipher, at runtime. So theoretically all very flexible. But in practice nobody needs to do any of these things. Much of the benefit of the original version of this lookup scheme in 1.11.14 was it avoided having any kind of global per-object list. But then later compromises in order to support static linking brought back exactly that per-T list, with every BlockCipher having to be mentioned in block_cipher.cpp in order for lookup to work, but now with several layers of macros and runtime operations to make it all work.

So, have BlockCipher::create just directly do the lookup and return a new instance of the type, doing CPUID lookups and the like. Conceptually just

#if defined(BOTAN_HAS_AES)
if(name == "AES-128") {
#ifdef (BOTAN_HAS_OPENSSL)
 if(provider == "openssl") return new OpenSSL_Cipher(...);
#endif
 if(CPUID::has_ssse3() && provider == "ssse3") return new AES_128_SSSE3();
 return new AES_128;
}
#endif

and so on times everything.

This prevents new algorithms from being added to this lookup table at runtime. But if someone does have some custom algorithm (or new impl of an existing algorithm) they want to use (say SPECK) they can still subclass BlockCipher and instantiate their types directly; but BlockCipher::create("SPECK") won't work. Some library APIs do only accept strings I think, but versions taking a BlockCipher& could be added where reasonable. And for some of these types the additional flexibility provided is just implausible: how often does someone add a new KDF algorithm at runtime?

Removing Algo_Factory from the loop will also avoid taking a lock while doing each string->T lookup - reinforcing I think the value provided by this change.

This transition can be done incrementally for each type. Best to start with a simple case like KDF or PBKDF.

Happily none of this impacts public API in any way, as algo_registry.h is internal and so all uses of it are contained in either internal headers or the library itself.

The algo_registry code is also used by the public key code in a somewhat different way. I'm working on this one and hope to have a PR for review this week.

randombit commented 8 years ago

As of 1.11.33 static initializers should trouble you no more. Botan can now safely be called before main starts or even (somewhat dicier) after main returns.