Closed steipete closed 8 years ago
The static initializer design came in 1.11.14 to ditch a previous system which relied on application initialization (that's the currently empty LibraryInitializer in init.h). Before this Botan had a global state object with an RNG, runtime settings, and code implementing the string->T*
operations, but I wanted to remove the global state, and initializing just maps of string->function<T*()>
as Algo_Registry
does seemed safe enough. But, it is not. The first version broke static linking (#52) and the current fix (#279) has some serious deficiencies, for example when using the static library, the OpenSSL version of the block ciphers occur in a file that is not otherwise considered needed by the linker, so the registrations will not be pulled in and the ciphers will be unavailable to applications at runtime, even if they were in libbotan.a. And as you find static constructors happen in any order. I'm pretty sure what Botan itself is doing at static init can be run in any order safely, as all it is doing is creating std::map entries, but interspersed with application code calling the library is another story.
So, the current situation is not good imo, and I have in the past given thought to changing this before 2.0. I think part of the solution will be adding an (idempotent) initialization function register_all_algorithms
which runs everything that is currently running as a static constructor. This could be called from LibraryInitializer
s constructor also; all applications written against a version before 1.11.14 create such an object already. If the constructor is idempotent and the destructor is a no-op that avoids the primary problem I was avoiding by removing the global state, which was the problem of nested initialization (in an application which both uses botan directly and uses a library which uses botan, who initializes? who deinitializes?). In retrospect, using refcounting on the state and just initializing once would probably have worked fine. So it goes.
I'm not completely sure what the fix for this looks like. Brute force seems almost temping - have register_all_algorithms
call the new register_all_hash
, register_all_block
, etc and those functions call what is currently static constructed in hash.cpp, block_cipher.cpp, etc. But that leaves behind OpenSSL again; perhaps a register_all_openssl
? It feels somewhat ugly but is the best I have been able to come up with (one reason I haven't changed it so far; I'm hoping to see a cleaner solution)
Another option that came to me (not fully fleshed out, and it's late here) - move the (already global/shared/static) Algo_Registry<T>
to T itself, say static const Algo_Registry<BlockCipher>& BlockCipher::registry()
. BlockCipher::create
becomes return registry().make(arg)
. Block ciphers can be registered with BlockCipher::registry().add(...)
. On first use of the block cipher registry, all of the known algorithm implementations (already listed in block_cipher.cpp) could be registered. This still doesn't help OpenSSL however.
One remaining problem is that PK operations (the objects that actually do signing, decrypt, etc given a key) are also looked up through Algo_Factory
as this allows multiple implementations (see for example rsa.cpp and openssl_rsa.cpp). I'm not sure what the best approach for those is, as almost of the PK_Ops classes are (and should be) internal; I'd rather not expose them in headers just so a registration function can pull them in.
tl;dr I don't like it either, just not quite sure what the fix is
On further thought I think the fix here is pretty straightforward.
Right now the whole reason for all of this static initialization is to set up strings mapping to std::functions which construct a particular type (say BlockCipher). And this allows adding new cipher, or new implementation of an existing cipher, at runtime. So theoretically all very flexible. But in practice nobody needs to do any of these things. Much of the benefit of the original version of this lookup scheme in 1.11.14 was it avoided having any kind of global per-object list. But then later compromises in order to support static linking brought back exactly that per-T list, with every BlockCipher having to be mentioned in block_cipher.cpp in order for lookup to work, but now with several layers of macros and runtime operations to make it all work.
So, have BlockCipher::create just directly do the lookup and return a new instance of the type, doing CPUID lookups and the like. Conceptually just
#if defined(BOTAN_HAS_AES)
if(name == "AES-128") {
#ifdef (BOTAN_HAS_OPENSSL)
if(provider == "openssl") return new OpenSSL_Cipher(...);
#endif
if(CPUID::has_ssse3() && provider == "ssse3") return new AES_128_SSSE3();
return new AES_128;
}
#endif
and so on times everything.
This prevents new algorithms from being added to this lookup table at runtime. But if someone does have some custom algorithm (or new impl of an existing algorithm) they want to use (say SPECK) they can still subclass BlockCipher and instantiate their types directly; but BlockCipher::create("SPECK") won't work. Some library APIs do only accept strings I think, but versions taking a BlockCipher& could be added where reasonable. And for some of these types the additional flexibility provided is just implausible: how often does someone add a new KDF algorithm at runtime?
Removing Algo_Factory
from the loop will also avoid taking a lock while doing each string->T lookup - reinforcing I think the value provided by this change.
This transition can be done incrementally for each type. Best to start with a simple case like KDF or PBKDF.
Happily none of this impacts public API in any way, as algo_registry.h is internal and so all uses of it are contained in either internal headers or the library itself.
The algo_registry code is also used by the public key code in a somewhat different way. I'm working on this one and hope to have a PR for review this week.
As of 1.11.33 static initializers should trouble you no more. Botan can now safely be called before main starts or even (somewhat dicier) after main returns.
Hi there - first of all, thanks for this wonderful library and the opportunity to ditch OpenSSL. We're using Botan in our project (PSPDFKit) and it mostly works great. One design problem is Botan's use of static initializers for Algorithms.
Botan's static constructor runs very early:
However, we have code that uses
__attribute__((constructor))
which runs even earlier, and we're running into issues where the algorithms can't yet be found in the registry.Is this something you would consider changing? I'm not an expert in C++ yet so I'm not sure what I'd recommend. From doing some research, it seems that static initializers are really not a good idea: http://meowni.ca/posts/static-initializers/
It also seems that relying on ordering is a path for trouble: https://isocpp.org/wiki/faq/ctors#static-init-order
In this document they also mention a few potential alternatives that might work better. I wonder if we're the only ones that have issues with the current approach? I searched the existing tickets and the 2.0 roadmap but couldn't find anything related.
Let me know if I can help.