randombit / botan

Cryptography Toolkit
https://botan.randombit.net
BSD 2-Clause "Simplified" License
2.58k stars 567 forks source link

Proposed changes for use in regulated applications: request for comments #486

Open DanielTBrown opened 8 years ago

DanielTBrown commented 8 years ago

Botan is a wonderful tool for encrypted communications. I am integrating botan as a TLS library into a new product. The software is subject to DHHS HIPAA privacy regulations, and in some uses may be subject to FDA regulations. The software is embedded, there is no user interface. I have run into several issues, and I would appreciate any comments about my proposed solutions:

Compilation warnings

The software must compile without warnings. The existing botan build suppresses Microsoft Visual C++ compiler warnings C4250, C4251, C4267, and C4275. I compiled with all warnings enabled:

Regarding C4267, I submitted pull request #484 to fix about 70 places the source code used implicit casting to narrow a value (e.g. assigning a size_t to a uint16). I made the casting explicit, eliminating the compiler warning messages. This should not change the generated code.

Regarding C4250, I have not fixed this yet. The warning occurs because of specific inheritance in a few C++ classes:

DL_Scheme_PrivateKey DH_PrivateKey EC_PrivateKey ECDH_PrivateKey ECDSA_PrivateKey IF_Scheme_PrivateKey RSA_PrivateKey

I assume that all of these classes have the same problem, and that with some investigation, I can find a simple fix. Any guidance would be appreciated.

Regarding C4251 and C4275, these warnings appear to affect only DLLs, and do not occur when compiling for static linking. So, if configure.py is called with --disable-shared, the warnings can be left enabled. I do not see how to control the /wd4251 and /wd4275 compiler options in configure.py, should I use a '#ifdef' and '#pragma warning' in the C++ source code?

Some background

The software uses only the TLS component of botan. The targets are: Microsoft Windows 64-bit (a vestigial target, only for equipment for which no Linux drivers exist), Linux 64-bit (the primary target), and if possible, an embedded ARM microcontroller (for future battery-operated equipment).

I invoke botan configure.py with '--gen-amalgamation', and '--disable-shared'. Then I compile and link botan as part of the project. The software is statically linked, in order to control the runtime configuration.

The software must conform to USA DHHS HIPAA regulations, which govern personal identifying information. The concept of 'personal identifying information' is interpreted broadly. Even an IP address is considered personal identifying information. A typical penalty for an unintentional first-time violation is apparently US$ 50,000 followed by several years of mandatory monitoring.

The software may be used in cases in which it is subject to USA FDA regulations regarding medical instrument software. I have written the software to conform to those regulations. The FDA has few specific requirements about how the software is written, instead the FDA requires that the software be created according to a written plan, and that this plan address software validation and verification. The verification process must show that the software does what is intended. The verification process must address updates, so each update must be verified as well.

The FDA potential penalties are harsher than those of the DHHS. When a medical instrument is submitted to the FDA for clearance, somebody signs a piece of paper that says they may go to prison if any of the information is not true.

In addition to US government regulations, there are also standards governing medical instruments, for example, IEC 60601. Among many other requirements, IEC 60601 specifies the maximum interval from when an instrument senses an alarm condition until when that condition must be reported to the operator.

I try to meet these various requirements 'by design', to the extent possible. Consider error message logging. It is easy to accidentally write personal identifying information in a log file. In that case, the log file must be treated as containing patient data, resulting in many complications. If an error log contains dynamically-generated strings, the software verification procedure might be: have a knowledgeable person review every place in the source code where an error message is generated, to ensure that the information logged cannot be used to identify a patient, according to the HIPAA standards in place at the time of release. This is very expensive.

A better approach is to log only static literal text strings, because in that case it is impossible to log information that identifies a patient. I recognize that static literal text strings are not as useful as messages containing specific information, but imagine the following scenario: a programmer logs an error message designed to catch client software that generates faulty requests: 'invalid URL on HTTP GET request (IPv4 nnn.nnn.nnn.nnn)'. If this code ships in an instrument, it would be easy to have an incident that requires reporting to the DHHS, followed by an instrument recall, a fine, and several years of monitoring. The message 'invalid URL on HTTP GET request' is less useful when trying to track down the source of the problem, but there is no question which I would pick.

One more point: as the software provider, I am responsible for the complete behavior of the software, including the behavior of the C++ runtime library that I use. My responsibility includes more than just the source code that I write. As an example of the implications, I statically link the executable image with the various runtime libraries, so each shipped version of the software has a fixed configuration that cannot by altered by someone swapping out a DLL or so file.

Executable image size

On Microsoft Windows, the botan library increases the size of the executable image of my program from approximately 200k bytes to more than 900k bytes. This is strange, because the source code of the rest of the program is far larger than the botan source code. The program itself is non-trivial: it provides multiple forms of high-performance real-time communication, high-speed image acquisition, processing, and communication, as well as real-time instrument control. However, including botan increases the size of the software by several times.

The link map seems to show that botan is unnecessarily pulling in a large amount of the C++ runtime library, including many functions related to I/O and threading. I suspect, but have not yet proved, that the C++ runtime library is being pulled in by iostream operations.

This size is of great concern: my microcontroller target has a total memory of 1-2M bytes, and has a very limited C++ runtime library that does not include iostream (for obvious reasons). Also, I hope to avoid verifying operation of the C++ runtime library, and I hope to avoid proving that internal data cannot be re-routed to mass storage by swapping a virtual function pointer.

On inspection, I see that botan uses iostream and related classes almost exclusively to generate text strings by concatenation. I could easily replace these operations with simple string operations (std::string), and eliminate the use of iostream and related classes. It appears that with at most a few days work, the excess runtime library calls could be eliminated, reducing the size considerably.

Error messages

For error messages, my software writes only static literal text strings, never dynamic strings. The reasons are explained above. However, I do not believe that other botan users would appreciate me changing the botan Exception messages to static text strings.

A solution is to conditionally define the botan Exception base class so it does not store the text string. The C++ compiler appears to properly optimize away the code that generates the dynamic text strings, because the software executable image is approximately 12k bytes smaller with this change.

I am not certain of the best way to handle the different definition of the botan Exception class.

Operating system and runtime library calls

The botan source code contains some operating system calls and some calls to the C++ runtime library that are problematic for me. For example, the 'mutex' calls use the C++ runtime library mutex handling, which appears to bring in C++ runtime library threading code. That in turn would require me to verify for each release that the software meets the IEC 60601 alarm condition reporting time requirements.

The best solution - which would probably take some work - would be to define a new operating system type, 'none' (or more descriptively, 'bare metal'), in which case botan might use application-provided calls. Does this sound like a reasonable approach, or is there a better one that I am missing?

Thank you for any assistance.

randombit commented 8 years ago

Thank you for the clear problem statement.

Warnings

Running cleanly under all compiler warnings and static analyzers is our target state. I think #484 should be no problem, sorry on the delay getting it reviewed.

I have never been clear on what exactly C4250 meant or how to fix it. These classes use multiple (and virtual) inheritance, which is the trigger for the warning. Splitting up and simplifying this hierarchy would make sense, but is a non-trivial change with some API implications. Outside of that I don't know if there is any other way to avoid the warning.

C4251 seems unavoidable in a DLL build - nearly all of Botan's exported types use STL types (vector, string), and that's that. An API specifically designed to be safe to call from a Windows DLL is a different beast entirely - straight C API, no passing memory ownership back and forth across the library interface. The FFI interface was designed to be safe to call from ctypes like libraries and has (some of) the same constraints.

The warnings are disabled via explicit setting in src/build-data/cc/msvc.txt - either edit that file, or use a #pragma

Iostreams

In general Botan is developed on relatively large systems (ie, there is an MMU, an OS scheduler, and at least a few megs of RAM). Supporting a microcontroller such as the one you describe (ARM, 1-2 Mb ROM, RTOS/No OS?) is definitely pushing the limits of where Botan has gone, though I think it's completely viable. I think it would be great to better support such targets, but just so you're aware this part may be bumpy.

(At the level of an 8-bit AVR with 8KB ROM a completely different approach is required for which Botan probably isn't suited.)

For error message generation I'm fine with replacing stringstreams with string calls. Any uses of ifstream, ofstream in APIs would have to be guarded by a build-time macro check (BOTAN_HAS_IOSTREAMS, I guess). In most cases a function which uses file objects should be compiled out completely when BOTAN_HAS_IOSTREAMS is not defined, rather than throwing a 'not supported' exception. For example DataSource_Stream (and transitively, any functions which instantiate DataSource_Stream) should be compiled out completely for such a build. This will go up a ways, for instance there is a constructor of X509_Certificate which takes a filename, which passes it to X509_Object which opens a stream. Removing the API endpoints completely has the benefit of making it clear at application build time what's not available, and removes some otherwise dead code from an iostreams-free build.

One change involved in this that I know will be more complicated: read_cfg (in utils/read_cfg.cpp) uses istream and getline to read lines from a configuration file. This is used to parse the KV list of OIDs (oids.cpp and default.cpp) at runtime in order to map names back and forth with OID values. It would be better to do this once at compile time, either using a C++ DSL or a Python codegen script.

Exceptions

I think the Exception messages as used by the library would be quite unlikely to reveal any PII, but a $50,000 fine does have a way of making one cautious. And the smaller build size will be helpful for microcontroller, so allowing this as a conditional compilation option seems fine.

Off the top of my head I don't see any problem with

class BOTAN_DLL Exception : public std::exception
   {
   public:
#if defined(BOTAN_EXCEPTION_NO_MESSAGE)
      explicit Exception(const std::string& unused) {}
      Exception(const char* prefix_unused, const std::string& msg_unused) {}       const
      char* what() const BOTAN_NOEXCEPT override { return "Messages disabled"; }
#else
   // current impl
   private:
      std::string m_msg;
#endif
   };
Threading

There is some use of std::async in the existing tree; many of these, such as the ones in the public key functions, should be removed. They do offer some performance advantage on multicore, but starting a new thread per RSA computation is quite expensive and the best std::async can do. It would be better to re-serialize the operations for the normal case, then offer a (compile-time) alternative parallel mode using a lighter parallel mechanism like Cilk or OMP for multiprocessor systems. A couple of calls for async are used for IO overlapping. I'd have to look at those more carefully.

For the mutex types, would it work to just have a botan_mutex typedef which implements the basic lock interface, and use this instead of std::mutex in the library? Then at build time it would be possible to switch to an OS interface directly, or even be no-op, if the build is known to be single threaded. This would allow options like

#if defined(BOTAN_IS_SINGLE_THREADED)
   void noop_mutex {
      void lock() {}
      void unlock() {}
    };
    typedef noop_mutex botan_mutex;
#elif (IS_WINDOWS)
   typedef Win32CS_mutex botan_mutex;
#elif USE_ATOMIC_SPINLOCK
   // simple spinlock using atomics
#else
   typedef std::mutex botan_mutex; // default
#endif

Something like this is already done in algo_registry.h for Windows builds. MSVC 2013 has a bug regarding use of std::mutex during static initialization, which is avoided by using critical sections instead.

Hopefully it is ok to still use the simple std::lock_guard template without pulling in anything at runtime.

Hope this helps. I know this isn't a 100% blueprint but feel free to open discussion issues on particular parts or open WIP PRs for review.

DanielTBrown commented 8 years ago

Thank you for the detailed response. This looks very good. I have some comments, some additional background, and some additional proposed changes.

Comments

The Microsoft Visual C++ warnings C4251 and C4275 do not arise in my case, because I am statically linking botan into my executable image. I am not certain why someone would build a security library as a DLL. So, I will probably move the compiler flags into the build to make them visible, removing the #pragma.

The Microsoft Visual C++ warning C4250 (related to multiple inheritance with virtual functions) will take some work to eliminate. However, this work may be useful. I checked a few classes, just to get an idea of what might be involved. I found:

As an example of a problem with a simple solution, the function load_check is declared virtual in the base Public_Key class, and overridden in Private_Key. Those are also the only two implementations. However, the function is called only on classes derived from Private_Key, so the function does not have to be virtual, and does not have to exist in Public_Key.

Many of the warnings are reported due to inheritance derived from Public_Key and Private_Key. However, Private_Key is not really necessary: the class contains no additional data, only a few functions. If those functions could be moved into Public_Key, then multiple inheritance involving Private_Key would disappear. Perhaps a 'Base_Key' class could replace them. You can see what is going on here: http://botan.randombit.net/doxygen/classBotan_1_1Public__Key.html

The class TPM_PrivateKey causes some problems, but I do not see that this class is ever used.

I suspect that eliminating the warning C4250 will improve the code. I am happy to do this, and you can see what you think of my proposed changes.

Additional Background

You are correct that my embedded target is a 32-bit ARM processor, nothing smaller. A 32-bit microcontroller with ~1MB flash and ~200kB SRAM, including hardware floating point, costs ~$10. Memory sizes and speeds have been increasing slowly, but power consumption has been dropping rapidly. For example, the STM32L4 series consumes ~1uA operating current while stopped with SRAM preserved. You could have a botan T-shirt that includes botan running on an embedded microcontroller.

My initial target remains a Linux system on x86-64 or ARM, so botan is a good fit. The software is designed to run later on a bare metal microcontroller using lwIP for TCP/IP. I would like to use botan on that target.

I have run into some additional issues, here is some background:

Human evolution is slow, a product that works well today will probably work well a generation from now. Product updates are required when components become obsolete or regulations change. It is frustrating to lose a profitable product because the development software is no longer available. That has happened to me several times. For C/C++, a suitable compiler will be available for decades. However, the language runtime is a major problem, so I avoid using it.

The language runtime may make operating system calls (e.g. std::mutex), so the runtime library must support the target operating system. You can see the problem: when updating a product, I may need a current compiler (to support the target processor), requiring the associated runtime library, which may not support the (then obsolete) operating system. Worse, on an RTOS or bare metal, the C++ runtime library is usually limited or non-existent. So, I call the underlying operating system directly rather than using the language runtime library. The operating system calls are usually well-documented, well-maintained, have few bugs, and may offer important features not included in the runtime library.

There is also a more bizarre problem you may find entertaining: I may have to substitute my own code for operating system calls. Consider the time of day, a common function. An application can contact a network server to synchronize an application time-of-day clock. This avoids calling the operating system to obtain the time of day, which is a good thing.

The operating system calls are based on a clock that runs on battery power when the computer is turned off. If you want to see the headaches that building in a coin cell adds to a product, look at this document from Panasonic: https://na.industrial.panasonic.com/sites/default/pidsa/files/panasonic_lithiumcr_s_info.pdf Notice the special State of California warnings required, and the US DOT packaging requirements. So, I probably ignore the operating system time-of-day calls, and implement my own time of day handling based on getting the network time. I will have to work out a way to collect the botan operating system interface.

As you can see, building products like this is a matter of getting the details right - probably very similar to working on an encryption library.

Additional Changes

There are several important changes:

The callbacks (e.g. output_fn, data_cb) lack a context value (e.g. a 'void*'). I need some way to obtain a context value for dynamically-created objects, std::bind is not adequate. I know this means an API change. Since all the callbacks are associated with a channel, I suggest defining a class containing the callbacks as virtual functions. The calling application then passes a pointer/reference to an object of a derived class to the client/server constructors.

An alternative - which could be implemented without an API change - would be to make TLS::Client and TLS::Server derived classes of TLS::ClientX and TLS::ServerX respectively (names need improvement). The '...X' classes implement the callbacks through a callback object (as described above), and the TLS::Client and TLS::Server classes then use a callback object. Something like:

class ChannelCallback { virtual void output_fn(...) = 0; ... }

class ChannelCallbackX { void output_fn(...) {m_out(...); } override; ... }

class Client: public ClientX { ChannelCallbackX m_callbacks; }

I have to look into memory allocation. The problem is, on an active instrument, the network must never cause the instrument to fail. Almost all regulated applications require me to prove that the device will continue to work, regardless of what happens on the network.

When I open a TCP/IP connection, I allocate all the buffers that may be needed. The same when I send or receive a message. The software has limits on the number of connections and the number of messages. I can then prove that the device will never run out of memory, given a specific memory configuration and specific limits. The situation is complicated by possible memory fragmentation, but you can see the idea. The device might refuse a connection, but that situation is easy to analyze, much simpler than predicting what happens when the network handling consumes all available memory.

Several underlying botan classes use dynamic allocation during operation (e.g. BigInt). I need to work out a way to deal with this. The ARM mbedTLS library uses built-in limits, for example, it limits the size of an equivalent of a BigInt. I will need a solution for this problem before deploying on a microcontroller, or considering the software for an FDA-cleared instrument, but I do not need a solution yet.

There are several changes I suggest making, but these are not necessary:

It would make sense to replace the botan-specific types with the equivalent C++ standard types, for example, use uint32_t instead of u32bit. This can be done with simple global search-and-replace, and should have no code impact. It would allow the 'types.h' module to be eliminated. The advantage is that the botan source code would then use standard C++ types consistently.

I suggest using the C++ types 'intptr_t' and 'uintptr_t' instead of 'word'. 'intptr_t' is what an C/C++ 'int' should have been: the natural size of an integer on the machine.

Next Step

If you think I am heading in a reasonable direction, I will start implementing these changes in pull requests, so you can see if they are reasonable.

Thank you again for your help.

randombit commented 8 years ago

TLS Callbacks: Your approach of an object with all the callbacks was independently invented for the purpose of reducing API complexity in PR #457 which is currently under review. I do like this change to the API, so being able to attach additional context by deriving from a TLS Callback type is likely to be available in a future release.

Removing Private_Key: The current inheritance hierarchy in the public key types is very messy and by far the most complicated use of derivation in the library. However I'm not sure that removing Private_Key outright is the right fix for this issue, or the right API choice. Allowing an application to distinguish at a type level between public and private keys seems useful. I will think about this some more, and review how different APIs approach this.

Part of the problem is the scheme-specific classes (DL_Scheme_PublicKey, IF_Scheme_PublicKey, EC_PublicKey also derive from Public_Key and Private_Key) which leads to this mess https://botan.randombit.net/doxygen/classBotan_1_1RSA__PrivateKey.html

I will look at load_key - it does sound like this can be removed.

Int Typedefs: The current typedefs for integers is a relic of C++98 not having support for stdint.h. Much of the code written more recently used the standard names already, it would be good to standardize on this. Currently there are some large PRs still under review so the timing for this change is not good, but I will make this change

This does not allow removing types.h or the typedefs immediately as both are part of the public API. Removing the Botan::byte and Botan::u32bit typedefs would likely break all current code using the library. But they could be documented as only being for compatibility, and not used anyway within the library itself.

I see your point with uintptr_t, but for builds without special assembly support we rely on having a dword type which is twice the size of word in order to perform multiplication. So for example on a 64-bit SPARC v9 (which has 64 bit addressing but no 64x64->128 bit multiply operation),uintptr_t is 64 bits but word must be 32.

Time: There are quite a few uses of std::chrono::system_clock currently.

On a quick check, many uses of the chrono system clock boil down to just getting seconds since epoch and doing some basic duration calculations, or are reading the system clock to get a time value which really should be an argument to the function (example of this in X509_CA::make_crl). In most cases a uint64_t with seconds since epoch probably is just as clean as the std::chrono code - we're not doing anything complicated.

It's also turned out that std::chrono is not universally well implemented, for example on some versions of MSVC the high_resolution_clock is anything but.

The code that just needs time since epoch for local duration checks (for example TLS::Session) can call a new function in os_utils.h uint64_t get_system_timestamp which returns time since epoch in seconds. On normal platforms this can probably just be time(nullptr), and it creates a single entry point to allow overriding the behavior in other environments.

The PBKDF algorithms have a mode which allows, instead of requesting a specific number of iterations, to hash for the given amount of time (in milliseconds) and returning both the key and the iterations used. The PBKDF code currently uses std::chrono::high_resolution_clock for this. It could use a function from os_utils.h instead.

Memory Usage: Even outside the embedded space there are good reasons to have predicable memory usage bounds. Some code, such as the TLS handshake, does quite a few heap allocations which are avoidable with a bit of work, improving performance on all platforms. But fully quantifiable memory bounds all the way through the TLS stack (so also including RSA, ASN.1 parsing, X.509 path validation among others) is not going to be trivial.

Trafo commented 5 years ago

Their are still warnings. For example warning C4250 is still not fixed.

randombit commented 5 years ago

@Trafo indeed. Patches are welcome. It is pretty easy for MSVC to regress on warnings because I don't use it. So at best once in a while I scan over the CI log and make changes that I think will resolve the issue. GCC and Clang warnings I actually see during development, which is why there are none.