react-native-community / discussions-and-proposals

Discussions and proposals related to the main React Native project
https://reactnative.dev

C++ ABI stability Guidelines #257

Open shergin opened 4 years ago

shergin commented 4 years ago

Introduction

Some time ago we sat down and discussed ABI stability issues with a group of Facebook engineers who care about the topic. As a result, we assembled a document that discusses the problem area, trade-offs, and possible approaches.

We plan to use it as a basic guideline that C++ library (primarily client-side and mobile) developers might use to make a thoughtful decision to support ABI stability (or not). We also want to share the document with our partners and community to be open and transparent about our reasoning and values in this regard.

This is the first version of a fairly technical document; I would welcome any comments or criticism.

The document

C++ ABI stability for library developers

What is ABI

An ABI (Application Binary Interface) is similar to an API (Application Programming Interface), but for machine code.

A useful analogy is to think of an ABI as a kind of imaginary network protocol that defines how the compiled code of a function's caller and callee communicate (just like a client and a server).

Technically speaking, an ABI is a low-level, hardware-dependent format that defines how data structures or computational routines are accessed in machine code. So, an ABI defines how the high-level constructs like function arguments or data structures are represented (e.g. via CPU registers or stack-allocated memory) in machine code to perform a function call.

Concretely, an ABI defines exactly how to place a function's parameters in memory or in registers, and how to interpret the memory that holds them (e.g. how the order of the fields in a struct relates to the order of bytes in memory, how the fields are aligned and padded, etc.).
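To make the point about field order, alignment, and padding concrete, here is a minimal sketch. The struct names are hypothetical and the exact numbers depend on the platform; the only claim is that the two declarations disagree about where `value` lives, which is the kind of fact both sides of a call have to agree on.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record as declared by a library header.
struct Original {
  std::uint8_t tag;     // 1 byte, usually followed by padding
  std::uint64_t value;  // usually has to start on an 8-byte boundary
  std::uint8_t flags;   // 1 byte, usually followed by tail padding
};

// The same fields in a different declaration order.
struct Reordered {
  std::uint64_t value;
  std::uint8_t tag;
  std::uint8_t flags;
};

// The first member of a standard-layout struct always sits at offset 0.
static_assert(offsetof(Original, tag) == 0, "first member is at offset 0");
static_assert(offsetof(Reordered, value) == 0, "first member is at offset 0");

// True when the two declarations place `value` at different offsets.
// Compiled code hard-codes these offsets, so a caller built against one
// declaration cannot safely talk to a callee built against the other.
constexpr bool layouts_disagree() {
  return offsetof(Original, value) != offsetof(Reordered, value);
}
```

On a typical 64-bit platform `sizeof(Original)` is 24 and `sizeof(Reordered)` is 16, but those exact values are an assumption about the target, not something the language guarantees.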

When a C++ library is compiled from source together with an application that relies on it, the ABI is naturally consistent because all the functions in the library and the application share the same compilation environment. Things start to get challenging when the library and the application are compiled separately and then linked together.

Here is a non-exhaustive list of things that might affect ABI (from the C++ perspective):

While some of those constraints cannot be worked around (such as CPU architecture), some libraries use special techniques to overcome the rest, so that they can be linked with almost anything else in most scenarios. This property is called "ABI stability".

It's not always easy to automatically detect (e.g. via a compilation error) or even observe an ABI break. ABI issues might cause linking problems, immediate crashes, crashes that happen only in production (at scale), or subtly different library behavior that makes the application produce incorrect results.
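The "incorrect results without any error" case can be simulated in a single program. The sketch below is an illustration under stated assumptions: `PointV1`, `PointV2`, and both functions are hypothetical, and `memcpy` stands in for application code that was compiled against the old header and never rebuilt.

```cpp
#include <cassert>
#include <cstring>

// Layout the application was compiled against (hypothetical library v1).
struct PointV1 {
  int x;
  int y;
};

// Layout used by the upgraded library (hypothetical v2): a field was
// inserted at the front, so `x` and `y` moved.
struct PointV2 {
  int id;
  int x;
  int y;
};

// Stands in for the new library producing a point.
inline PointV2 library_make_point() {
  PointV2 p;
  p.id = 7;
  p.x = 10;
  p.y = 20;
  return p;
}

// Stands in for application code that was never recompiled: it still
// interprets the leading bytes of the object using the old layout.
// memcpy keeps the simulation itself well-defined.
inline PointV1 stale_client_view(const PointV2& fresh) {
  static_assert(sizeof(PointV1) <= sizeof(PointV2), "copy stays in bounds");
  PointV1 stale;
  std::memcpy(&stale, &fresh, sizeof(stale));
  return stale;
}

// Nothing crashes and nothing fails to link; the values are simply wrong.
inline void demonstrate_silent_misread() {
  const PointV1 seen = stale_client_view(library_make_point());
  assert(seen.x == 7);   // the client wanted 10 but got the id
  assert(seen.y == 10);  // the client wanted 20 but got x
}
```

The expected values assume there is no padding between `int` members, which holds on mainstream ABIs.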

When ABI stability matters for libraries

If one of the following scenarios looks plausible for your library, you should consider investing in ABI stability.

  1. New versions of your library have to be linkable with other apps originally compiled with older versions of your library. In some cases, it's inconvenient to rebuild the whole app just to test it with a new version of a library. It can take several hours to recompile everything, compared with less than a minute to compile only the changed parts. Slowing down the developer iteration cycle can be disruptive to some projects, especially when the app needs to be recompiled regularly with many versions of the library (e.g. for different platforms).
  2. Your library needs to be distributed (and updated) independently from apps based on it. In some cases, it's just impossible or too dangerous to distribute a library with all apps that use it on a particular machine. That mostly happens for two reasons:
    1. It would be too wasteful to deliver, store, load and initialize almost the same low-level library (e.g. networking, SSL, image decoding) for all apps that use it on a machine. (Imagine if every iOS app had the on-screen keyboard built into its binary.)
    2. It would make it impossible to apply security patches to such libraries. Fixing something like Heartbleed would require updating every app on the platform, which would be a nightmare.
  3. Your library must not impose language-specific limitations on other apps using your library. A library might introduce some non-obvious limitations on the code of the apps that use it. For example, if your library exposes interfaces that rely on a more modern version of C++ than the application uses, the application might not even compile. In the opposite case, if an app uses a more modern version of the language than the library, it also might not compile (but that is rare).
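The third scenario can be sketched with a header that keeps its baseline entry point usable from older language versions and offers a newer-style overload only when the consumer's compiler supports it. The function name `mylib_count_spaces` is hypothetical, and this is one possible way to avoid imposing a language version, not the approach the document prescribes.

```cpp
#include <cassert>
#include <cstddef>

#if __cplusplus >= 201703L
#include <string_view>
#endif

// Baseline signature: a pointer plus a length, usable from any C++ version.
inline std::size_t mylib_count_spaces(const char* data, std::size_t size) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < size; ++i) {
    if (data[i] == ' ') {
      ++count;
    }
  }
  return count;
}

#if __cplusplus >= 201703L
// Convenience overload, visible only to consumers building as C++17 or newer.
inline std::size_t mylib_count_spaces(std::string_view text) {
  return mylib_count_spaces(text.data(), text.size());
}
#endif

// Usage that compiles the same way under old and new language versions.
inline void count_spaces_usage() {
  assert(mylib_count_spaces("a b c", 5) == 2);
}
```

If the header exposed only the `std::string_view` overload, an application built as C++14 could not include it at all. Note that MSVC reports an old `__cplusplus` value unless `/Zc:__cplusplus` is passed, so a real header would need a more careful check.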

Approaches

If your library would benefit from ABI stability, the next question is which concrete approach to use to achieve it.

Accidental ABI stability (or maintaining status-quo)

Most libraries were not designed to be ABI-stable from day one because initially there was no need for that. Over time, some library maintainers find themselves in a situation where they suddenly have to provide ABI stability. This might happen because of new use cases (and constraints), or just because consumers already rely on the stability, mistakenly assuming that those guarantees always existed.

It's a tough situation to be in. In practice, it means that from that point on only very small extensions to the existing API are allowed. Pretty much everything besides adding non-virtual methods will break the ABI to some extent. Sometimes even changing the implementation of some methods is not safe. Here are just some classes of changes which might break ABI:

All that boils down to the fact that supporting accidental ABI stability for a library is possible, but only if the library barely changes. Eventually, with a big enough consumer base, every single internal implementation detail will be relied on by some external code, which means it can no longer be changed. This observation is informally known as Hyrum's Law.
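As a small illustration of why adding a non-virtual method tends to be safe while most other changes are not, the sketch below compares object sizes across hypothetical versions of one class. The `Widget*` names are invented for this example, and the size comparisons reflect mainstream ABIs rather than a guarantee of the language.

```cpp
// Hypothetical class as shipped in version 1 of a library.
struct WidgetV1 {
  int width;
  int height;
  int area() const { return width * height; }
};

// Version 2a: only a non-virtual method was added. The object layout that
// already-compiled clients rely on is unchanged.
struct WidgetV2a {
  int width;
  int height;
  int area() const { return width * height; }
  int perimeter() const { return 2 * (width + height); }
};

// Version 2b: a data member was added. Every object grows, so clients that
// allocate or embed the old size are now wrong.
struct WidgetV2b {
  int width;
  int height;
  int depth;
  int area() const { return width * height; }
};

// Version 2c: a virtual method was added. Mainstream ABIs insert a hidden
// vtable pointer, which changes both the size and the member offsets.
struct WidgetV2c {
  int width;
  int height;
  virtual ~WidgetV2c() {}
  int area() const { return width * height; }
};

static_assert(sizeof(WidgetV2a) == sizeof(WidgetV1), "layout unchanged");
static_assert(sizeof(WidgetV2b) > sizeof(WidgetV1), "object grew");
static_assert(sizeof(WidgetV2c) > sizeof(WidgetV1), "hidden vptr added");
```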

Planned ABI stability

The most reliable and flexible way to provide ABI stability is to deliberately design the public API of a library to avoid all previously discussed pitfalls. All this condenses to a few main principles:

Concrete approaches for Planned ABI stability

Using plain C exclusively for public API

The first and simplest approach to achieving ABI stability is to formalize all public APIs as a set of plain C functions. The plain C ABI practically never changes language-wise (within the same platform), and it is the foundation of the ABIs of other languages and libraries. So it works.
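A minimal sketch of what such a plain C surface could look like, assuming a hypothetical `counter` library. Consumers see only an opaque handle and free functions, so the struct behind the handle can change between versions without affecting compiled clients.

```cpp
#include <cassert>
#include <new>

// ---- Public header (hypothetical "counter" library) ----
extern "C" {
// Opaque handle: consumers only ever hold a pointer, never the layout.
typedef struct counter counter;

counter* counter_create(void);           // returns a null pointer on failure
void counter_destroy(counter* c);        // accepts a null pointer
int counter_add(counter* c, int delta);  // returns the new total
}

// ---- Library implementation (free to change between versions) ----
struct counter {
  int total;
};

// Non-throwing allocation, so no C++ exception crosses the C boundary.
extern "C" counter* counter_create(void) {
  return new (std::nothrow) counter{0};
}

extern "C" void counter_destroy(counter* c) { delete c; }

extern "C" int counter_add(counter* c, int delta) {
  c->total += delta;
  return c->total;
}

// ---- Application code ----
inline void use_counter() {
  counter* c = counter_create();
  assert(c != nullptr);
  assert(counter_add(c, 2) == 2);
  assert(counter_add(c, 3) == 5);
  counter_destroy(c);
}
```

Only built-in types and pointers to opaque structs appear in the signatures; that choice is what keeps C++-specific layout and name-mangling decisions out of the binary interface in this sketch.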

To make the library ABI-stable these things need to be ensured:

ABI-stable API with idiomatic C++ wrappers

An obvious downside of the previous approach is that it does not use idiomatic C++. This leads to poor ergonomics and gives up the safety that modern C++ provides. In many cases, this problem can be mitigated by building header-only, compiled-away C++ abstractions on top of the plain C API (wrapping it back into C++). ABI safety is preserved because the ABI-unsafe code is not distributed in compiled form, so it does not change when the library is upgraded.
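The following sketch shows one way such a wrapper might look. The C layer (`counter_*`) and the `mylib::Counter` class are hypothetical; the C functions are defined inline here only so the example is self-contained, whereas in a real library they would live in the separately compiled binary.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <stdexcept>

// ---- Stand-in for the compiled, ABI-stable C layer (hypothetical) ----
extern "C" {
typedef struct counter counter;
counter* counter_create(void);
void counter_destroy(counter* c);
int counter_add(counter* c, int delta);
}

struct counter {
  int total;
};
extern "C" counter* counter_create(void) {
  return new (std::nothrow) counter{0};
}
extern "C" void counter_destroy(counter* c) { delete c; }
extern "C" int counter_add(counter* c, int delta) {
  c->total += delta;
  return c->total;
}

// ---- Header-only C++ wrapper, compiled as part of the application ----
namespace mylib {

class Counter {
 public:
  Counter() : handle_(counter_create()) {
    if (!handle_) {
      // Thrown and caught inside application-compiled code only.
      throw std::runtime_error("counter_create failed");
    }
  }

  // Returns the new total.
  int add(int delta) { return counter_add(handle_.get(), delta); }

 private:
  struct Deleter {
    void operator()(counter* c) const { counter_destroy(c); }
  };
  // RAII ownership of the C handle; never passed across the boundary itself.
  std::unique_ptr<counter, Deleter> handle_;
};

}  // namespace mylib

inline void use_wrapper() {
  mylib::Counter c;
  assert(c.add(2) == 2);
  assert(c.add(3) == 5);
}  // counter_destroy runs automatically here
```

The `std::unique_ptr` and the exception exist only on the application side of the boundary, so their layout and behavior are decided by the application's own compiler and standard library.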

This model has a few caveats though:

Using a subset of C++ to build dynamic invocation and reference counting interfaces

In some cases, it's reasonable to build very dynamic interfaces that naturally support backward-compatible changes. In those cases, it's safe to use some basic C++ features that never change ABI-wise. In this model, every new version of an interface is a completely new interface with a unique id (e.g. a GUID), which has to be queried from the base interface (e.g. IUnknown) before it can be used; this is also known as dynamic conformance checking. The most popular example of this approach is Microsoft's COM with its IUnknown interface.
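A heavily simplified sketch of this model follows. It is not COM: the interface names, the string ids (real COM uses 128-bit GUIDs), and the `make_square` factory are all assumptions made for illustration. It only shows the shape of the idea, which is pure virtual interfaces, interface lookup by id, and reference counting instead of `delete`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical interface ids; real COM uses 128-bit GUIDs.
static const char kIdBase[] = "IBase";
static const char kIdShape[] = "IShape.v1";
static const char kIdShape2[] = "IShape.v2";

// Root interface: pure virtual methods only, no data members, and a method
// list that is never changed once published.
struct IBase {
  virtual void* query(const char* id) = 0;  // nullptr if unsupported
  virtual void retain() = 0;
  virtual void release() = 0;
};

struct IShape : IBase {
  virtual int area() = 0;
};

// A new capability becomes a new interface with a new id; IShape is untouched.
struct IShape2 : IShape {
  virtual int perimeter() = 0;
};

class Square final : public IShape2 {
 public:
  explicit Square(int side) : side_(side), refs_(1) {}

  void* query(const char* id) override {
    void* result = nullptr;
    if (std::strcmp(id, kIdBase) == 0) {
      result = static_cast<IBase*>(this);
    } else if (std::strcmp(id, kIdShape) == 0) {
      result = static_cast<IShape*>(this);
    } else if (std::strcmp(id, kIdShape2) == 0) {
      result = static_cast<IShape2*>(this);
    }
    if (result != nullptr) {
      retain();  // the caller now owns one more reference
    }
    return result;
  }
  void retain() override { ++refs_; }
  void release() override {
    if (--refs_ == 0) {
      delete this;
    }
  }
  int area() override { return side_ * side_; }
  int perimeter() override { return 4 * side_; }

 private:
  int side_;
  int refs_;  // not thread-safe; a real implementation would use an atomic
};

// Stand-in for a factory function exported by the library.
inline IBase* make_square(int side) { return new Square(side); }

inline void use_shape() {
  IBase* base = make_square(3);
  // Newer clients ask for IShape2; older clients keep asking for IShape.
  IShape2* shape2 = static_cast<IShape2*>(base->query(kIdShape2));
  assert(shape2 != nullptr);
  assert(shape2->area() == 9);
  assert(shape2->perimeter() == 12);
  assert(base->query("IUnknownThing") == nullptr);
  shape2->release();
  base->release();  // last reference: the object deletes itself
}
```

A client built before `IShape2` existed never asks for its id, so adding the new interface does not disturb it; a newer client that gets a null pointer back from `query` knows it is talking to an older library.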

This model is also ideologically similar to Objective-C, which also relies heavily on message passing and dynamic interface querying. Objective-C does not provide ABI safety out of the box, but ABI issues in the Objective-C world are rare and easy to work around.

Trade-offs

Building and maintaining an ABI-stable interface for a library is a challenging task requiring specific expertise and additional time. It's an extremely expensive effort. Therefore, any team considering it has to weigh all the trade-offs before deciding to invest in it.

A team will need to balance the benefits of ABI stability (which differ from project to project) against its downsides.

It may be a good idea for a team to spend some time keeping a small, well-scoped part of the interface ABI-stable to see how feasible and expensive that is.

Conclusion

Supporting ABI stability gives customers a lot of flexibility but comes at a huge cost to the library developers. In some cases, however, libraries simply must provide it. If your library needs to be ABI-stable, embrace the importance of it, accept the price and time commitment, and go for it! If you are not sure, you probably don't need it.


shergin commented 4 years ago

cc @tudorms, @vmoroz