Copyright (C) 2013 Canonical Ltd

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License version 3 as
published by the Free Software Foundation.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program. If not, see http://www.gnu.org/licenses/.

Authored by: Michi Henning <michi.henning@canonical.com>

This is an explanation of the layout of the source tree, and the conventions used by this library. Please read this before making changes to the source!

For instructions on how to build the code, see the INSTALL file.

Build targets

The build produces the Unity scopes library (libunity-scopes).

TODO: Flesh this out

Source tree layout

At the top level, we have src, include, and test, which contain what they suggest.

Underneath src and include, we have subdirectories that correspond to namespaces, that is, if we have src/internal, that automatically means that there is a namespace unity::scopes::internal. This one-to-one correspondence makes it easier to work out what is defined where.

Namespaces

The library maintains a compilation firewall, strictly separating external API and internal implementation. This reduces the chance of breaking the API, and it makes it clear what parts of the API are for public consumption.

Anything that is not public is implemented in a namespace called "internal".

TODO: More explanation of the namespaces used and for what.

Header file conventions

All header files are underneath include/scopes. Source code always includes headers using the full pathname, for example:

#include <scopes/ScopeBase.h>

All source code uses angle brackets (<...>) for #include directives. Double quotes ("...") are never used because the lookup semantics can be surprising if there are any headers with the same name in the tree. (Not that this should happen, but it's better to be safe. If there are no duplicate names, inclusion with angle brackets behaves the same way as inclusion with double quotes.)

All headers that are for public consumption appear in include/scopes/* (provided the path does not include "internal").

Public header directories contain a private header directory called "internal". This directory contains all the headers that are private and specific to the implementation.

No public header file is allowed to include any header from one of the internal directories. (Doing so would break the compilation firewall and also prevent API clients from compiling because the internal headers are not installed when unity is deployed. This is enforced by the tests.)

All header files, whether public or internal, compile stand-alone, that is, no header relies on another header being included first in order to compile. (This is enforced by the tests.)

Compilation firewall

Public classes use the pimpl (pointer-to-implementation) idiom, also known as "envelope-letter idiom", "Cheshire Cat", or "Compiler firewall idiom". This makes it less likely that an update to the code will break the ABI.

Many classes that are part of the public API contain only one private data member, namely the pimpl. No other data members (public, protected, or private) are permitted.

For public classes with shallow-copy semantics (classes that are immutable after instantiation), the code uses std::shared_ptr as the pimpl type because std::shared_ptr provides the correct semantics. (See unity::Exception and its derived classes for an example.)

For classes that are non-copyable, std::unique_ptr is used as the pimpl type.
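
The non-copyable case can be sketched as follows. This is a minimal illustration, not code from the library: Widget and WidgetImpl are hypothetical names, and the public part and the implementation part are shown in one block only for brevity (in the tree they would be split between a public header and a source file).

```cpp
#include <memory>
#include <string>

// Hypothetical names throughout; not taken from the library.
namespace internal
{
class WidgetImpl;   // Forward declaration only: the public header never sees the definition.
}

class Widget final
{
public:
    explicit Widget(std::string const& name);
    ~Widget();                                   // Defined where WidgetImpl is complete.
    Widget(Widget const&) = delete;              // Non-copyable.
    Widget& operator=(Widget const&) = delete;

    std::string name() const;

private:
    std::unique_ptr<internal::WidgetImpl> p_;    // The only data member.
};

// --- Everything below would live in the .cpp file. ---
namespace internal
{
class WidgetImpl
{
public:
    explicit WidgetImpl(std::string const& name) : name_(name) {}
    std::string name() const { return name_; }

private:
    std::string name_;
};
}

Widget::Widget(std::string const& name) : p_(new internal::WidgetImpl(name)) {}
Widget::~Widget() = default;
std::string Widget::name() const { return p_->name(); }
```

Because the destructor is defined only after WidgetImpl is complete, the public class can hold a unique_ptr to a type that clients never see.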

If classes form derivation hierarchies, by convention, the pimpl is a private data member of the root base class. Derived classes can access the pimpl by calling the protected pimpl() method of the root base class. This avoids redundantly storing a separate pimpl to the derived type in each derived class. Instead, polymorphic dispatch to virtual methods in the derived classes is achieved by using a dynamic_cast to the derived type to forward an invocation to the corresponding virtual method in the derived implementation class.
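
A rough sketch of that convention, under the assumption of a two-level hierarchy. Shape, Circle, ShapeImpl, and CircleImpl are hypothetical names chosen for illustration; only the pimpl() accessor name comes from the text above.

```cpp
#include <memory>
#include <string>

// Hypothetical names; a sketch of the convention, not the library's code.
namespace internal
{
class ShapeImpl
{
public:
    virtual ~ShapeImpl() = default;
    virtual std::string name() const { return "shape"; }
};

class CircleImpl : public ShapeImpl
{
public:
    explicit CircleImpl(double r) : r_(r) {}
    std::string name() const override { return "circle"; }
    double radius() const { return r_; }

private:
    double r_;
};
}

class Shape
{
public:
    virtual ~Shape() = default;
    std::string name() const { return p_->name(); }   // Virtual dispatch happens in the impl.

protected:
    explicit Shape(internal::ShapeImpl* impl) : p_(impl) {}
    internal::ShapeImpl* pimpl() const noexcept { return p_.get(); }

private:
    std::unique_ptr<internal::ShapeImpl> p_;           // Stored once, in the root class.
};

class Circle : public Shape
{
public:
    explicit Circle(double r) : Shape(new internal::CircleImpl(r)) {}

    double radius() const
    {
        // Recover the derived implementation from the root's pimpl.
        return dynamic_cast<internal::CircleImpl&>(*pimpl()).radius();
    }
};
```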

Error handling

No error codes, period. All errors are reported via exceptions. All exceptions thrown by the library are derived from unity::Exception (which is abstract). unity::Exception provides a to_string() method. This method returns the exception history and prints all exceptions that were raised along a particular throw path, even if a low-level exception was caught and translated into a higher level exception. This works for exceptions derived from std::nested_exception. (The exception chaining ends when it encounters an exception that does not derive from std::nested_exception.)

If API clients intercept unity exceptions and rethrow their own exceptions, it is recommended that API clients derive their exceptions from unity::Exception or, alternatively, derive them from std::nested_exception and implement a similar to_string() operation that, if it encounters a unity::Exception while following a chain, calls the to_string() method on unity::Exception. That way, the entire exception history will be returned from to_string().
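
For API clients that take the std::nested_exception route, the chain walk might look roughly like this. The history() helper and the two throwing functions are hypothetical and use only the standard library; this is not the implementation of unity::Exception::to_string().

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: walks a chain of nested exceptions and collects
// one indented line per level.
inline std::string history(std::exception const& e, std::size_t depth = 0)
{
    std::string s = std::string(depth * 4, ' ') + e.what() + "\n";
    try
    {
        std::rethrow_if_nested(e);   // Does nothing if e carries no nested exception.
    }
    catch (std::exception const& inner)
    {
        s += history(inner, depth + 1);
    }
    catch (...)
    {
        s += std::string((depth + 1) * 4, ' ') + "unknown exception\n";
    }
    return s;
}

inline void low_level()
{
    throw std::runtime_error("cannot open file");
}

inline void high_level()
{
    try
    {
        low_level();
    }
    catch (...)
    {
        // Translate into a higher-level exception while keeping the original.
        std::throw_with_nested(std::runtime_error("cannot load configuration"));
    }
}
```

Calling history() on the exception thrown by high_level() yields the higher-level message first, followed by the indented low-level message.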

Functions that throw exceptions should, if at all possible, provide the strong exception guarantee. Otherwise, they must provide the basic (weak) exception guarantee. If it is impossible to maintain even the basic guarantee, the code must call abort() instead of throwing an exception.
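
One common way to obtain the strong guarantee is to do all the work on a copy and commit with a non-throwing swap. The sketch below assumes a hypothetical NameList class; it illustrates the technique and is not taken from the library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical class: add_all() either adds every name or leaves the
// object exactly as it was (strong exception guarantee).
class NameList
{
public:
    void add_all(std::vector<std::string> const& names)
    {
        std::vector<std::string> tmp(names_);   // Work on a copy.
        for (auto const& n : names)
        {
            if (n.empty())
            {
                throw std::invalid_argument("empty name");   // names_ is untouched.
            }
            tmp.push_back(n);
        }
        names_.swap(tmp);                        // Commit; swap does not throw.
    }

    std::size_t size() const noexcept { return names_.size(); }

private:
    std::vector<std::string> names_;
};
```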

Resource management

The code uses the RAII (resource acquisition is initialization) idiom extensively. If you find yourself writing free, delete, Xfree, XCloseResource, or any other kind of clean-up function, your code has a problem. Instead of explicitly cleaning up in destructors, immediately assign the resource to a unique_ptr or shared_ptr with a custom deleter.

This guarantees that the resource will be released without having to remember anything. In particular, it guarantees that the resource will be released even if allocated in a constructor near the beginning and something called by the constructor throws before the constructor completes.
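
As an illustration of the constructor case, here is a minimal sketch using a C stdio stream. LogFile, FileCloser, and FilePtr are hypothetical names; the point is that the stream is handed to a unique_ptr with a custom deleter as soon as it is opened.

```cpp
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Deleter that closes a C stdio stream.
struct FileCloser
{
    void operator()(std::FILE* f) const noexcept { std::fclose(f); }
};

using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical class: the stream is owned by a smart pointer from the moment
// it is opened, so it is closed even if the constructor throws later on.
class LogFile
{
public:
    explicit LogFile(std::string const& path)
        : file_(std::fopen(path.c_str(), "w"))
    {
        if (!file_)
        {
            throw std::runtime_error("cannot open " + path);
        }
        if (std::fputs("log started\n", file_.get()) == EOF)
        {
            // file_ is already constructed, so its deleter closes the stream.
            throw std::runtime_error("cannot write " + path);
        }
    }
    // No user-written destructor is needed.

private:
    FilePtr file_;
};
```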

For resources that cannot be managed by unique_ptr or shared_ptr (because the allocator does not return a pointer or returns a pointer to an opaque type), use the RAII helper class in ResourcePtr.h. It does much the same thing as unique_ptr, but is more permissive in the types it can manage.
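
To show the kind of problem such a helper solves, here is a hypothetical guard for a resource that is not a pointer (for example, an integer descriptor). ScopedResource is an invented name, and its interface is NOT the interface of ResourcePtr.h; consult that header for the real one.

```cpp
#include <utility>

// Hypothetical guard, not the interface of ResourcePtr.h: owns a resource of
// any copyable type R and releases it by calling the deleter D on it.
template<typename R, typename D>
class ScopedResource
{
public:
    ScopedResource(R r, D d) : r_(r), d_(std::move(d)), owned_(true) {}

    ~ScopedResource()
    {
        if (owned_)
        {
            d_(r_);
        }
    }

    ScopedResource(ScopedResource const&) = delete;
    ScopedResource& operator=(ScopedResource const&) = delete;

    // Moving transfers ownership; the source no longer releases anything.
    ScopedResource(ScopedResource&& o) : r_(o.r_), d_(std::move(o.d_)), owned_(o.owned_)
    {
        o.owned_ = false;
    }

    R get() const { return r_; }

private:
    R r_;
    D d_;
    bool owned_;
};
```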

Note the naming convention established by util/DefinesPtrs.h. All classes that return or accept smart pointers derive from DefinesPtrs, which is a code injection template that creates typedefs for Ptr and UPtr (shared_ptr and unique_ptr, respectively) as well as CPtr and UCPtr (shared_ptr and unique_ptr to constant instances).
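
The idea can be approximated as below. DefinesPtrsSketch and Session are hypothetical names used for illustration; the real template is in util/DefinesPtrs.h and may differ in detail.

```cpp
#include <memory>

// Hypothetical re-creation of the idea; see util/DefinesPtrs.h for the real template.
template<typename T>
class DefinesPtrsSketch
{
public:
    typedef std::shared_ptr<T> Ptr;
    typedef std::unique_ptr<T> UPtr;
    typedef std::shared_ptr<T const> CPtr;
    typedef std::unique_ptr<T const> UCPtr;

protected:
    DefinesPtrsSketch() = default;
    ~DefinesPtrsSketch() = default;   // Not meant to be deleted through this base.
};

// A class opts in by deriving from the template, passing itself as T.
class Session : public DefinesPtrsSketch<Session>
{
public:
    int id() const { return 42; }
};
```

Callers can then write Session::Ptr or Session::UCPtr instead of spelling out the smart pointer type each time.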

Ideally, classes are fully initialized by their constructor so it is impossible for a class to exist but not be in a usable state.

For some classes, it is unavoidable to provide a default constructor (for example, if we want to put instances into an array). It is also sometimes impossible to fully construct an instance immediately, for example if the instance is a member variable and the necessary initialization data is not available until some time afterwards.

In this case, the default constructor must initialize the class to a fail-safe state, that is, it must behave sanely in the face of methods being invoked on it. This means that calling a method on a default-constructed instance should throw a logic exception to indicate to the caller that the instance is not fully initialized yet.

Note that turning method calls on not-yet-initialized instances into no-ops is usually a bad idea: the caller thinks that everything worked fine when, in fact, it did nothing. If such no-op methods do something sensible (that is, they can do their job even on an incompletely initialized instance), this raises the question of why the instance wasn't default-constructible in the first place...

To sum it up: try to enforce complete initialization from the constructor wherever possible. If it is impossible to do that, follow the principle of least surprise for the caller if a method is called on a not-yet-initialized instance.
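
A minimal sketch of a fail-safe default-constructed class, assuming a hypothetical Connection type with a two-step init(). The names are invented; only the behavior (throwing a logic exception until initialization is complete) follows the text above.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical class: default-constructible, but unusable until init() is called.
class Connection
{
public:
    Connection() = default;
    explicit Connection(std::string const& endpoint) : endpoint_(endpoint), initialized_(true) {}

    void init(std::string const& endpoint)
    {
        endpoint_ = endpoint;
        initialized_ = true;
    }

    std::string endpoint() const
    {
        check_initialized();   // Fail loudly instead of silently doing nothing.
        return endpoint_;
    }

private:
    void check_initialized() const
    {
        if (!initialized_)
        {
            throw std::logic_error("Connection: instance not initialized");
        }
    }

    std::string endpoint_;
    bool initialized_ = false;
};
```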

Code structure

The code is arranged in three layers:

scopes (top level)

This level contains everything that is part of the public API. No source file in this level includes any header from the internal namespace (or below the internal namespace).

internal

This level contains the implementation logic for the scopes run time, independent of the particular middleware.

zmq_middleware

This level contains the middleware-specific parts of the code. No internal header file (except for MiddlewareFactory) includes any header in zmq_middleware.

Client invocation classes

The call chain for an outgoing invocation is like this (using Scope as an example):

ScopeProxy -> Scope -> ScopeImpl -> MWScopeProxy -> ZmqScope

Clients invoke remote operations via a proxy. The *Proxy types (such as ScopeProxy) are shared_ptrs to the corresponding proxy class (such as Scope). Each proxy class provides a method that corresponds to a remote operation in the server.

The proxy class stores a pimpl, which bridges the gap into the internal namespace. In the internal namespace, the actual proxy implementation is called *Impl, for example, ScopeImpl.

The *Impl classes contain any additional methods and data members that might be needed by the run time during remote method invocation. In effect, they provide an interception point where we can do things behind the scenes during an invocation (such as sending metrics and transparently forwarding query cancellation).

Each *Impl class provides access to the actual middleware-specific proxy as a shared_ptr to a base class. The accessor for this proxy is called fwd() and returns an instance of MW*Proxy, for example, MWScopeProxy, which is itself a shared_ptr. In other words, the MWScopeProxy pointer does for our implementation what ScopeProxy does for our API clients: it shields our implementation from the specifics of the middleware that is in use.

MWScope is an abstract base interface. It contains a pure virtual method for each remote operation. The signature of each method is the same as the signature of the corresponding method on the public proxy.

The concrete implementation of the MWScope interface is provided by ZmqScope.h. ZmqScope.h provides the client-side implementation of each remote operation. The job of the implementation is to convert the middleware-independent input parameters into their middleware-specific counterparts, invoke the operation, wait for any return values, and convert the middleware-specific return values into their middleware-independent counterparts. For Zmq, the proxy implementation also drives the capnproto marshaling and unmarshaling.
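
The layering of the outgoing call chain can be sketched with stand-in classes. Everything here is hypothetical (Greeter, GreeterImpl, MWGreeter, FakeWireGreeter); only fwd() is named in the text. The "middleware" below just returns a string where a real implementation would marshal, send, wait, and unmarshal.

```cpp
#include <memory>
#include <string>

// All names are hypothetical stand-ins for the layers described above.
namespace internal
{
// Middleware-independent interface (the role played by MWScope).
class MWGreeter
{
public:
    virtual ~MWGreeter() = default;
    virtual std::string greet(std::string const& name) = 0;
};
typedef std::shared_ptr<MWGreeter> MWGreeterProxy;

// Middleware-specific implementation (the role played by ZmqScope).
class FakeWireGreeter : public MWGreeter
{
public:
    std::string greet(std::string const& name) override { return "hello " + name; }
};

// Interception point (the role played by ScopeImpl).
class GreeterImpl
{
public:
    explicit GreeterImpl(MWGreeterProxy const& mw) : mw_(mw) {}

    MWGreeterProxy fwd() const { return mw_; }

    std::string greet(std::string const& name)
    {
        ++calls_;                        // Behind-the-scenes work, for example metrics.
        return fwd()->greet(name);
    }

private:
    MWGreeterProxy mw_;
    int calls_ = 0;
};
}

// Public proxy class (the role played by Scope); it holds only the pimpl.
class Greeter
{
public:
    explicit Greeter(internal::MWGreeterProxy const& mw) : p_(new internal::GreeterImpl(mw)) {}
    std::string greet(std::string const& name) { return p_->greet(name); }

private:
    std::unique_ptr<internal::GreeterImpl> p_;
};
typedef std::shared_ptr<Greeter> GreeterProxy;   // The role played by ScopeProxy.
```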

Server dispatch classes

The call chain for an incoming invocation is like this (using Scope as an example):

ObjectAdapter -> ScopeI -> ScopeObject

On the server side, when a method invocation arrives over the wire, it ends up in an instance of ObjectAdapter. Each network endpoint has one adapter. The job of the adapter is to listen for incoming requests, map each request to the correct C++ target object, and invoke a callback that corresponds to the operation on that target object.

The endpoint to which a request is sent by a client implicitly identifies which adapter handles the incoming request. Each request that arrives over the wire carries an object identity and an operation name. The adapter unmarshals this information and then looks for a C++ object (called a "servant") that was registered for that identity with the middleware to receive requests of a particular type. Each adapter can have multiple C++ servants, each of which handles requests for a particular identity. The different servants can implement objects of different types.

Each adapter contains a table that maps the incoming identity to the corresponding servant. The key to the table is the identity, and the lookup value is a shared_ptr to the servant. The table is populated by the add_*_object() methods on MiddlewareBase.

If the adapter cannot find a servant with an identity that matches the request, it marshals an exception back to the client. Otherwise, it dispatches the request.
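
The lookup step might be sketched as follows. Adapter, Servant, EchoServant, and the ObjectNotExist class below are simplified, hypothetical stand-ins; in particular, a real adapter marshals the error back to the client rather than throwing locally.

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; the real adapter and servant types differ.
class Servant
{
public:
    virtual ~Servant() = default;
    virtual std::string dispatch(std::string const& operation) = 0;
};

class EchoServant : public Servant
{
public:
    std::string dispatch(std::string const& operation) override { return "echo:" + operation; }
};

class ObjectNotExist : public std::runtime_error
{
public:
    explicit ObjectNotExist(std::string const& id)
        : std::runtime_error("no servant for identity: " + id)
    {
    }
};

class Adapter
{
public:
    void add(std::string const& identity, std::shared_ptr<Servant> const& servant)
    {
        servants_[identity] = servant;
    }

    // Called for each incoming request once identity and operation name are known.
    std::string handle(std::string const& identity, std::string const& operation)
    {
        auto it = servants_.find(identity);
        if (it == servants_.end())
        {
            throw ObjectNotExist(identity);   // A real adapter marshals this back to the client.
        }
        return it->second->dispatch(operation);
    }

private:
    std::map<std::string, std::shared_ptr<Servant>> servants_;   // identity -> servant
};
```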

The concrete servants are of type *I, for example, ScopeI. (The "I" is short for Implementation.) The ScopeI class is the middleware-specific server-side equivalent of the ZmqScope client-side proxy. All servants derive from a base ServantBase. ServantBase provides a generic dispatch() method. The job of dispatch() is to check whether the servant can actually handle an incoming operation. For example, if the incoming operation name is "foo", but the servant does not provide a method called "foo", it sends an OperationNotExistException back to the client. Otherwise, dispatch() calls the corresponding method on the servant and waits for the method to complete. The invocation to the servant is made in a try-catch block; if the servant throws an exception, dispatch() takes care of returning an exception to the client.

The concrete servant method called by dispatch() (such as ScopeI::search) now unmarshals any in-parameters, translates them to the middleware-independent equivalents, forwards the invocation to the middleware-independent servant, and translates and marshals any return values or exceptions.

Each servant class stores a pointer to a delegate. The delegate implementation lives in the internal namespace and does not know by which middleware it is called. The delegates are middleware-independent servants, called *Object, for example, ScopeObject. Each *Object class implements a method for the middleware-specific servant to call, with parameters that correspond to the operation that was invoked by the client. In other words, the *Object classes implement the server-side behavior of the operations that clients call, and the servant classes forward an incoming request to their corresponding *Object instance. This allows the middleware-independent part of the run time to implement operations without having to include any middleware-specific header files.
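
The split between generic dispatch, middleware-specific servant, and middleware-independent delegate can be sketched like this. All class names are hypothetical, and "wire" data is modeled as plain strings instead of capnproto messages; error replies are returned as strings where the real code would marshal an exception.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Middleware-independent delegate (the role played by ScopeObject).
class CounterObject
{
public:
    int add(int n)
    {
        if (n < 0)
        {
            throw std::invalid_argument("negative increment");
        }
        total_ += n;
        return total_;
    }

private:
    int total_ = 0;
};

// Generic base (the role played by ServantBase).
class ServantBaseSketch
{
public:
    ServantBaseSketch() = default;
    ServantBaseSketch(ServantBaseSketch const&) = delete;
    ServantBaseSketch& operator=(ServantBaseSketch const&) = delete;
    virtual ~ServantBaseSketch() = default;

    std::string dispatch(std::string const& op, std::string const& in_params)
    {
        auto it = ops_.find(op);
        if (it == ops_.end())
        {
            return "OperationNotExist: " + op;
        }
        try
        {
            return "ok: " + it->second(in_params);
        }
        catch (std::exception const& e)
        {
            return "exception: " + std::string(e.what());
        }
    }

protected:
    typedef std::function<std::string(std::string const&)> Handler;
    void register_op(std::string const& name, Handler h) { ops_[name] = std::move(h); }

private:
    std::map<std::string, Handler> ops_;
};

// Middleware-specific servant (the role played by ScopeI).
class CounterI : public ServantBaseSketch
{
public:
    explicit CounterI(std::shared_ptr<CounterObject> const& delegate) : delegate_(delegate)
    {
        register_op("add", [this](std::string const& in) { return add_(in); });
    }

private:
    std::string add_(std::string const& in)
    {
        int n = std::stoi(in);                       // "Unmarshal" the in-parameter.
        return std::to_string(delegate_->add(n));    // Forward, then "marshal" the result.
    }

    std::shared_ptr<CounterObject> delegate_;
};
```

Note that CounterObject includes nothing from the servant side, which is the property the delegate arrangement is meant to preserve.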

If you think of ScopeProxy as a pointer that can point at an object in a remote address space, then ScopeObject is the corresponding server-side target that implements the methods that the client can invoke on the proxy.

Capnproto definitions

The capnproto definitions live in zmq_middleware/capnproto and define what is marshaled over the wire. (As far as zmq is concerned, things that go over the wire are just opaque messages, that is, blobs of bytes. Zmq knows about message boundaries, but knows nothing about what's inside them.)

Message.capnp defines the message headers.

A request contains the request mode (oneway or twoway), the identity of the servant, and the operation name, followed by the in-parameters as a blob (a capnproto Object).

A response (sent only for twoway invocations) contains a status that indicates success, run-time exception, or user exception, plus any return values as a blob.

Success means the invocation was successful, and the return values are inside the payload blob. If the invocation failed, it can be for a variety of reasons, some unexpected, some not. Unexpected reasons are things such as general failures during dispatch, failure to locate the identity for a servant or the operation within the servant, and any unexpected exceptions that the operation implementation might throw. Unexpected errors are indicated as a runtimeException. At the moment, there are two special runtime exceptions (ObjectNotExist and OperationNotExist), which indicate failure to find a servant or the correct operation. For all other error conditions (such as an operation implementation throwing 42), the client gets an UnknownUserException with an error string.

Expected exceptions are exceptions that an operation is expected to produce. (Think of Java exception specifications.) For example, NotFoundException can be thrown by the Registry::get_metadata() method. To deal with these, the servant implementation (RegistryI) calls get_metadata() on the delegate in a try-catch block that handles NotFoundException separately. If that exception is thrown, RegistryI::get_metadata() populates the payload that is marshaled with the response with the exception details and sets the error status in the response accordingly.
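
To make the header contents concrete, here is a plain C++ approximation of the request and response headers and of how a client might react to the status. The struct layouts, field names, and the unwrap() helper are hypothetical stand-ins; the authoritative definitions are in Message.capnp, and the real code uses capnproto-generated types and library exception classes.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical plain-C++ mirrors of the message headers described above.
struct Request
{
    enum Mode { oneway, twoway };
    Mode mode;
    std::string identity;     // Identity of the target servant.
    std::string operation;    // Operation name.
    std::string in_params;    // Opaque blob.
};

struct Response
{
    enum Status { success, runtime_exception, user_exception };
    Status status;
    std::string payload;      // Return values or exception details, as a blob.
};

// Client-side handling of a response: hand back the payload on success,
// otherwise turn the status into a C++ exception.
inline std::string unwrap(Response const& r)
{
    switch (r.status)
    {
        case Response::success:
            return r.payload;
        case Response::runtime_exception:
            throw std::runtime_error("run-time exception: " + r.payload);
        case Response::user_exception:
            throw std::runtime_error("user exception: " + r.payload);
    }
    throw std::logic_error("invalid status");
}
```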

The remaining capnp definitions, such as Scope.capnp, each define the operations and parameters for the corresponding interface. For each operation, there is a Request definition that lists all the in-parameters and, for twoway operations, a Response definition that contains a union of the return values and exceptions that the operation can raise. (Oneway operations do not marshal anything in the return direction and so do not have any Response definitions.)

Loose ends

Things that need fixing or checking are marked with a TODO comment in the code.

Things that are work-arounds for bugs in the compiler or libraries are marked with a BUGFIX comment.

Things that are dubious in some way with no better solution available are marked with a HACK comment.

Style

Consider running astyle over the code before checking it in. See astyle-config for more details.

Test suite

The test suite lives underneath the test directory.

test/headers contains tests relating to header file integrity.

test/gtest contains the C++ tests (which use Google test).

The Google gtest authors are adamant that it is a bad idea to just link with a pre-installed version of libgtest.a. Therefore, libgtest.a (and libgtest_main.a) are created as part of the build and can be found in test/gtest/libgtest.

See the INSTALL file for how to run the tests, particularly the caveat about "make test" not rebuilding the tests!

Building and installing the code

See the INSTALL file.

Development hints and tricks

See the HACKING file.
