ARMmbed / core-util

DEPRECATED: Mbed 3 utilities library

Lambdas #44

Open bremoran opened 8 years ago

bremoran commented 8 years ago

This PR enables the use of functors in general and lambdas in particular. It also reduces function call overhead and builds FunctionPointer in a more idiomatic C++ way: it uses C++ vtables instead of membercaller and the ops structure, which are effectively vtables.

Note that a FunctionPointer to a large functor will induce a malloc. Further copies of said function pointer will cause additional mallocs.

Reducing function call overhead has required some C++11 features: mostly rvalue references and std::forward.
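As a rough illustration of why rvalue references and std::forward matter here, the sketch below counts the copies made when an argument passes through an intermediate calling layer. All names (`Tracked`, `sink`, `call_by_value`, `call_forwarding`) are hypothetical and are not taken from the PR; it only shows the general technique.

```cpp
#include <cassert>
#include <utility>

// Hypothetical argument type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() {}
    Tracked(const Tracked &) { ++copies; }
    Tracked(Tracked &&) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// The final target takes its argument by value, like an ordinary function.
inline void sink(Tracked) {}

// Intermediate layer that passes by value: each hop duplicates the argument.
inline void call_by_value(void (*fn)(Tracked), Tracked arg) { fn(arg); }

// Intermediate layer that forwards: the argument travels by reference and is
// only materialised once, at the final call.
template <typename Arg>
void call_forwarding(void (*fn)(Tracked), Arg &&arg) {
    fn(std::forward<Arg>(arg));
}

// Number of copies made when an lvalue goes through the by-value layer.
inline int copies_by_value() {
    Tracked t;
    Tracked::copies = 0;
    call_by_value(sink, t);
    return Tracked::copies;
}

// Number of copies made when an lvalue goes through the forwarding layer.
inline int copies_forwarding() {
    Tracked t;
    Tracked::copies = 0;
    call_forwarding(sink, t);
    return Tracked::copies;
}

// Usage: the by-value layer copies twice, the forwarding layer once.
inline void example() {
    assert(copies_by_value() == 2);
    assert(copies_forwarding() == 1);
}
```

The single remaining copy in the forwarding case is the one an ordinary by-value function call would make anyway.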

This PR needs to be upgraded with variadic templates. To make that happen the FPFunctor storage needs to be replaced with a std::tuple.
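One possible shape for tuple-based argument storage is sketched below. `BoundCall` is a hypothetical name and this is not the FPFunctor code from the PR. The sketch also assumes C++14 for std::index_sequence; a strict C++11 build would need a hand-written equivalent.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical bound call: stores the target and one copy of each argument.
template <typename R, typename... Args>
class BoundCall {
public:
    BoundCall(R (*fn)(Args...), Args... args)
        : fn_(fn), args_(std::move(args)...) {}

    // Unpack the stored tuple into the target's parameter list.
    R operator()() const { return invoke(std::index_sequence_for<Args...>{}); }

private:
    template <std::size_t... I>
    R invoke(std::index_sequence<I...>) const {
        return fn_(std::get<I>(args_)...);
    }

    R (*fn_)(Args...);
    std::tuple<Args...> args_;  // the bound arguments live here
};

inline int add3(int a, int b, int c) { return a + b + c; }

// Usage: bind three arguments now, call later with none.
inline void example() {
    BoundCall<int, int, int, int> bound(add3, 1, 2, 3);
    assert(bound() == 6);
}
```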

This is full of template metaprogramming, such as std::enable_if and std::is_same. Some of these could be implemented in other ways if necessary, but this approach provides a low run-time overhead.
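To give a flavour of this kind of metaprogramming, the sketch below uses std::enable_if to pick, at compile time, between inline and heap storage for a functor, and std::is_same to recognise a plain function pointer. The names (`kInlineBytes`, `Storage`, `storage_for`, `is_plain_void_fn`) are hypothetical. The capacity is assumed from the `sizeof(void*) * 3` figure in the Internal Storage section; the PR's actual selection logic is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Assumed inline capacity (see the lead-in); not taken from the PR's code.
constexpr std::size_t kInlineBytes = sizeof(void *) * 3;

enum class Storage { Inline, Heap };

// Selected when the functor fits in the inline buffer.
template <typename F>
typename std::enable_if<(sizeof(F) <= kInlineBytes), Storage>::type
storage_for(const F &) { return Storage::Inline; }

// Selected when the functor is too large and would need a heap allocation.
template <typename F>
typename std::enable_if<(sizeof(F) > kInlineBytes), Storage>::type
storage_for(const F &) { return Storage::Heap; }

// std::is_same used to detect a plain void() function at compile time.
template <typename F>
constexpr bool is_plain_void_fn() {
    return std::is_same<typename std::decay<F>::type, void (*)()>::value;
}

// Usage: a lambda with one captured reference fits inline; a large functor does not.
inline void example() {
    int counter = 0;
    auto small = [&counter]() { ++counter; };
    struct Large { char payload[sizeof(void *) * 4]; };
    assert(storage_for(small) == Storage::Inline);
    assert(storage_for(Large{}) == Storage::Heap);
    static_assert(is_plain_void_fn<void (*)()>(), "plain function pointer expected");
}
```

Because the choice is made by overload resolution, no run-time branch is needed to decide where a functor is stored.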

This implementation has been built to use a minimal number of stack frames. Only two are used for most function calls. Bound function pointers require an additional two stack frames (four in total). This penalty is associated with pass-by-value function pointer objects. It could easily be mitigated by using function pointer references to pool-allocated function pointer objects, which would replace one stack frame with an object dereference. I do not expect the additional stack frames to add significant runtime overhead; they simply make debugging less pleasant.

This implementation has been built to use a minimal number of copies of an object. It should only create a single duplicate of any pass-by-value object, which is the expected overhead for a normal function call. It creates two copies for bound function pointers. One copy is stored in the bound arguments; one copy is created when passed to the bound function.

cc @hugovincent @bogdanm @autopulated @0xc0170 @rgrover @niklas-arm @pan-

Background

The design goals of FunctionPointer are:

  1. pass by value
  2. Interrupt safety
  3. on-stack construction
  4. low storage space.

Pass by Value

In order to satisfy the pass-by-value requirement, function pointers must a) be fixed size, b) contain internal storage c) be a single common type, and d) handle the distinction between function pointer types internally, explicitly.

Fixed Size

mbed 2.0 and core-util < v2.0.0 handle this the same way: they contain internal storage large enough to install a static function pointer and a method pointer.

In this new version, the same approach is taken, except that this can be overridden via a template parameter and a larger function pointer can be installed.

Internal Storage

All implementations take the same approach here: All function pointers contain enough space to store an object pointer and a method pointer. This also covers use-cases of small functors in the new implementation (up to sizeof(void*) * 3).

Single Common Type

In mbed 2.0 and core-util < v2.0.0, this is accomplished by installing a pointer to a static function that actually invokes the function pointer, with the correct interpretation. This is a sort of vtable without actually being a vtable.
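The pseudo-vtable idea can be sketched roughly as follows, reduced to a single `int(int)` signature. `OldStyleFP` and its layout are hypothetical; the name `membercaller` is borrowed from the call chain listed later in this description, and `staticcaller` is assumed by analogy. This is not the actual mbed 2.0 or core-util code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical reduction of the older design: one concrete type, with a
// pointer to a static "caller" that knows how to interpret the storage.
class OldStyleFP {
public:
    // Attach a free or static function.
    explicit OldStyleFP(int (*fn)(int)) : object_(nullptr), caller_(&staticcaller) {
        std::memcpy(storage_, &fn, sizeof(fn));
    }

    // Attach an object and one of its methods.
    template <typename T>
    OldStyleFP(T *object, int (T::*method)(int))
        : object_(object), caller_(&membercaller<T>) {
        static_assert(sizeof(method) <= sizeof(storage_), "method pointer too large");
        std::memcpy(storage_, &method, sizeof(method));
    }

    // Frame 1: call() hands over to the installed caller (frame 2).
    int call(int arg) const { return caller_(*this, arg); }

private:
    static int staticcaller(const OldStyleFP &self, int arg) {
        int (*fn)(int);
        std::memcpy(&fn, self.storage_, sizeof(fn));
        return fn(arg);
    }

    template <typename T>
    static int membercaller(const OldStyleFP &self, int arg) {
        int (T::*method)(int);
        std::memcpy(&method, self.storage_, sizeof(method));
        return (static_cast<T *>(self.object_)->*method)(arg);
    }

    void *object_;
    int (*caller_)(const OldStyleFP &, int);  // the "pseudo-vtable" entry
    alignas(void *) unsigned char storage_[2 * sizeof(void *)];
};

inline int twice(int x) { return 2 * x; }
struct Adder {
    int base;
    int add(int x) { return base + x; }
};

// Usage: both kinds of target are held by the same concrete type.
inline void example() {
    OldStyleFP free_fp(twice);
    Adder adder = {10};
    OldStyleFP method_fp(&adder, &Adder::add);
    assert(free_fp.call(4) == 8);
    assert(method_fp.call(4) == 14);
}
```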

This approach introduces a calling overhead of two stack frames. Since arguments are passed by value through this calling apparatus, it also creates one more instance of every non-trivially constructible object. In the core-util < v2.0.0 implementation, it also creates an extra copy of each object, since objects are passed via a structure.

In core-util >= v2.0.0, this changes: FunctionPointer is a single common type, but it is more like a specialized allocator and container class than a function pointer class. Instead of using the pseudo-vtable, the new FunctionPointer constructs a class derived from FunctionPointerInterface into its internal storage. This way, it uses an explicit vtable.
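A minimal sketch of this container-like design, again fixed to an `int(int)` signature, might look like the following. `CallableInterface`, `FunctorImpl` and `InplaceFP` are hypothetical stand-ins, since the real FunctionPointerInterface is not shown in this thread. The storage size is arbitrary, and the extra `impl_` pointer is kept only to make the sketch easy to follow.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for FunctionPointerInterface.
struct CallableInterface {
    virtual int call(int arg) = 0;
    virtual CallableInterface *copy_to(void *storage) const = 0;
    virtual ~CallableInterface() {}
};

// One derived class per functor type; the compiler generates its vtable.
template <typename F>
struct FunctorImpl : CallableInterface {
    explicit FunctorImpl(const F &f) : f_(f) {}
    int call(int arg) override { return f_(arg); }
    CallableInterface *copy_to(void *storage) const override {
        return new (storage) FunctorImpl(f_);  // placement new, no heap
    }
    F f_;
};

// The single common type: an allocator and container for the derived object.
// Pass plain functions as pointers (&fn), not as function references.
class InplaceFP {
public:
    template <typename F>
    InplaceFP(const F &f) {
        static_assert(sizeof(FunctorImpl<F>) <= kStorageBytes,
                      "functor too large for inline storage");
        static_assert(alignof(FunctorImpl<F>) <= alignof(void *),
                      "functor over-aligned for inline storage");
        impl_ = new (storage_) FunctorImpl<F>(f);
    }
    InplaceFP(const InplaceFP &other) : impl_(other.impl_->copy_to(storage_)) {}
    InplaceFP &operator=(const InplaceFP &) = delete;  // omitted for brevity
    ~InplaceFP() { impl_->~CallableInterface(); }

    int call(int arg) { return impl_->call(arg); }  // dispatches through the vtable

private:
    static const std::size_t kStorageBytes = 4 * sizeof(void *);
    CallableInterface *impl_;
    alignas(void *) unsigned char storage_[kStorageBytes];
};

// Usage: a capturing lambda is constructed in place and copied by value.
inline void example() {
    int offset = 5;
    InplaceFP fp([offset](int x) { return x + offset; });
    InplaceFP copy(fp);
    assert(fp.call(1) == 6);
    assert(copy.call(2) == 7);
}
```

The compile-time size check plays the role of the fixed-size requirement: a functor that does not fit is rejected at build time in this sketch rather than spilling to the heap.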

The new FunctionPointer still suffers a 2-frame calling overhead, but since arguments are passed by universal reference, only a single additional instance is required. This extra instance is the same instance that would be generated in a normal function call.

Bound function pointers suffer an extra 2-frame overhead in both scenarios.

For the existing FunctionPointer implementation, the overhead is:

FunctionPointer::call()
membercaller()
ActualFunction()

For the new FunctionPointer implementation, the overhead is:

FunctionPointer::call()
FunctionPointerInterface::call()
ActualFunction()

Distinguish Between Function Pointer Types

In mbed 2.0 and core-util < v2.0.0, this is done using the pseudo-vtable. In core-util >= v2.0.0, this is done using a real vtable, through virtual functions inherited from FunctionPointerInterface.

Interrupt Safety

This is accomplished by ensuring that no heap is required. In mbed 2.0 and core-util < v2.0.0, this is done by using fixed internal storage which is always used in the same way.

In this PR, a new approach is introduced: in-place construction via placement new is used to allow conventional C++ construction of objects, but without the use of the heap.

There is a danger for bound function pointers, however. Since it is not possible to know whether an object has an interrupt-safe constructor, it is dangerous to bind non-trivial types from within an interrupt context.

On-Stack Construction

On-stack construction requires that a FunctionPointer be allocated entirely on the stack. The internal storage requirement for pass-by-value satisfies this requirement. On-stack construction permits FunctionPointers to be destroyed automatically when they go out of scope, without extra effort, and safely, in an interrupt context. The same caveat about non-trivial types in bound function pointers still applies.

Low Storage Space

The ultimate goal is that FunctionPointer and the bound version should use an absolute minimum of storage space. The current FunctionPointer implementation is 16 bytes. The new one is also 16 bytes. There was a prototype for an 8-byte function pointer, but the syntax for using it was too ugly.

bogdanm commented 8 years ago
bremoran commented 8 years ago

There are very minimal tests. It's not yet tested with armcc. This is the point at which it started working with gcc and clang.

The next steps are variadic templates, then armcc, then extensive testing.

bremoran commented 8 years ago

Maybe this isn't ready for a PR yet. I can continue development and file another PR later.

bogdanm commented 8 years ago

I admire your C++ craftsmanship, but I'm quite a bit worried about the complexity of this implementation. That said, I really don't know how to make it less complex ...

bremoran commented 8 years ago

I'm going to update the PR message with this comment. If the PR goes through, it will become a reference document.

Making FunctionPointer simpler

Many of the caveats of FunctionPointer can be removed if we relax some of the requirements:

  1. Pass by Value
  2. On-Stack Construction

If we relax these requirements we can remove the internal storage requirement. We can also remove the single common type requirement and replace it with a single common base requirement. Function pointers no longer need to be fixed-size. Likewise, they don't need to handle differences between function pointer types explicitly; instead, they can handle differences between types implicitly, via virtual functions inherited from the common base.

Simplicity

If we use references, many things become easier. Much of the apparently duplicate code will disappear. Much of the state management code will disappear (e.g. copy_to). Even bind becomes simpler and bound function pointers will require one less vtable.

Sizes

This is not premature optimization; it is simply a description of merit. I don't expect memory use to be a deciding factor, simply one piece of information which informs the bigger picture.

If we do relax these requirements, the natural result is a FunctionPointer reference object, which links to a reference-counted FunctionPointer. To be interrupt-constructible, the reference-counted FunctionPointer must be pool allocated. However, multiple pools of varying size can be created. This means that FunctionPointers to free or static functions can consume only 8 bytes instead of the current 16. Including a single reference object in this calculation, the reference variant of a free or static FunctionPointer takes only 12 bytes instead of 16. By contrast, the method pointer takes 20 bytes instead of 16.

Many C++ interfaces within mbed OS store a function pointer internally and schedule it at some later point. This causes the function pointer to be constructed both in the MINAR queue, and the host object. In these cases, each method pointer takes 4 bytes more most of the time, and 8 bytes less whenever a call is scheduled.

Call Overhead

Since calls to FunctionPointer references are handled via the v-table, there is no stack frame overhead to calling a FunctionPointer via reference. Instead, there is a single pointer dereference.

Construction Overhead

Construction incurs a pool allocation overhead. Copy construction incurs an assignment, and a reference increment. Assignment incurs a reference decrement, an assignment, and a reference increment.
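The reference-object idea and its construction costs can be sketched as below. `SharedFn`, `FnPool` and `FnRef` are hypothetical names. The pool is a trivial fixed array, the target is a plain function pointer rather than a vtable-based object, and nothing here is made interrupt-safe (real code would need critical sections or atomics around the counts). It is only meant to make the increment and decrement steps concrete.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference-counted target for a free function, int(int).
struct SharedFn {
    int (*fn)(int);
    unsigned refs;  // 0 means the slot is free
};

// Hypothetical fixed-size pool; construction costs one pool allocation.
class FnPool {
public:
    FnPool() : slots_() {}  // value-initialises every slot to {nullptr, 0}
    SharedFn *allocate(int (*fn)(int)) {
        for (std::size_t i = 0; i < kSlots; ++i) {
            if (slots_[i].refs == 0) {
                slots_[i].fn = fn;
                slots_[i].refs = 1;
                return &slots_[i];
            }
        }
        return nullptr;  // pool exhausted
    }

private:
    static const std::size_t kSlots = 4;
    SharedFn slots_[kSlots];
};

// The reference object: one pointer in size.
class FnRef {
public:
    explicit FnRef(SharedFn *target) : target_(target) {}  // adopts the initial count
    // Copy construction: an assignment and a reference increment.
    FnRef(const FnRef &other) : target_(other.target_) { retain(); }
    // Assignment: a decrement, an assignment, and an increment.
    FnRef &operator=(const FnRef &other) {
        if (this != &other) {
            release();
            target_ = other.target_;
            retain();
        }
        return *this;
    }
    ~FnRef() { release(); }

    int call(int arg) const { return target_->fn(arg); }  // one dereference
    unsigned use_count() const { return target_ ? target_->refs : 0; }

private:
    void retain() { if (target_) ++target_->refs; }
    void release() { if (target_ && --target_->refs == 0) target_->fn = nullptr; }
    SharedFn *target_;
};

inline int twice(int x) { return 2 * x; }

// Usage: copies share one pooled target; the slot frees itself at count zero.
inline void example() {
    FnPool pool;
    SharedFn *slot = pool.allocate(twice);
    assert(slot != nullptr);
    FnRef a(slot);
    assert(a.use_count() == 1);
    {
        FnRef b(a);
        assert(a.use_count() == 2);
        assert(b.call(21) == 42);
    }
    assert(a.use_count() == 1);
}
```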