Closed pavel-kirienko closed 5 years ago
Totals | |
---|---|
Change from base Build 194: | 0.0% |
Covered Lines: | 2652 |
Relevant Lines: | 2652 |
Totals | |
---|---|
Change from base Build 194: | 0.0% |
Covered Lines: | 2652 |
Relevant Lines: | 2652 |
Totals | |
---|---|
Change from base Build 194: | 0.0% |
Covered Lines: | 2657 |
Relevant Lines: | 2657 |
In Libuavcan, there is a huge chunk of heavily templated code responsible for data serialization, which upon monomorphization resolves into a series of invocations of low-level bit-level data copy routine: https://github.com/UAVCAN/libuavcan/blob/fd8ba19bc9c09c05a1ab60289b3e7158810e9bd0/libuavcan/src/marshal/uc_bit_array_copy.cpp#L12-L58. In Libcanard, there are basic functions
canardEncodePrimitive(..)
andcanardDecodePrimitive(..)
which serve a similar purpose, except that they lack any type safety because C.Bit-level copying is very slow. I don't have exact numbers at hand, but I'm considering to obtain them later once our new UAVCAN v1.0 implementations are available. Bit-level serialization is slow, but it is also generic, meaning that it is applicable always regardless of data alignment; yet, if one were to cast a very careful look at our standard definitions (and most of the known third-party definitions, e.g., ArduPilot, OlliW), they would see that majority of fields are always at least byte-aligned, meaning that slow bit-level copying could be avoided if we could determine such always-aligned fields statically.
Earlier we invented the concept of compile-time alignment checks, which was discussed at length here: https://forum.uavcan.org/t/new-dsdl-directive-for-simple-compile-time-checks/190. This same concept can be extended to facilitate static bit layout analysis in order to allow code generators to emit the fastest possible serialization code, resorting to slower methods only when faster ones cannot be proven to be safe. One can define the following arbitrary categories of serialization approaches:
memcpy()
applied to the whole data structure. This is applicable when the alignment, size, and byte order requirements of the native platform match those of UAVCAN. As you will see later, these properties are discoverable and checkable at the code generation time, provided that we can make reliable assumptions about the target platform (such as byte order and type alignment requirement, which is usuallysizeof(type)
; violation of such assumptions can be detected later at compile time so they are safe).memcpy()
applied to a given field. E.g., if we have auint64
field in a data structure, the byte order of the platform is little-endian, and the field is always aligned at the byte boundary, we can directly memcpy the field into the final buffer, again avoiding bit-level copying.The proof of alignment is obtained by PyDSDL by manipulating bit length sets of serialized representations of data types. A code generator can request PyDSDL to determine if there is such serialized representation of a composite or array type which would NOT meet a specified alignment goal (say, a byte (8 bit), or 64-bit if we're serializing a field of uint64), and then use the answer to choose the appropriate (fastest safe) serialization strategy at code generation time (there are some known performance issues: #23).
This new logic is exposed to the user via the following new API entries:
BitLengthSet
and its overloaded+
operator.CompositeType.iterate_fields_with_offsets(base_offset: BitLengthSet = None) -> Iterator[Tuple[Field, BitLengthSet]]
FixedLengthArrayType.enumerate_elements_with_offsets(base_offset: BitLengthSet = None) -> Iterator[Tuple[int, BitLengthSet]]
Early selection of the serialization strategy has an important implication on the serialization of nested data structures. A data structure can be nested into another one at an arbitrary alignment, which would defeat the purpose of static layout analysis since the code generator wouldn't be able to make any assumptions about the base offset. Additionally, the misaligned origin of a data structure does not necessarily imply that every following field of it will be misaligned as well. Consider the following example:
Due to the one-bit union tag preceding the actual value, both
a
andb
are misaligned. Now, imagine that the above union is nested into another structure as follows:The union is padded so that the one-bit tag brings the following data fields into alignment. The example demonstrates that in order to take full advantage of the layout analysis, a code generator must model nested object hierarchies holistically rather than atomically. The per-type generated serialization functions/methods may have some basic alignment requirement chosen by the author of the code generator (for example, they may require the serialization buffer to be always byte-aligned, or require that its alignment matches the largest alignment requirement of any nested type); the serialization code of an outer (containing) type would then determine statically whether the alignment requirement of a serialization function is met. If the alignment requirement is not met, the code generator would emit serialization code in-place, as if the definition of the included type were copy-pasted into the outer type, instead of invoking its serialization function.
You can see field iterators in action in this carefully crafted unit test: https://github.com/UAVCAN/pydsdl/blob/f998ad6f744b853d9b97e240ab0302df27ddd598/pydsdl/_serializable.py#L1284-L1383
...also in this Jinja2 code generation template for PyUAVCAN:
Where
alignment_prefix
is defined as:'aligned' if offset.is_aligned_at_byte() else 'unaligned'
.In the following example (sourced from PyUAVCAN as well) look at the case of
CompositeType
, where we select whether the current item matches the alignment requirement of the serialization method oft
(which is_serialize_aligned_(..)
). If there is a match, we invoke the method; otherwise, we emit serialization code in-place.One can see more examples in my PyUAVCAN repo (which is still a WIP of course): https://github.com/pavel-kirienko/pyuavcan/blob/uavcan-v1.0/pyuavcan/dsdl/_templates/serialization.j2, also https://github.com/pavel-kirienko/pyuavcan/blob/6f9234ab918beefce56ef3f58920773df29079e5/pyuavcan/dsdl/_serialized_representation/_serializer.py#L56-L187
I expect that this feature will allow us to greatly simplify implementations. Particularly, libuavcan may no longer need the rather complex primitive marshaling templates, since generated serialization methods can operate on raw byte pointers now.