OpenCyphal / pydsdl

Cyphal DSDL processing front end implemented in Python
https://opencyphal.org
MIT License
11 stars 9 forks source link

Static layout analysis hooks for near-zero-cost serialization/deserialization #24

Closed pavel-kirienko closed 5 years ago

pavel-kirienko commented 5 years ago

In Libuavcan, there is a huge chunk of heavily templated code responsible for data serialization, which upon monomorphization resolves into a series of invocations of low-level bit-level data copy routine: https://github.com/UAVCAN/libuavcan/blob/fd8ba19bc9c09c05a1ab60289b3e7158810e9bd0/libuavcan/src/marshal/uc_bit_array_copy.cpp#L12-L58. In Libcanard, there are basic functions canardEncodePrimitive(..) and canardDecodePrimitive(..) which serve a similar purpose, except that they lack any type safety because C.

Bit-level copying is very slow. I don't have exact numbers at hand, but I'm considering to obtain them later once our new UAVCAN v1.0 implementations are available. Bit-level serialization is slow, but it is also generic, meaning that it is applicable always regardless of data alignment; yet, if one were to cast a very careful look at our standard definitions (and most of the known third-party definitions, e.g., ArduPilot, OlliW), they would see that majority of fields are always at least byte-aligned, meaning that slow bit-level copying could be avoided if we could determine such always-aligned fields statically.

Earlier we invented the concept of compile-time alignment checks, which was discussed at length here: https://forum.uavcan.org/t/new-dsdl-directive-for-simple-compile-time-checks/190. This same concept can be extended to facilitate static bit layout analysis in order to allow code generators to emit the fastest possible serialization code, resorting to slower methods only when faster ones cannot be proven to be safe. One can define the following arbitrary categories of serialization approaches:

The proof of alignment is obtained by PyDSDL by manipulating bit length sets of serialized representations of data types. A code generator can request PyDSDL to determine if there is such serialized representation of a composite or array type which would NOT meet a specified alignment goal (say, a byte (8 bit), or 64-bit if we're serializing a field of uint64), and then use the answer to choose the appropriate (fastest safe) serialization strategy at code generation time (there are some known performance issues: #23).

This new logic is exposed to the user via the following new API entries:

Early selection of the serialization strategy has an important implication on the serialization of nested data structures. A data structure can be nested into another one at an arbitrary alignment, which would defeat the purpose of static layout analysis since the code generator wouldn't be able to make any assumptions about the base offset. Additionally, the misaligned origin of a data structure does not necessarily imply that every following field of it will be misaligned as well. Consider the following example:

@union
uint8 a
uint8 b

Due to the one-bit union tag preceding the actual value, both a and b are misaligned. Now, imagine that the above union is nested into another structure as follows:

void7
U.1.0 the_union

The union is padded so that the one-bit tag brings the following data fields into alignment. The example demonstrates that in order to take full advantage of the layout analysis, a code generator must model nested object hierarchies holistically rather than atomically. The per-type generated serialization functions/methods may have some basic alignment requirement chosen by the author of the code generator (for example, they may require the serialization buffer to be always byte-aligned, or require that its alignment matches the largest alignment requirement of any nested type); the serialization code of an outer (containing) type would then determine statically whether the alignment requirement of a serialization function is met. If the alignment requirement is not met, the code generator would emit serialization code in-place, as if the definition of the included type were copy-pasted into the outer type, instead of invoking its serialization function.

You can see field iterators in action in this carefully crafted unit test: https://github.com/UAVCAN/pydsdl/blob/f998ad6f744b853d9b97e240ab0302df27ddd598/pydsdl/_serializable.py#L1284-L1383

...also in this Jinja2 code generation template for PyUAVCAN:

{%- macro _serialize_variable_length_array(t, ref, offset) -%}
    # Length field byte-aligned: {{ offset.is_aligned_at_byte() }}; {# -#}
     first element byte-aligned: {{ (offset + t.length_field_type.bit_length).is_aligned_at_byte() }}; {# -#}
      all elements byte-aligned: {{ (offset + t.bit_length_set).is_aligned_at_byte() }}.
    assert len({{ ref }}) <= {{ t.capacity }}, '{{ ref }}: {{ t }}'
    {{ _serialize_integer(t.length_field_type, 'len(%s)'|format(ref), offset) }}
{%- if t.element_type is BooleanType %}
    _ser_.add_{{ (offset + t.length_field_type.bit_length)|alignment_prefix }}_array_of_bits({{ ref }})
{%- elif t.element_type is PrimitiveType and t.element_type.standard_bit_length %}
    _ser_.add_{{ (offset + t.length_field_type.bit_length)|alignment_prefix -}}
          _array_of_standard_bit_length_primitives({{ ref }})
{%- else %}
    for _element_ in {{ ref }}:
        {{ _serialize_any(t.element_type, '_element_', offset + t.bit_length_set)|indent }}
{%- endif %}
{%- endmacro -%}

Where alignment_prefix is defined as: 'aligned' if offset.is_aligned_at_byte() else 'unaligned'.

In the following example (sourced from PyUAVCAN as well) look at the case of CompositeType, where we select whether the current item matches the alignment requirement of the serialization method of t (which is _serialize_aligned_(..)). If there is a match, we invoke the method; otherwise, we emit serialization code in-place.

{%- macro _serialize_any(t, ref, offset) -%}
    {%- if offset.is_aligned_at_byte() -%}
    assert _ser_.current_bit_length % 8 == 0, '{{ ref }}: {{ t }}'
    {% endif -%}
    {%- if t is VoidType -%}                    _ser_.skip_bits({{ t.bit_length }})
    {%- elif t is BooleanType -%}               _ser_.add_unaligned_bit({{ ref }})
    {%- elif t is IntegerType -%}               {{ _serialize_integer(t, ref, offset) }}
    {%- elif t is FloatType -%}                 {{ _serialize_float(t, ref, offset) }}
    {#- Despite the apparent similarities, fixed and variable arrays are very different when it comes to serialization,
     #- mostly because of the different logic of offset computation. -#}
    {%- elif t is FixedLengthArrayType -%}      {{ _serialize_fixed_length_array(t, ref, offset) }}
    {%- elif t is VariableLengthArrayType -%}   {{ _serialize_variable_length_array(t, ref, offset) }}
    {%- elif t is CompositeType -%}
        {%- if offset.is_aligned_at_byte() -%}
    {{ ref }}._serialize_aligned_(_ser_)  # Delegating because this object is always byte-aligned.
        {%- else -%}
    # Object {{ ref }} is not always byte-aligned, serializing in-place.
    {{ _serialize_composite(t, ref, offset) }}
        {%- endif -%}
    {%- else -%}{%- assert False -%}
    {%- endif -%}
{%- endmacro -%}

One can see more examples in my PyUAVCAN repo (which is still a WIP of course): https://github.com/pavel-kirienko/pyuavcan/blob/uavcan-v1.0/pyuavcan/dsdl/_templates/serialization.j2, also https://github.com/pavel-kirienko/pyuavcan/blob/6f9234ab918beefce56ef3f58920773df29079e5/pyuavcan/dsdl/_serialized_representation/_serializer.py#L56-L187

I expect that this feature will allow us to greatly simplify implementations. Particularly, libuavcan may no longer need the rather complex primitive marshaling templates, since generated serialization methods can operate on raw byte pointers now.

coveralls commented 5 years ago

Pull Request Test Coverage Report for Build 213


Totals Coverage Status
Change from base Build 194: 0.0%
Covered Lines: 2652
Relevant Lines: 2652

๐Ÿ’› - Coveralls
coveralls commented 5 years ago

Pull Request Test Coverage Report for Build 213


Totals Coverage Status
Change from base Build 194: 0.0%
Covered Lines: 2652
Relevant Lines: 2652

๐Ÿ’› - Coveralls
coveralls commented 5 years ago

Pull Request Test Coverage Report for Build 223


Totals Coverage Status
Change from base Build 194: 0.0%
Covered Lines: 2657
Relevant Lines: 2657

๐Ÿ’› - Coveralls