OpenCyphal / nunavut

Generate code from DSDL using PyDSDL and Jinja2
https://nunavut.readthedocs.io/
Other
36 stars 24 forks source link

Define variable-length array type for C #191

Open thirtytwobits opened 3 years ago

thirtytwobits commented 3 years ago

This is a proposal to allow generation of a variable-length array type in C leaving the implementation up to the user.

For example, the standard Quaternion type:


typedef struct
{
    /// saturated float32[4] wxyz
    float wxyz[4];
} uavcan_si_unit_angle_Quaternion_1_0;

when using the proposed language option of `--no-static-arrays' could be generated like this:

typedef struct
{
    /// saturated float32[4] wxyz
    float* wxyz;
    size_t wxyz_len;
} uavcan_si_unit_angle_Quaternion_1_0;

An alternative, that would be more complex but would require less unsafe pointer magic, is to create a Vector protocol. Something like this:

typedef NunavutVectorType_Float
{
    float* data;
    size_t length;
    size_t max_length;
    bool (*resize)(NunavutVectorType_Float*, size_t);
} NunavutVector_Float;

extern NunavutVector_Float* init_nunavut_vector_float(NunavutVector_Float*);

NunavutVector_Float then must be supplied by the user (the linker will enforce this) and the user must provide the resize method which will be called by deserialization routines when Nunavut needs more memory.

This is a half-baked proposal to encourage a discussion. Please comment.

pavel-kirienko commented 3 years ago

If one component of my application is unprepared to deal with non-static member arrays, should I:

a) run Nunavut twice with different options; b) specify which types (and fields?) should use dynamically allocated arrays; c) select behavior at compile-time via preprocessor (similar to #189); d) expect Nunavut to generate two instances of every data type that contains arrays or other nested types that contain arrays somewhere down the tree: one static and one dynamic.

72eb5e1a15605310

Another idea (not so good) is to provide an option for generating flex arrays if the array is the last field of the type. This would address Peter's case specifically but it would be useless outside of that specific case I suspect.

PetervdPerk-NXP commented 3 years ago

So the proposal in #189 I had 2 assumptions, first we're the publisher thus we know the theoretical maximum size of an array and it applies only to variable length arrays, since in a specification things might be over dimensioned.

We could expand this to fixed size array and subscribers. However to send/receive a message we've always to allocate the whole fixed array, it's specified like that anyways otherwise It would've been variable length.

Having a --no-statically-allocated-variable-option and changing the array allocation is a bit nicer, since we get rid of all the pre-processor ifdefs, when sending we allocate the correct size of an array to the stack and set the pointer in the struct before serializing. However it also applies to subscribing as well, which means you've to do it there as well, #189 was a bit nicer in that regard because it didn't influence it if the #ifdef wasn't overriden.

user must provide the resize method which will be called by deserialization routines when Nunavut needs more memory.

Regarding the vector solution it means we've to use an allocator and use probably memory on the heap for this.

Coming to a conclusion I'm fine with either the solution from #189 or the --no-statically-allocated-variable-option.

pavel-kirienko commented 3 years ago

Having a --no-statically-allocated-variable-option and changing the array allocation is a bit nicer, since we get rid of all the pre-processor ifdefs

This is true but consider that indexing on a pointer (pointer[x]) also violates MISRA and probably other high-integrity coding standards. From the safety standpoint, #189 is superior because it involves less pointer arithmetic.

Regarding the vector solution it means we've to use an allocator and use probably memory on the heap for this.

I missed this in my response but any solution that requires heap memory to serialize or deserialize a message is less desirable because heap allocators inherently multiply the memory requirement.

thirtytwobits commented 3 years ago

I missed this in my response but any solution that requires heap memory to serialize or deserialize a message is less desirable because heap allocators inherently multiply the memory requirement.

allocation != heap usage

PetervdPerk-NXP commented 3 years ago

So let's go off on a tangent here, why not make an option where we expose the internal serialization/deserialization functions. Which helps an implementer to manually serialize/deserialize then but doesn't have to allocate struct. Furthermore the implementer can use block scoping to minimize stack usage further as well.

A manual serialization example would be

uint8_t buffer[100];
size_t buffer_size = sizeof(buffer);

(void) memset(&buffer[0], 0, buffer_size ); // We could move this to a init function as well

size_t offset_bits = 0;

{
    reg_drone_service_common_Heartbeat_0_1 heartbeat;

    heartbeat = getStatus(); // Dummy get real data here

    if(reg_drone_service_battery_Status_0_2_heartbeat_serialize_(&heartbeat, &buffer, &buffer_size, &offset_bits) > 0){
        return -1;
    }
}
{
    uavcan_si_unit_temperature_Scalar_1_0 temps[reg_drone_service_battery_Status_0_2_temperature_min_max_ARRAY_CAPACITY_];

    temps[0] = 0; // Dummy get real data here
    temps[1] = 0; // Dummy get real data here

    if(reg_drone_service_battery_Status_0_2_temperature_min_max_serialize_(temps, &buffer,  &buffer_sizee, &offset_bits) > 0){
        return -1;
    }
}

And the an generated helper function would look like something like this

static inline int8_t reg_drone_service_battery_Status_0_2_heartbeat_serialize_(const reg_drone_service_common_Heartbeat_0_1* const obj, uint8_t* const buffer,  size_t* const inout_buffer_size_bytes
    size_t* const offset_bits)
{
    {   // reg.drone.service.common.Heartbeat.0.1 heartbeat
        size_t _size_bytes0_ = 2UL;  // Nested object (max) size, in bytes.
        int8_t _err0_ = reg_drone_service_common_Heartbeat_0_1_serialize_(
            &obj, &buffer[*offset_bits / 8U], &_size_bytes0_);
        if (_err0_ < 0)
        {
            return _err0_;
        }
        // It is assumed that we know the exact type of the serialized entity, hence we expect the size to match.
        *offset_bits += _size_bytes0_ * 8U;  // Advance by the size of the nested object.
    }
}
pavel-kirienko commented 3 years ago

I doubt it would make your job a lot easier compared to purely manual serialization. If we kept the current logic more or less intact sans spreading it across different functions per-field, you would need to ensure that the alignment requirements are satisfied before invoking field serializers. If we made field serializers enforce alignment always, this would eliminate some optimization opportunities that are currently available for the code generator (it can omit some alignment checks and padding insertion if it is proven that they are unnecessary). I think implementing this takes too much work for a little gain so I wouldn't support it.