loan_sample() returns illegal oparation when using FastDDS-Gen 2.5.0 and fastrtps.so.2.3.4 and 2.10.0

amdsobhy commented 1 year ago

loan_sample() returns RETCODE_ILLEGAL_OPERATION when using FastDDS-Gen 2.5.0, tested with fastrtps.so.2.3.4 and 2.10

Issue does not happen when using FastDDS-Gen 2.1.0

MiguelCompany commented 1 year ago

@amdsobhy Which IDL produces that behavior?

amdsobhy commented 1 year ago

@MiguelCompany here is the IDL:

module DDS {

#define MAX_SIZE (1920 * 1200 * 3)

    struct Header
    {
        unsigned long long  eightbytes0;
        long                fourbytes1;
        long                fourbytes2;
        long                fourbytes3;
    };

    struct foo
    {
        Header data0;
        long fourbytes1;
        long fourbytes2;
        long fourbytes3;
        unsigned long fourbytes4;
        unsigned long fourbytes5;
        long fourbytes6;
        octet buffer[MAX_SIZE];
    };

};

MiguelCompany commented 1 year ago

@amdsobhy The in-memory representation of that structure will not match the CDR representation. This implies that foo should not be considered plain and, as such, loan_sample() cannot be used.

This is due to the padding present between data0 and fourbytes1. If you want to make the type plain, you have several options:

Change Header::eightbytes0 into unsigned long eightbytes0_high and unsigned long eightbytes0_low
Change Header::eightbytes0 to the end of Header
Add long _padding_; at the end of Header
Put the fields of Header directly in foo

amdsobhy commented 1 year ago

Thank you for the reply and the explanation. I thought that the generated code would handle such situation. Why doesn't cdr handle this situation so that the user does not have to worry about it?

How can one guarantee in future implementation that the in-memory representation matches CDR representation? What are the rules to garantee such matching?

I read that according to the standard padding is inserted into the struct to make it's size in multiples of 8 bytes. In my situation the header is 20 bytes so by adding another 4 bytes padding it becomes 24 bytes and a multiple of 8 and thus it fixes the problem of misalignment between cdr representation and memory representation, but what about the situation in point #3 you mentioned above? why does moving the eightbytes0 variable to the bottom of the Header struct solve the problem when the size of the struct is still the same?

MiguelCompany commented 1 year ago

How can one guarantee in future implementation that the in-memory representation matches CDR representation? What are the rules to garantee such matching?

Do not use string, sequences, maps, or unions
If possible, avoid using nested structures. For your use-case, you could use inheritance, i.e. struct foo : Header
If nested structures are required, put the field with the biggest alignment at the end
Check the in-memory representation matches the one of a type without nested structures.

why does moving the eightbytes0 variable to the bottom of the Header struct solve the problem when the size of the struct is still the same?

I have prepared this example so you can see the differences in memory representation

amdsobhy commented 1 year ago

Thank you for the great example. I thought the padding was inserted using fastcdr during serialization and not the compiler. From your example I see that this is the default padding inserted by the complier.

Would it be okay to modify the generated headers and include compiler directives to disable padding? Also why doesn't fastcdr handle the compiler padding during serialization and deserialization?

As far as I know compiler padding is different accross architectures, compilers and compiler versions, so what happens when two different machines are communicating over dds and each one of them has a different point of view of what the padding in the message look like? Shouldn't padding be abstracted from the reader and writer of the message through the middleware?

nyanpasua commented 12 months ago

same problem, is this a bug in ddsgen v2.5.0?

MiguelCompany commented 12 months ago

@amdsobhy Sorry for my late reply. Let me try to explain how Fast DDS works behind the scenes when loan_sample is called.

When the DataWriter is created, the type support is invoked to query the maximum size of the serialized (i.e. CDR) payload. That maximum size will include the size of the padding according to the CDR spec. The DataWriter will then pre-allocate space for a certain number of those serialized payloads, depending on the ResourceLimits QoS.

The purpose of loan_sample is to return a pointer to one of those payloads, in order for it to be used as a pointer to the generated type. This way, CDR serialization is not performed when calling write with a pointer that was loaned.

So in order for that operation to be legal, the CDR representation should match the in-memory representation of the generated type. That implies that the CDR padding and the compiler padding should match. This is what the is_plain method from the type support is responsible for.

The type generated by the IDL here would most probably have a padding different from the one mandated by the CDR spec.

On previous Fast DDS Gen versions, the is_plain method implementation relied just on the type of each field. This could lead to wrongly considering the type generated by the IDL here as plain.

On Fast DDS Gen 2.5.0, we included a mechanism to better implement is_plain. Apart from checking the type of each field, the generated code will check whether the offset of the last field + its length equals the length of the CDR serialization.

So, answering @nyanpasua , this is not a bug in v2.5.0. It is a bug-fix.

MiguelCompany commented 12 months ago

As far as I know compiler padding is different accross architectures, compilers and compiler versions, so what happens when two different machines are communicating over dds and each one of them has a different point of view of what the padding in the message look like? Shouldn't padding be abstracted from the reader and writer of the message through the middleware?

When not using loans, CDR serialization and deserialization will take place. In this case, and the writer and the reader are abstracted with regards to padding.

JLBuenoLopez commented 11 months ago

I am going to proceed and close this issue as it has already been answered.

davidqin1986 commented 10 months ago

Thanks for @MiguelCompany 's explanation, but I can't agree your opinion. I think this modification is a kind of regression.

The original intention of idl definition is to facilitate users. The purpose of loan sample is to improve performance. But this ”is_ plain” implementation imposes so many limitations on users, and also decrease usage scenario. Don't stick to the CDR specification. After all, the zero copy implementation of fastdds is also not within the specifications of RTPS.

In addition, the development trend of modern C++ is the convenience of developers, otherwise there wouldn't be so many syntax sugars.

eProsima / Fast-DDS-Gen

loan_sample() returns illegal oparation when using FastDDS-Gen 2.5.0 and fastrtps.so.2.3.4 and 2.10.0 #193