eclipse-cyclonedds / cyclonedds

Eclipse Cyclone DDS project
https://projects.eclipse.org/projects/iot.cyclonedds
Other
854 stars 352 forks source link

Issue publishing topic with union whose items have nested structs #2005

Open stevemcnamaraultra opened 4 months ago

stevemcnamaraultra commented 4 months ago

We have hit an issue with the C++ code generation for topics that use unions which have nested structures.

The issue relates to the isselfcontained() code that is generated. Our topic gets incorrectly marked as self contained. This results in issues if the first publish to the topic uses a smaller union choice than the second.

cycloneBug.zip

The attached zip contains a topic and a test program that reproduces the bug.

The example code shows a publish of the union using the smallest of two supported types and then a publish of the larger.

    AccountResponse m1;
    m1.response().changePasswordResponseValue(ChangePasswordResponse_t());

    AccountResponse m2;
    m2.response().authenticateUserResponseValue(AuthenticateUserResponse_t());

    try {
        std::cout << "=== Will write m1" << std::endl;
        writer.write(m1);

        std::cout << "=== Will write m2" << std::endl;
        writer.write(m2);

        std::cout << "=== Write done" << std::endl;
    } catch (const std::exception& ex) {
        std::cerr << "=== [Publisher] Exception: " << ex.what() << std::endl;
        return EXIT_FAILURE;
    }

The second publish fails with the message:

Error Bad Parameter - write failed

Tracing this fault led me to the caching of the topic size that occurs if the topic is not correctly marked as not self contained.

This issue appears to be related, but not the same as, https://github.com/eclipse-cyclonedds/cyclonedds-cxx/issues/461.

stevemcnamaraultra commented 4 months ago

Looking further at the generator code the function sc_union() checks each case is self contained but does not check that they would all generate to the same byte size. As isSelfContained() is used to allocate a fixed buffer for the serialisation this causes the issue observed. A simple fix works in my scenario: make sc_union always return false, indicating that any type that contains a union will be marked as not self contained. A more elegant fix would attempt to evaluate the size of each union case and only if they differ return false. This may be a level of complexity though that has limited benefits for most situations.

smnrgrs commented 3 months ago

This is quite a serious bug as it prevents Cyclone from being able to publish topics which have unions containing structs. Do we need to open a pull request with a proposed change @eboasson?

eboasson commented 3 months ago

isSelfContained appears to be used in two places only:

So, yes, it looks to me like your workaround is a very sensible one, and that it would be better to merge it in as an improvement over the current situation. Then we should also create an issue that restore the zero-copy behaviour.

If you can do a PR, that'd be great. I don't seem to manage to fix things as quickly as I used to ...