eclipse-cyclonedds / cyclonedds

Eclipse Cyclone DDS project
https://projects.eclipse.org/projects/iot.cyclonedds
Other
870 stars 357 forks source link

Non-consistent behavior of dds_sample_info struct #1597

Open siddux opened 1 year ago

siddux commented 1 year ago

Hi,

I'm using dds implementation from ROS2 Galactic. I observed strange behavior on the code and after debugging it I've found what it seems a problem. I'm using the dds_sample_info to retrieve info about the received messages and decide where to process them or just ignore. At some point of my code I've the following function lines:

return_code = dds_take(reader, samples, infos, MaxSamplesNum, MaxSamplesNum);

    if (return_code < 0)
    {
      DDS_FATAL("dds_take: %s\n", dds_strretcode(-return_code));
      msgs.clear();
      return msgs;
    }

    if (return_code > 0)
    {
      for (size_t i = 0; i < MaxSamplesNum; ++i)
      {
        auto debug_shared_msgs = shared_msgs[i];
        auto debug_info = infos[i];
        if (infos[i].valid_data)
          msgs.push_back(std::shared_ptr<const Message>(shared_msgs[i]));
      }
      return msgs;
    }

when receiving valid messages all work as expected. The problem comes when I receive empty messages (which has not been initialized). When checking if the message is valid, accessing the 'valid_data member from the struct it returns an integer (non-zero). Below you can find a screenshot of the debugger, sincerely I couldn't understand why the boolean returns "random" integer values.

debug3

At this moment I can bypass this just changing from:

if (infos[i].valid_data)

to:

if (infos[i].valid_data == true)

Is this behavior correct? Is there any more elegant solution for this problem?

Thanks.

eboasson commented 1 year ago

The contents of debug_info is pure garbage, I would say, every single field as very impossible (e.g., instance_handle = 0), or at least an implausible (e.g., source_timestamp = 37ns 1-1-1970 00:00 GMT) value.

dds_take returns the number of samples it read, so if 0 < return_code < MaxSamplesNum then the loop is simply reading garbage for return_codei < MaxSamplesNum. That would be my guess.

aaronchongth commented 1 year ago

Thanks for the explanation @eboasson! In the case of return_codei < MaxSamplesNum, we should just be checking that samples[i] != NULL for a valid message? Similar to https://github.com/eclipse-cyclonedds/cyclonedds/blob/bc91e11b5c74a592a41cb673a8af16d5a25cb4f5/src/core/ddsc/tests/loan.c#L195-L197

eboasson commented 1 year ago

@aaronchongth, you should always only check the valid_data flag to know what it is you got, and then only for 0i < return_code. The entries return_codei < MaxSamplesNum are best considered garbage that is not to be looked at 🙂

(In reality they aren't garbage, the pointers'll will point to somewhat valid samples, butbut I'd rather keep as much freedom there as I possibly can 😁)