eclipse-cyclonedds / cyclonedds

Eclipse Cyclone DDS project
https://projects.eclipse.org/projects/iot.cyclonedds
Other
888 stars 363 forks source link

How to use `dds_readcdr` and `dds_writecdr` on a topic without knowing the message type(i.e., no IDL given)? #2099

Closed t0ny-peng closed 1 month ago

t0ny-peng commented 1 month ago

TL;DR 🙏

How to use dds_readcdr and dds_writecdr with a topic whose data type is not known, i.e., no IDL and dds_topic_descriptor_t.


Detailed version:

We are building a gateway to translate to/from a ROS2 message from/to our internal messaging framework which is also using CDR format. E.g., for the following message we want the ROS2 apps to talk to our internal apps:

IDL (from ROS2 .msg file)      | Internal msg definition(ouw own framework)
module HelloWorldData          | namespace HelloWorldData
{                              | {
    struct Msg                 |   struct Msg
    {                          |   {
        @key long userID;      |     userID: int32_t;
        string message;        |     message: string;
    };                         |   }
};                             | }

E.g., the gateway will receive message from abcd topic and send to abcd topic in our internal framework, and it's guaranteed that the topic abcd will only have one message type HelloWorldData, all CDR-serialized so topic message type check can be omitted.

Design option 1: Specific gateway(won't go this way)

The easiest design is to write a gateway using ROS2(or native CycloneDDS APIs) to get the message from a subscriber, extract userID and message then assign their value to internal message object. The problem is that the gateway needs to be built for every new message.

Design option 2: Generic gateway

Our ROS2 R&D apps uses CycloneDDS as the DDS vendor. Our internal COM framework supports sending a CDR-serialized buffer directly to a topic(and receive from that topic). The question is how to do similar thing with CycloneDDS. I found that dds_takecdr and dds_writecdr can be used, but not sure what topic to give it

Note that dds_takecdr needs a reader, the creation of a reader needs a topic, and the creation of a topic needs a dds_topic_descriptor_t. Normally the dds_topic_descriptor_t of a message is defined in the auto-generated .c file from the .idl file, and it contains information of the size of the message, alignment, the name and most importantly the _ops, which are all unknown for the generic gateway.

⚠️So the question is, is there a way to construct an anonymous dds_topic_descriptor_t for a topic to enable the use of dds_readcdr. Thanks!

We'll handle the ROS2<->CycloneDDS topic mapping.

A ref for my self: DDS-ROS2 communication discussion https://github.com/eclipse-cyclonedds/cyclonedds/issues/1412

eboasson commented 1 month ago

Usually I try things before saying "do this" or "do that", but today I am going out on a limb and hope I am not overlooking something ...

How to use dds_readcdr and dds_writecdr with a topic whose data type is not known, i.e., no IDL and dds_topic_descriptor_t.

Better use dds_forwardcdr, but that's a detail.

The question is how to do similar thing with CycloneDDS. I found that dds_takecdr and dds_writecdr can be used, but not sure what topic to give it

A little secret about DDS: if all that is available are topic name and type name then reader/writer matching just looks at those two names (and the QoS).[^1]

A little secret about Cyclone: it'll happily accept extra bytes on incoming data, so subscribing to a type struct T { octet x; }; will get you the CDR if you manage to avoid type checking. There are two ways to do that:

(I would go with the first, because I would find it easier not to have to worry about having two different Cyclone binaries in the system.)

For forwarding it, there's another complication: dds_forwardcdr checks the type. It forwards the ddsi_serdata thing unchanged only if the type of the writer and the reader is the same. If the two are not the same, it internally constructs a new ddsi_serdata from the CDR contained in the input. It does this specifically to allow forwarding across DDS domains.

In that process, the C binding (and so the stuff tied to dds_topic_descriptor_t) ends up simply memcpying the CDR, and so I expect it'll happily forward the full sample despite using a type definition that says it is a but a single byte. IINM the C++ binding will deserialize but I am not sure if it then reserializes ... In short, better to stick to C for this type of game.

Also you have to be careful with keys. You would be able to tell from the DDS specification, but the DDSI specification — incorrectly, in my view — uses different ids for readers and writers with a key fields than for those without a key field, and then says the two can't match. So you may have to do two topics, one where there is a key field and one where there isn't.

Using a random byte as a key field might cause real problems. I can't come up with a trivial to avoid that problem, but then, too, there is a solution: you would have to implement your own sertype/serdata pair that does exactly what you want and using dds_create_topic_sertype instead of dds_create_topic. It is not very complicated because you just want to pass around a CDR blob, and the key you would probably want to ignore while using a KEEP_ALL history. https://github.com/eclipse-cyclonedds/cyclonedds/blob/master/src/core/ddsc/tests/cdr.c will no doubt give you some ideas (that implements a key-value pair, so it is mostly stripping things out).

In short, I think you'll be fine if you use struct T { octet x; }; as long as you strip the metadata from the descriptor and keys don't cause trouble.

Again, I didn't try it. If it doesn't work or something is unclear, don't hesitate to ask!

[^1]: This is the old behaviour. Implementations supporting XTypes default to checking assignability but still support this because of interoperability with other implementations. An XTypes reader/writer can set a QoS to enforce type checks, and then it won't match using only the names, but I don't think that's an issue for you.

t0ny-peng commented 1 month ago

Hi @eboasson. Your thought experiment turns out exactly as how it works in reality. Must have a powerful mental compiler!

I firstly tried your options 2, to disable TYPELIB, type discovery and topic discovery. I modify the HelloWorld example to send out a message of an int 0xaabbccdd and a string Hello World using the existing and normal message using the normal dds_write. Nothing fancy here.

Then on the receiver side, I manually generated a .h and .c file from a struct T { octet x; }; IDL file, and changed the .m_typename to "HelloWorldData::Msg" to match the publisher's type name on that topic. The C file is quite simple:

Expand to see untyped_struct.c ```c #include "untyped_struct.h" static const uint32_t T_ops [] = { /* T */ DDS_OP_ADR | DDS_OP_TYPE_1BY, offsetof (T, x), DDS_OP_RTS }; const dds_topic_descriptor_t T_desc = { .m_size = sizeof (T), .m_align = dds_alignof (T), .m_flagset = DDS_TOPIC_FIXED_SIZE, .m_nkeys = 0u, .m_typename = "HelloWorldData::Msg", <<<=== Manually changed to match the publisher, but is there a way to avoid this? .m_keys = NULL, .m_nops = 2, .m_ops = T_ops, .m_meta = "" }; ```

With these changes, the following code works:

    struct ddsi_serdata * d;
    dds_sample_info_t si;
    rc = dds_readcdr(reader, &d, 1, &si, DDS_NOT_READ_SAMPLE_STATE | DDS_NEW_VIEW_STATE | DDS_ALIVE_INSTANCE_STATE);

    if (rc > 0)
    {
      // Get the message size, returns 24
      size_t size = ddsi_serdata_size(d);

      char* buf = (char*)malloc(size); // Prepares the buffer
      ddsi_serdata_to_ser(d, 0, size, buf); // Extract the data from the buffer
      printf ("=== [Subscriber] Received. Size: %zu\n", size); // Prints 24
      // Print out the memory bytes in buf for debugging
      for (size_t i = 0; i < size; i++) {
        printf("%02x ", buf[i]);
      }
      ...
    }

The memory of the buffer is like:

(gdb) x/24x buf
0x00    0x01    0x00    0x00    0xdd    0xcc    0xbb    0xaa // 4 unknown bytes plus 4-byte int in little-endian
0x0c    0x00    0x00    0x00    0x48    0x65    0x6c    0x6c // 4 bytes for string size followed by `Hello World\0`
0x6f    0x20    0x57    0x6f    0x72    0x6c    0x64    0x00

The question with this setup for a generic gateway is that, the dummy struct m_typename still need to match the sender of a topic, i.e., HelloWorldData::Msg. However for a generic gateway, it does not know what type of message if sent on a topic abcd. Is there a way to avoid the type name check on a topic in CycloneDDS?

I think it should look like how ROS2 bag works. Given a topic, record everything happening on it without knowing the type of the message. Does it use the builtin topics to get the message type?

Oh BTW do you happen to know what's the first 4 bytes in the CDR-serialized representation of the msg? I can't find what it is in the CDR specification. Thanks! 🙏

eboasson commented 1 month ago

With these changes, the following code works

Great!

The question with this setup for a generic gateway is that, the dummy struct m_typename still need to match the sender of a topic, i.e., HelloWorldData::Msg. However for a generic gateway, it does not know what type of message if sent on a topic abcd. Is there a way to avoid the type name check on a topic in CycloneDDS?

No, I'm afraid you can't skip the type name check — or more precisely: you can skip the type name check, but then you have to have the full type information so that it can do the "assignability" check, and that's exactly what you're trying to avoid.

Once upon a time, dds_create_topic stored the dds_topic_descriptor_t pointer that you passed in, but some 4.5 years ago it started making a copy. So you can do:

extern const dds_topic_descriptor_t T_desc;

dds_entity_t subscribe_to_topic (..., const char *topic_name, const char *type_name, ...) {
  dds_topic_descriptor_t T_desc_copy = T_desc; // shallow copy is fine here
  T_desc_copy.m_type_name = type_name;
  dds_entity_t tp = dds_create_topic (..., topic_name, &T_desc_copy, ...);
  dds_entity_t rd = dds_create_reader (..., tp, ...);
  dds_delete (tp); // yes, this is allowed :)
  return rd;
}

Wouldn't that solve your problem? If you need to find the type name it is probably easiest to subscribe to the DCPSPublication built-in topic, not unlike https://github.com/eclipse-cyclonedds/cyclonedds/blob/master/examples/listtopics/listtopics.c or https://github.com/eclipse-cyclonedds/cyclonedds/blob/master/examples/dynsub/dynsub.c do.

I think it should look like how ROS2 bag works. Given a topic, record everything happening on it without knowing the type of the message. Does it use the builtin topics to get the message type?

IIRC ROS2 bag indeed uses the discovery information (ROS2 partially relies on DDS discovery, and partially relies on data it publishes itself). I think it creates a topic of the correct type by looking for a ROS2 type support library with the name of the type, but then it really just reads/writes CDR so it doesn't actually use the type information.

The memory of the buffer is like:

(gdb) x/24x buf
0x00    0x01    0x00    0x00    0xdd    0xcc    0xbb    0xaa // 4 unknown bytes plus 4-byte int in little-endian
0x0c    0x00    0x00    0x00    0x48    0x65    0x6c    0x6c // 4 bytes for string size followed by `Hello World\0`
0x6f    0x20    0x57    0x6f    0x72    0x6c    0x64    0x00

Oh BTW do you happen to know what's the first 4 bytes in the CDR-serialized representation of the msg? I can't find what it is in the CDR specification. Thanks! 🙏

You're correct that those bytes are not described in the CDR specification. They are the encoding format and options defined in the DDSI specification. In hindsight it was a ... suboptimal ... idea to make them part of the CDR like this in Cyclone, and I still have a desire to change it. Given that some people use these interfaces doing that without breaking applications in the nastiest way possible will take some effort.

The first two bytes are big-endian number telling you how the data is encoded (see https://github.com/eclipse-cyclonedds/cyclonedds/blob/13cf22ec7f325377f038a076cbfd6437ee2d4d5e/src/core/ddsi/include/dds/ddsi/ddsi_protocol.h#L136 — 0x0001 means CDR, little-endian), the second two bytes contain options. The two least-significant bits tell you how many padding bytes were appended to the data, but that is optional (I suppose except when using the (broken) XCDR1 encoding of old versions of XTypes). I would ignore them, if I were you 🙂

t0ny-peng commented 1 month ago

@eboasson Copying the message desc is really a neat trick. For the gateway I have managed to make 3 directions work, marked as green arrows in the following diagram. The gateway receives CDR-serialized message from the upper-left CycloneDDS sender using the struct T{octet x;} trick with matching topic name and typename(by modifying the type name locally).

image

However there's still one thing missing. For the message coming from the internal framework, it contains a CDR serialized message(the same format as CycloneDDS). dds_forwardcdr expects a dds_entity_t writer and a ddsi_serdata. It looks like that the major use case of dds_forwardcdr is e.g. when you have two domains and you want to call dds_takecdr in one, get a ddsi_serdata and then send it with dds_forwardcdr in another domain, as shown in the cdr.c test.

In our case, since the internal framework doesn't use CycloneDDS, it seems that the ddsi_serdata needs to be manually created? I tried to follow the way how you created the ddsi_sertype and the ddsi_serdata in rmw_cyclonedds_cpp but it's a little complicated. In addition, the default struct ddsi_sertype_ops dds_sertype_ops_default is only accessible from CycloneDDS src and not from user application. We probably need to implement all the functions that a ddsi_sertype_ops requires.

Giving that constructing a ddsi_sertype and a ddsi_serdata is complicated, I wonder if the struct T{octet x;} trick can be used when sending arbitrary length of CDR-serialized data? I.e., is there a way to get an empty ddsi_serdata from a such dummy message, and send it using a normal data writer (created with modified typename with the correct typename).

Thanks! And sorry if I had too many questions! I guess that's the last piece of the puzzle


Update: I found dds_get_entity_sertype. Trying with it to construct a ddsi_serdata

t0ny-peng commented 1 month ago

✌️ it's working now. For future reference in case anyone need it, please see the following gist as an example

https://gist.github.com/t0ny-peng/2b4f97a41549012df632941c2deaf9ef

eboasson commented 1 month ago

Wow! Nice!

I am also not certain I would have come up with ddsi_serdata_from_ser_iov(sertype, SDK_DATA, 1U, &data_iovec, data_size), so it is good thing I was otherwise occupied for a few days 🙂

t0ny-peng commented 1 month ago

Sorry I forgot to close this ticket. Thanks for the help!