ros2 / rmw_cyclonedds

ROS 2 RMW layer for Eclipse Cyclone DDS
Apache License 2.0
117 stars 90 forks source link

CDR wstring serialization is nonstandard #43

Open rotu opened 5 years ago

rotu commented 5 years ago

DDS-XTypes v1.2 – defines the DDS Type System and the serialized representation of DDS data.

Here's what XTypes says about strings, which I think we do a poor job of following:

Objects of String<Char8> type shall be represented using the UTF-8 character encoding. The serialized length of an object of type String shall be the number of bytes in the CDR buffer taken by the String<Char8> characters, including the terminating NUL character. The serialized length may not be the same as the number of Unicode characters because a single Unicode character encoded using the UTF-8 encoding may take one to four bytes.

Objects of String<Char16> type shall be represented using the UTF-16 character encoding. The serialized length of an object of type String<Char16> shall be the number of bytes in the CDR buffer taken by the String<Char16> characters. This is twice the number of characters in the string because a single character (in the Basic Multilingual Plane) encoded using UTF-16 takes 2 bytes to serialize.

The UTF-16 representation of object of type String<Char16> shall not include a Byte Order Mark (BOM). The representation shall also not include any terminating NUL character(s).

Fixing this issue will likely break compatibility with Fast-RTPS (https://github.com/ros2/rmw_connext/issues/364) but fix compatibility with Connext (https://github.com/ros2/build_farmer/issues/268, #121)

_Originally posted by @rotu in https://github.com/ros2/rmw_cyclonedds/pull/42/review_comment/create_

wjwwood commented 5 years ago

@rotu can you clarify if you think this is an issue with ROS 2's mapping to DDS types in general, or just for rmw_cyclonedds?

@eboasson do you have time to investigate this? We can get involved if it is broader in scope, but otherwise we're likely to be in the way.

rotu commented 5 years ago

@rotu can you clarify if you think this is an issue with ROS 2's mapping to DDS types in general, or just for rmw_cyclonedds?

I don't know if it's an issue with ROS2 or just cyclone. Here are the bugs:

  1. We use wstring, wchar_t, (UTF32 or UTF16 depending on platform). The problem is we serialize it according to the platform-native wchar_t.
  2. We use as the length of the string wstring::size, i.e. the number of characters. It should be the number of bytes instead.
dirk-thomas commented 5 years ago

The ROS 2 code (not specific to a RMW impl) doesn't use wstring / wchar_t (since their size if platform dependent) but uses u16string / char16_t and expects UTF-16 encoded data.

rotu commented 5 years ago

Does the ROS2 code have any concept of the "correct" serialization of the data? Or is the IDL-to-CDR part entirely within the RMW implementation?

dirk-thomas commented 5 years ago

The serialization is entirely up to the RMW impl. But obviously only when they follow a standard (like XTypes 1.2) different implementation will be compatible on the wire.

eboasson commented 5 years ago

The current behaviour of rmw_cyclonedds simply matches Fast-RTPS to make the wstring/wchar interoperable and the tests in test_communication pass. Implementing the representation mandated by XTypes would be preferable as far as I am concerned.

So I guess the question is: how valuable is wstring/wchar being interoperable between the rmw_fastrtps and rmw_cyclonedds today? If the consensus is not to worry about that, then I'd say we treat this as a bug in rmw_cyclonedds and fix it.

dirk-thomas commented 4 years ago

I think this has changed in the DDS / XTypes spec in the past (from platform specific wchar to 16-bit char). It could be that multiple RMW impl. are still using the past spec?

Any changes to this should probably be coordinated with the multiple vendors to ensure that they still work together.

@rotu How is #42 related to this ticket? What is your goal for this ticket: just clarify what the current status is or to update the RMW impl. to match the latest spec?

rotu commented 4 years ago

It's possible that it's based on a past spec, or the spec used to not be clear. At any rate, the CDR representation has a version byte in the encapsulation header, so it probably makes sense to leave it this way for XCDR.eversion = VERSION_NONE, and switch to the correct behavior at the same time we bump that tag to VERSION2.

42 is not fixing this issue.

I created the ticket to document this issue for later. Not sure if/when I'll take a whack at it.

rotu commented 4 years ago

Rewrote to make issue clearer.

rotu commented 4 years ago

https://github.com/ros2/rmw_cyclonedds/issues/121#issuecomment-612284179

@rotu What is the status of this? Is there any work towards addressing the failing test ongoing? If yes, in which PR and when is it expected to land? If not, I propose to disable the test with a reference to the issue.

The status of this is that I'm not actively working on a fix and it won't be done within the next few months unless someone wants to run with it. Interoperability is currently a mess, and I don't expect wstrings to work between two computers running CycloneDDS and/or FastRTPS even on different operating systems.

I don't see this as urgent. In DDS-XTypes 1.1, the (non-wide) string character encoding was ISO-8859-1 (Latin Alphabat no. 1), and so it couldn't support international characters, but in 1.2 it was changed to UTF-8, so it could. That kills the main need for WString.

Fixing this would also introduce an incompatibility with the default RMW, Fast-RTPS. So to me it looks like more downside than up.

ivanpauno commented 4 years ago

Fixing this issue will likely break compatibility with Fast-RTPS (ros2/rmw_connext#364) but fix compatibility with Connext (ros2/build_farmer#268, #121)

Currently, Fast-RTPS is inter-operating well with both CycloneDDS and Connext, except in macOS where Wstring tests against Connext are disabled. See https://github.com/ros2/system_tests/blob/4ccd6e3c751c9332ed8488e85e8c789e80238b39/test_communication/CMakeLists.txt#L213-L237.

I'm not sure of what's the trick, but it sounds like there's a way to maintain compatibility with both.

rotu commented 4 years ago

That’s because on these platforms, wchar_t happens to be 2 bytes. It must also mean Connext is using the number of UTF-16 code units as a length rather than the bytelength. (that or someone is doing something funny like tweaking the format based on implementation identifiers)