ros / ros_comm

ROS communications-related packages, including core client libraries (roscpp, rospy, roslisp) and graph introspection tools (rostopic, rosnode, rosservice, rosparam).
http://wiki.ros.org/ros_comm

Proposal: message to support both Protobuf and original ROS message #1085

Open quning18 opened 7 years ago

quning18 commented 7 years ago

Hi, All,

I did some research and found that there has already been some discussion of and research on this topic; see the links below:

http://wiki.ros.org/sig/NextGenerationROS/MessageFormats
https://discourse.ros.org/t/optional-fields-in-message/991/9
http://design.ros2.org/articles/serialization.html

Protobuf does have its advantages, and people have shown interest in using it for ROS messages in some form. I would like to raise this discussion again to see if we can find some common ground and move forward.

We are using Protobuf in a large project with a lot of pieces moving very quickly. rosbag used to be a big problem for us: whenever we added a new field to a message, it broke all previously collected rosbags. We can convert them, but the conversion itself is time consuming, not to mention the complexity of version control. That's one of the top reasons we would like to use Protobuf.

However, we know there are some drawbacks to Protobuf, mentioned in the first link listed above. So we are thinking it may be a good idea to have ROS support both formats natively. By natively, I mean that, just as with current ROS messages, people can publish a message in either pb or the original ROS msg format, and the subscriber will adapt to it depending on the published format. All the tools would support both as well.

So, large raw data, such as sensor data, can stay in the original ROS message format, while control-plane messages can use the pb format, so that future changes won't break runs on previously collected rosbag data.

Any idea if this is acceptable?

wjwwood commented 7 years ago

@quning78 I just wanted to let you know that we're not ignoring you, we're just all really busy at the moment and it might be a few days before we get around to responding.

quning18 commented 7 years ago

Thanks for getting back to me. It's not urgent at this point. We can further discuss this when you guys are available.

YuehChuan commented 7 years ago

@quning78 sounds reasonable. I also just found Cap'n Proto these days, which claims to be faster than Protobuf: https://capnproto.org

dirk-thomas commented 7 years ago

The general idea of supporting changed message definitions better is certainly appealing. As I understand the proposal, though, I see a potential drawback in introducing a separate wire format like Protobuf beside the existing ROS message format. Literally every piece of code would need to handle both types. Given the enormous amount of code, that is not really realistic, so it is more likely that only some parts will be updated to support both, and some newly written code might even support only Protobuf. That kind of difference has the potential to divide the ROS ecosystem of packages in a way that makes it difficult to decide which parts of ROS work with which other parts of ROS.

Is the message format negotiated somewhere in the protocol or do both sides just have to use the same? Could an automatic conversion help to mitigate some of the problems?

I am not sure how much of the goal can be achieved without too much of the potential downside. I am open to hearing proposals for what this could look like and what the specific pros and cons would be.
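One way to picture the negotiation question is a per-connection header field that names the wire format. The sketch below mimics the length-prefixed `name=value` layout of a TCPROS connection header; the `wire_format` field and the decoder registry are hypothetical and not part of ros_comm, and the decoders are stand-ins that only tag the payload.

```python
import struct

def encode_header(fields):
    """Encode a dict as length-prefixed 'name=value' entries (TCPROS-style layout)."""
    body = b""
    for name, value in fields.items():
        entry = f"{name}={value}".encode("utf-8")
        body += struct.pack("<I", len(entry)) + entry
    return struct.pack("<I", len(body)) + body

def decode_header(data):
    """Inverse of encode_header: return the header fields as a dict."""
    (total,) = struct.unpack_from("<I", data, 0)
    fields, offset, end = {}, 4, 4 + total
    while offset < end:
        (size,) = struct.unpack_from("<I", data, offset)
        offset += 4
        name, _, value = data[offset:offset + size].decode("utf-8").partition("=")
        fields[name] = value
        offset += size
    return fields

# Hypothetical registry: wire format name -> decoder (stand-ins only).
DECODERS = {
    "rosmsg": lambda payload: ("rosmsg", payload),
    "protobuf": lambda payload: ("protobuf", payload),
}

def pick_decoder(header_bytes):
    """Choose a decoder from the (hypothetical) 'wire_format' header field."""
    fields = decode_header(header_bytes)
    # Publishers that predate the field are assumed to send plain ROS messages.
    fmt = fields.get("wire_format", "rosmsg")
    if fmt not in DECODERS:
        raise ValueError(f"unsupported wire format: {fmt}")
    return fmt, DECODERS[fmt]
```

With this shape, a subscriber that does not know a format can reject the connection up front instead of failing on the first message, which is roughly how type and checksum mismatches are handled today.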

NikolausDemmel commented 6 years ago

I just wanted to weigh in that we had the issue of changing message definitions in a rapidly evolving project and the resulting need to often migrate a large amount of bag files of datasets. It has caused much headache to the point where we were collecting updates to the message definition to apply them all in one go every now and then.

I'm not sure what the best solution given the current situation would be. I agree with @dirk-thomas's assessment. Maybe one option would be to enable more tools (rosbag play, C++ API bag reading, Python API bag reading) to apply migration rules on the fly. Of course that will incur a performance cost at the time of reading, but then people can trade off the effort to migrate the bags against decreased read / play performance. Unfortunately, I guess the current migration rules format is not very well suited for this, since it is based on executable Python scripts and doesn't use a declarative format that could easily be used from C++ as well. So maybe an alternative migration rule format would be a prerequisite. The upside is that only rosbag code would need to change.
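To make the idea of a declarative rule format concrete, here is a minimal sketch. Everything in it is hypothetical: the rule layout (`rename`, `drop`, `defaults`), the integer version numbers (real ROS identifies message versions by MD5 sum), and the use of plain dicts in place of message objects. The point is only that such rules are data, so a C++ reader could interpret the same file.

```python
# Hypothetical declarative rules: (type name, old version) -> how to reach the next version.
RULES = {
    ("demo/Status", 1): {
        "to_version": 2,
        "rename": {"temp": "temperature"},
        "defaults": {"battery": 0.0},
    },
    ("demo/Status", 2): {
        "to_version": 3,
        "drop": ["debug"],
        "defaults": {"mode": "auto"},
    },
}

def migrate(type_name, version, msg, rules, current_version):
    """Apply rules step by step until the message matches current_version."""
    msg = dict(msg)  # never modify the message read from the bag
    while version != current_version:
        rule = rules.get((type_name, version))
        if rule is None:
            raise LookupError(f"no migration rule for {type_name} v{version}")
        for old, new in rule.get("rename", {}).items():
            if old in msg:
                msg[new] = msg.pop(old)
        for name in rule.get("drop", []):
            msg.pop(name, None)
        for name, value in rule.get("defaults", {}).items():
            msg.setdefault(name, value)
        version = rule["to_version"]
    return msg
```

For example, `migrate("demo/Status", 1, {"temp": 21.5}, RULES, 3)` chains both rules and yields a message with `temperature`, `battery` and `mode` fields. The per-message cost of this loop is the read-time overhead mentioned above.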

Just my two cents. I'm not planning to actively work on this.

ammarhusain commented 6 years ago

Hi all, I have been burned by message versioning issues a few times in the past. For my last couple of projects I have written a ROS Protobuf bridge that enables using Protobuf msg datatypes with ROS comms. It enables me to publish, subscribe, log & introspect protobuf containers as seamlessly as other ROS messages. I have published a bare bones example here: https://github.com/ammarhusain/ros-protobuf-bridge

It just uses the basic ROS message traits for Python & C++ and fulfills them with the appropriate Protobuf API.

I am happy to work on contributing this back to the wider community. Thoughts on how it could be integrated or made available?

harlowja commented 6 years ago

Has there been any update on this (or anything like it)? Seems pretty good to have.

jwhendy commented 6 years ago

Bumping this as well. I'm not super familiar with rosbag, and only recently discovered things like this and this on ROS Answers that highlight the issues that can arise as messages change, or even when migrating from one ROS distro to another.

We're in a similar situation to the OP where we are going to generate a lot of data, and rosbag seemed convenient to just capture everything from within ROS, store it, and sort through what we want to extract later. Migration is a bit terrifying on the data volumes we're talking about, and if we pursue cloud storage, pulling data back down to migrate is a huge cost/productivity penalty.

Is there any recommendation on this? For my use case, and maybe food for thought in general:

Thanks for any updates/input.

NikolausDemmel commented 6 years ago

I'm not a maintainer, but as my earlier comment indicates, I agree with the issues you raise.

However, I believe there are really multiple independent things discussed in this issue.

1) Protobuf message serialization. This has advantages, but is quite intrusive and it is unclear if support can be added seamlessly. Not sure how useful it is to a larger audience if everyone would have to adapt their code to work with it.

2) Reading old data from bags after message definitions have changed. Now AFAICT it should be perfectly feasible to do that by changing the rosbag Python and C++ API implementations and also rosbag play to handle this case and apply some migration rules on the fly. The charm of this is that no changes outside rosbag are needed and all other code would "just work". There would have to be some kind of new migration format (independent from the current definitions, which are in any case only used for the offline migration script as far as I know) as discussed in my earlier post. I'm not sure what the best approach would be. It is of course conceivable that protobuf might be used here. Maybe one could make rosbag be backed by protobuf storage and have some automatic conversion magic between ROS messages and protobuf messages. But to take advantage of protobuf's backwards compatibility, you would probably still need some extra rule definitions, since for example message fields in ROS cannot be numbered explicitly like in protobuf. But I guess this can also be perfectly achieved without protobuf.

PS: If I'm not mistaken, I think with the Python API it is already possible to read messages where you don't have the message definition or where they are outdated, since it can take the message definition as stored inside the bag.

jwhendy commented 6 years ago

@NikolausDemmel thanks for the input. Indeed, I'm primarily interested in 2, but would consider 1 if this was a solution to those issues. I debated posting on ROS Answers, but opted for here since it seemed close and this seemed more like an official philosophical question vs. "How do I" question.

> I think with the Python API it is already possible to read messages where you don't have the message definition or where they are outdated

Interesting. I was trying to break a rosbag last night and couldn't do it, but I was using python. I stored a bag, changed the def of my message to add a new field, catkin build, then read the bag and played it. No issues.


It would be nice if there was a reproducible breakage example people could see, as using rosbag without understanding limitations seems like it could have large consequences depending on the scenario. I see the page on migration, but have not seen a concrete list of when this is needed, all the cases where something goes wrong, and a tutorial on fixing (and how long/much computation that would take for very large data quantities).

I can create a separate issue requesting this if that's desired?

mikepurvis commented 6 years ago

Yes, the rosbag Python API generates message definitions on the fly based on what's stored in the bag file itself. This can be a source of headaches as the type of such a message instance is not a match for the regular message type, and you also can't pickle/unpickle those types. You'll also run into trouble (obviously) if you try to play those messages into a binary which uses updated definitions: the MD5s won't match and the subscription will fail.

However, this scheme is what enables bag migration to work, since you can have an active python environment with awareness of both the old and new type. See: http://wiki.ros.org/rosbag/migration
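The MD5 mismatch mentioned above can be illustrated with a small sketch. ROS derives a checksum from the message definition text and compares it when a connection is set up; the function below is a simplification (it only strips comments and whitespace, whereas the real computation also normalizes constants and nested types), and `can_connect` is a hypothetical helper.

```python
import hashlib

def definition_md5(definition_text):
    """MD5 of a message definition, ignoring comments, blank lines and extra spaces.

    Simplified on purpose: the real ROS checksum applies more normalization.
    """
    lines = []
    for line in definition_text.splitlines():
        line = line.split("#", 1)[0].strip()  # naive comment stripping
        if line:
            lines.append(" ".join(line.split()))
    return hashlib.md5("\n".join(lines).encode("utf-8")).hexdigest()

OLD_DEFINITION = """
# pose of the robot
float64 x
float64 y
"""

# Same message after a field was added.
NEW_DEFINITION = OLD_DEFINITION + "float64 heading  # added later\n"

def can_connect(publisher_definition, subscriber_definition):
    """Hypothetical check: both sides must agree on the definition checksum."""
    return definition_md5(publisher_definition) == definition_md5(subscriber_definition)
```

Here `can_connect(OLD_DEFINITION, NEW_DEFINITION)` is false even though the new definition only appends a field, which is why replaying old bag data into a rebuilt subscriber fails without migration.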

mcsheehan commented 5 years ago

In addition to the above comments: proto files support optional fields, so messages stay backward compatible when fields are added in the future. They also generate Python, Java and C++ message definitions, including getters and setters, out of the box.
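Why numbered, optional fields tolerate schema changes can be shown with a toy encoding. This is not the Protobuf wire format; it is a simplified tag-length-value layout (1-byte field number, 4-byte length, payload) with hypothetical schemas, chosen only to show that a reader can skip field numbers it does not know and leave missing ones unset.

```python
import struct

def encode(fields):
    """fields: {field_number: payload bytes} -> tag-length-value bytes."""
    out = b""
    for number, payload in sorted(fields.items()):
        out += struct.pack("<BI", number, len(payload)) + payload
    return out

def decode(data, known):
    """known: {field_number: name}. Unknown field numbers are skipped."""
    msg, offset = {}, 0
    while offset < len(data):
        number, size = struct.unpack_from("<BI", data, offset)
        offset += 5  # 1 byte field number + 4 bytes length
        if number in known:
            msg[known[number]] = data[offset:offset + size]
        offset += size
    return msg

# Hypothetical schemas: the newer one adds field 2.
OLD_SCHEMA = {1: "name"}
NEW_SCHEMA = {1: "name", 2: "mode"}
```

Data written with `NEW_SCHEMA` still decodes under `OLD_SCHEMA` (field 2 is ignored), and data written with `OLD_SCHEMA` decodes under `NEW_SCHEMA` with `mode` simply absent. A ROS message, by contrast, is serialized as fields in definition order with no tags, so a reader cannot tell where an unknown field starts or ends.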

lucasjinreal commented 5 years ago

So, 2 years later, any updates on this issue?

huiyi1990 commented 5 years ago

This truly bothers me a lot. One thing we can do is add a string-type field to hold the serialized message.
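A rough sketch of that workaround, assuming the payload comes from some external serializer (for Protobuf in Python that would be `SerializeToString()`); plain bytes stand in for it here. The envelope layout with `schema` and `data` keys is hypothetical, and Base64 is used so that arbitrary bytes survive in a text field.

```python
import base64
import json

def wrap(payload, schema):
    """Pack serialized bytes into text that fits a ROS string field."""
    return json.dumps({
        "schema": schema,  # hypothetical: name of the payload's schema
        "data": base64.b64encode(payload).decode("ascii"),
    })

def unwrap(text):
    """Recover the schema name and the original bytes from the string field."""
    envelope = json.loads(text)
    return envelope["schema"], base64.b64decode(envelope["data"])
```

The ROS-level message definition never changes with this approach, so old bags stay readable, but the content is opaque to tools such as rostopic echo, and Base64 adds roughly a third to the payload size.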

GitBubble commented 4 years ago

Protobuf messaging has been adopted by NVIDIA and other open source middleware vendors. I think the key issue has already been mentioned in the comments above: we need a tool or some kind of bridge.

jviotti commented 1 week ago

Old thread, but just wanted to mention I'm personally working on the problem of space-efficient data transfer in robotics at Sourcemeta (https://www.sourcemeta.com) for saving on 5G/satellite costs.

We are working hard getting JSON BinPack to production, a binary serialization format that can be up to 74% more space-efficient than Protocol Buffers (see a benchmark here: https://arxiv.org/abs/2211.12799), while supporting both schema-less (like MessagePack/CBOR) and schema-driven (with JSON Schema) forms.

We are actively seeking feedback and alpha users from the ROS community, so please reach out! 🙏🏻