RobotLocomotion / director

A robotics interface and visualization framework, with extensive applications for working with http://drake.mit.edu
BSD 3-Clause "New" or "Revised" License
178 stars 86 forks source link

ddLCMSubscriber may not preserve (channel, message) ordering if messages are published close together #556

Open EricCousineau-TRI opened 7 years ago

EricCousineau-TRI commented 7 years ago

I was writing a quick modification to Siyuan's show_frames.py script to accept multiple channels (as what the drakevisualizer.py in director does) by subscribing to 'DRAKE_DRAW_FRAMES.*' (with callbackNeedsChannel = True).

However, I noticed that occasionally the channel name and the message data would sometime get mixed. If I added sleep for about ~100ms in between publishing on the different channels,

I will briefly try to reproduce this in one or two independent scripts.

As a workaround, I will try to add a command-line argument (or something else) to register each subscriber independently, rather than relying on a single subscriber with multiple channel names.

EricCousineau-TRI commented 7 years ago

Added some reproduction scripts (see repro.sh for running): https://github.com/EricCousineau-TRI/repro/tree/13fa51b/bug/director_lcm_sub

Here's the portion of output.txt that effectively captures the issue:

+ ./pub.py
...
pub 1
channel: DRAKE_DRAW_FRAMES_A
  name: b
timestamp: 1
...
+ ./pub.py --do_sleep
...
pub 1
channel: DRAKE_DRAW_FRAMES_A
  name: a
  timestamp: 1
channel: DRAKE_DRAW_FRAMES_B
  name: b
  timestamp: 1
patmarion commented 7 years ago

Yes this is a known issue, thanks for reporting. I can't remember if there is already an open issue. I have run into the exact problem when implementing a ros tf visualization that subscribed to * channel but I implemented a workaround, based on independent subscribers as you suggested. But the issue should just be fixed in ddLCMSubscriber implementation ideally.

EricCousineau-TRI commented 7 years ago

Sounds good! I will do the workaround for the time being, and possibly update the underlying subscriber if it comes to that point. Thanks!

patmarion commented 7 years ago

I think if you can also workaround the issue by doing this:

sub.setNotifyAllMessagesEnabled(True)

For very high frequency messages this may have performance issues, but most of the time flag is fine. Docs: https://github.com/RobotLocomotion/director/blob/master/src/app/ddLCMSubscriber.h#L77

patmarion commented 7 years ago

I think a reasonable fix in ddLCMSubscriber would change mLastMessage from a QByteArray to a QMap<QString, QByteArray> and the function signature QByteArray getNextMessage(int timeout) would be modified to QByteArray getNextMessage(int timeout, const QString& channel=QString()).

EricCousineau-TRI commented 7 years ago

I think if you can also workaround the issue by doing this:

Yup, that did the trick! Is #446 the related issue you were mentioning?

patmarion commented 7 years ago

That's related but not the same as what you reported. I guess it's never been posted in Github issues. I found where it was previously encountered though, and maybe we just discussed it on Slack:

https://github.com/RobotLocomotion/director/blob/master/src/python/director/treeviewer.py#L426