RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu

Meshcat in C++ #13038

Closed: mntan3 closed this issue 3 years ago

mntan3 commented 4 years ago

Issue Description

When visualizing a model with many links, I observed that meshcat slows the simulator down significantly, while drake visualizer does not. If meshcat is in fact simply slower than Drake Visualizer, it would be nice to at least document that.

Example test script and model here to replicate:

https://gist.github.com/mntan3/be02a1c410a0830f2ddb656aaf6403e2

After running the script for 10 seconds, it should print out the simulator rate. I was observing a realtime rate of roughly 0.8 for drake visualizer and 0.6 for meshcat on my computer.
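[Editor's note: for reference, a minimal sketch of how such a rate measurement is typically done in pydrake, not the gist itself; the empty plant is a stand-in for the many-link model, and the visualizer under test would be connected before building.]

```python
from pydrake.multibody.plant import AddMultibodyPlantSceneGraph
from pydrake.systems.analysis import Simulator
from pydrake.systems.framework import DiagramBuilder

builder = DiagramBuilder()
plant, scene_graph = AddMultibodyPlantSceneGraph(builder, time_step=0.001)
# (Load the many-link model and connect the visualizer under test here.)
plant.Finalize()
diagram = builder.Build()

simulator = Simulator(diagram)
simulator.set_target_realtime_rate(1.0)  # ask for real time
simulator.AdvanceTo(10.0)                # simulate 10 seconds
# Reports (simulated time) / (wall-clock time); 1.0 means real time.
print("realtime rate:", simulator.get_actual_realtime_rate())
```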

Initial discussions from slack:

https://drakedevelopers.slack.com/archives/C43KX47A9/p1586538504017800

sean.curtis (2 hours ago): I think you may just be a victim of python loops vs C++ loops.

mntan (2 hours ago): Just to clarify: you're saying that meshcat is written in python and drake_vis is written in c++, so that's why meshcat is going to be slower?

sean.curtis (2 hours ago): Essentially -- there may be other reasons, but that will be one wall you won't be able to get around. And the key, particularly, is the work that has to be done in a Drake System to translate Drake state to be consumed by the visualizer. You might try collecting timing on the publish method of the meshcat visualizer -- easy enough to do in python. I bet most of the time lost is spent right there.

eric.cousineau (1 hour ago): I think this may be good to track in an issue. @mntan Do you feel comfortable porting this to a Drake issue?

eric.cousineau (1 hour ago): My guess is it might be slow due to mesh conversion for sending it over the wire?
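[Editor's note: following sean.curtis's suggestion above, a hedged sketch of timing the publish callback from python; the `DoPublish` method name and the assumption that the visualizer is a pure-python LeafSystem (so instance attributes shadow the class method) are guesses about the current implementation.]

```python
import time

def timed(method):
    """Wrap any callable so each invocation prints its wall-clock cost."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = method(*args, **kwargs)
        print(f"{method.__name__}: {time.perf_counter() - start:.4f} s")
        return result
    return wrapper

# Assuming `viz` is the python MeshcatVisualizer instance:
# viz.DoPublish = timed(viz.DoPublish)
```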

sherm1 commented 4 years ago

Assigning to @mntan3 for further investigation.

RussTedrake commented 4 years ago

@manuelli said he found that the numpy -> msgpack conversion is extremely slow out of the box (in both directions), but that this package implements a much faster alternative. https://pypi.org/project/msgpack-numpy/
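[Editor's note: for context, a minimal sketch of the msgpack-numpy fix mentioned above: `patch()` monkey-patches msgpack so numpy arrays round-trip through their raw buffers instead of per-element conversion.]

```python
import msgpack
import msgpack_numpy
import numpy as np

msgpack_numpy.patch()  # swap in numpy-aware encode/decode hooks

x = np.random.rand(1000, 3)
packed = msgpack.packb(x)            # fast: serializes the raw buffer
unpacked = msgpack.unpackb(packed)   # returns a numpy array
assert np.array_equal(x, unpacked)
```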

fwiw, I think the right solution is probably to write our meshcat visualizer in c++.

cc @xuchenhan-tri

RussTedrake commented 4 years ago

FTR, Erwin has a thin c++ meshcat zmq code here: https://github.com/google-research/tiny-differentiable-simulator/blob/297dc2232780396c7f63cb8be97a5c8490ebc653/examples/meshcat_zmq.h

RussTedrake commented 3 years ago

Planned resolution is to move to c++.

RussTedrake commented 3 years ago

Related to https://github.com/RussTedrake/manipulation/issues/145

RussTedrake commented 3 years ago

fwiw -- i'm planning to spike-test a c++ implementation over the next few days.

RussTedrake commented 3 years ago

I'll leave some notes here to document some of the relevant decisions.

Websockets not ZMQ. meshcat-python uses a separate ZMQ server to relay between python and the browser:

    python Visualizer <=zmq=> zmqserver <=websockets=> browser

I intend to go directly from c++ meshcat to the browser:

    c++ Visualizer <=websockets=> browser

From discussion with @rdeits: the zmq server design was put in place partially to support multiple geometry suppliers (the visualizers) and consumers (the browsers), but also simply to parcel the asyncio complexities away from the python supplier. His Julia meshcat Visualizer just goes straight to the browser via websockets, and he's been recommending that to me for this upgrade. This is especially relevant because I am trying to add new support for gui elements in the meshcat browser sending information back to c++, and the zmqserver in the middle complicates that workflow significantly.

C++ websocket libraries. I've now explored a handful of websocket libraries that we could potentially use in drake c++. This list was helpful. Weighing a number of factors, such as licensing and light dependencies, I ended up looking most closely at:

RussTedrake commented 3 years ago

Basic C++ design is currently:

My current PR strategy is:

1) Meshcat proof of life. Starts the server, demonstrates that clients can connect, and sends just one type of message to show that data can flow. Reviewers can focus on the build system and websocket server details.
2) Bring in the testing framework. Requires new build dependencies, which I want to separate from the original PR.
3) Meshcat full api (set_transform, delete, etc.), still with only a modicum of geometry supported (see the sketch after this list).
4) Basic MeshcatVisualizer implementation (c++ only, no bindings). Importantly, this version will have optional output ports for ui feedback.
5) Add python bindings.

Then we can add more geometry / bells and whistles incrementally.
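[Editor's note: to make steps 3 and 5 concrete, a hypothetical sketch of how the bound api might eventually be exercised; every spelling here (Meshcat, SetObject, SetTransform, Delete) is an assumption about the planned interface, not a committed design.]

```python
from pydrake.geometry import Box, Meshcat, Rgba
from pydrake.math import RigidTransform

meshcat = Meshcat()  # constructing the object starts the websocket server
meshcat.SetObject("/demo/box", Box(0.2, 0.2, 0.2), Rgba(0.8, 0.1, 0.1, 1.0))
meshcat.SetTransform("/demo/box", RigidTransform([0.0, 0.0, 0.5]))
meshcat.Delete("/demo/box")  # remove the path from the scene tree
```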

RussTedrake commented 3 years ago

For a more mature testing strategy, I'm currently trying:

Alternatives could include:

RussTedrake commented 3 years ago

Test strategy update: I've got the following working well with a minimal nodejs setup:

// Test utility for Meshcat that
// 1) Connects a (headless) meshcat Viewer object via websockets to `ws_url`,
// 2) Waits until the Viewer receives `num_messages_to_wait_for` messages 
//    (default: 0),
// 3) Evaluates the string `eval_string`, and
// 4) Exits with return code 0 if the `eval_string` evaluates to `true`,
//    otherwise with return code 1.
//
// Run with `node meshcat_test.js ws_url eval_string [num_messages_to_wait_for]`
// e.g. 
//  node meshcat_test.js 'ws://localhost:7001' \
//    "viewer.scene_tree.find(['Background']).object.visible == true" 3
//
// Requires meshcat, and `npm install jsdom webgl-mock-threejs canvas`.

The full script is here: meshcat_test.js

jwnimmer-tri commented 3 years ago

... adding nodejs + npm libraries into the drake test installation framework.

FYI My rough impression from very quick glances in the past was that node and npm were extremely difficult to make sufficiently hermetic for use in Drake. You might want to de-risk that before walking too far down this path. Maybe https://github.com/bazelbuild/rules_nodejs has already resolved this by now, but I don't think we know that for sure yet.

connecting to the websocket (probably from python) and simply verifying that the message is getting through as expected.

Why is this option not the best answer? We don't acceptance-test the drake-visualizer round trip; we assume that it has its own testing in place, and so within Drake we just check that the messages we are sending it are as desired. That same story seems like it should be plenty sufficient for meshcat as well? If we find that too many bugs are slipping through, we can always upgrade to a headless regression test in the future.
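[Editor's note: a hedged sketch of that message-level check, assuming the python `websockets` package, a meshcat server already listening on ws://localhost:7001, and the msgpack command framing that the meshcat browser client expects.]

```python
import asyncio
import msgpack
import websockets

async def check_first_message(url="ws://localhost:7001"):
    # Connect exactly as a browser client would and read one frame.
    async with websockets.connect(url) as ws:
        raw = await ws.recv()
    command = msgpack.unpackb(raw, raw=False)
    # Meshcat commands are msgpack maps with "type" and "path" fields.
    assert command["type"] in ("set_object", "set_transform", "delete")
    print("received:", command["type"], command.get("path"))

asyncio.run(check_first_message())
```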

RussTedrake commented 3 years ago

Tentative plan for transitioning from python to c++:

Once we feel that feature parity has been reached, we can deprecate the python MeshcatVisualizer. There should be a reasonable way to do this, since the constructors take different arguments: the c++ version will want a Meshcat object passed in, while the python version wants a zmq_url, etc. We also have the fact that the c++ version will offer AddToBuilder (the pattern DrakeVisualizer switched to), while the python version still uses the original Connect*Visualizer spelling.
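[Editor's note: to illustrate the constructor distinction, a sketch contrasting the two spellings; the first variant reflects the existing python visualizer, while the c++-backed spellings are assumptions about the planned bindings.]

```python
from pydrake.multibody.plant import AddMultibodyPlantSceneGraph
from pydrake.systems.framework import DiagramBuilder

builder = DiagramBuilder()
plant, scene_graph = AddMultibodyPlantSceneGraph(builder, time_step=0.0)

# Existing python visualizer: Connect* spelling, configured by zmq_url
# ("new" starts a fresh meshcat server).
from pydrake.systems.meshcat_visualizer import ConnectMeshcatVisualizer
viz = ConnectMeshcatVisualizer(builder, scene_graph, zmq_url="new")

# Planned c++-backed visualizer via bindings (hypothetical spelling):
# construct a Meshcat object first, then pass it to AddToBuilder.
from pydrake.geometry import Meshcat, MeshcatVisualizer
meshcat = Meshcat()
viz = MeshcatVisualizer.AddToBuilder(builder, scene_graph, meshcat)
```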