Meshcat in C++ - Githubissues

mntan3 commented 4 years ago

Issue Description

When visualizing a model with many links, I saw that meshcat slowed down the simulator significantly, while Drake Visualizer did not. It would be nice if this were at least documented, if in fact meshcat is just slower than Drake Visualizer.

Example test script and model here to replicate:

https://gist.github.com/mntan3/be02a1c410a0830f2ddb656aaf6403e2

After running the script for 10 seconds, it should print out the simulator rate. I observed a rate of roughly 0.8 for Drake Visualizer and 0.6 for meshcat on my computer.

Initial discussions from slack:

https://drakedevelopers.slack.com/archives/C43KX47A9/p1586538504017800

sean.curtis 2 hours ago I think you may just be victim of python loops vs C++ loops.

mntan 2 hours ago Just to clarify, so you're saying that meshcat is written in python and drake_vis is written in c++, so that's why meshcat is going to be slower?

sean.curtis 2 hours ago Essentially -- there may be other reasons, but that will be one wall you won't be able to get around. And the key, particularly, is the work that has to be done in a Drake System to translate Drake state to be consumed by the visualizer. You might try collecting timing on the publish method of the mesh cat visualizer -- easy enough to do in python. I bet most of the time lost is spent right there.

eric.cousineau 1 hour ago I think this may be good to track in an issue. @mntan Do you feel comfortable porting this to a Drake issue?

eric.cousineau 1 hour ago My guess is it might be slow due to mesh conversion for sending it over the wire?
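
The timing suggestion from the Slack thread could be tried with something like the following. This is only a minimal sketch using the standard library: `CallTimer` and `FakeVisualizer` are hypothetical names invented for illustration, and the fake `publish` method stands in for whatever publish method the real meshcat visualizer exposes.

```python
import time
from collections import defaultdict


class CallTimer:
    """Accumulates wall-clock time spent inside wrapped callables."""

    def __init__(self):
        self.total_seconds = defaultdict(float)
        self.call_count = defaultdict(int)

    def wrap(self, name, func):
        """Returns a function that calls `func` and records its duration."""
        def timed(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                self.total_seconds[name] += time.perf_counter() - start
                self.call_count[name] += 1
        return timed

    def report(self):
        """Maps each wrapped name to (number of calls, total seconds)."""
        return {name: (self.call_count[name], self.total_seconds[name])
                for name in self.total_seconds}


class FakeVisualizer:
    """Hypothetical stand-in for the visualizer being measured."""

    def publish(self, n):
        # Placeholder workload; the real method would translate Drake state.
        return sum(i * i for i in range(n))


timer = CallTimer()
vis = FakeVisualizer()
# Replace the bound method on this instance with a timed wrapper.
vis.publish = timer.wrap("publish", vis.publish)
for _ in range(5):
    vis.publish(10_000)
calls, seconds = timer.report()["publish"]
print(f"publish: {calls} calls, {seconds:.6f} s total")
```

Comparing the accumulated publish time against the total wall-clock time of the simulation would show whether most of the lost time is in fact spent there.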

sherm1 commented 4 years ago

Assigning to @mntan3 for further investigation.

RussTedrake commented 4 years ago

@manuelli said he found that the numpy -> msgpack conversion is extremely slow out of the box (in both directions), but that this package implements a much faster alternative. https://pypi.org/project/msgpack-numpy/

fwiw, I think the right solution is probably to write our meshcat visualizer in c++.

cc @xuchenhan-tri

RussTedrake commented 4 years ago

FTR, Erwin has a thin c++ meshcat zmq code here: https://github.com/google-research/tiny-differentiable-simulator/blob/297dc2232780396c7f63cb8be97a5c8490ebc653/examples/meshcat_zmq.h

RussTedrake commented 3 years ago

Planned resolution is to move to c++.

RussTedrake commented 3 years ago

fwiw -- i'm planning to spike-test a c++ implementation over the next few days.

RussTedrake commented 3 years ago

I'll leave some notes here to document some of the relevant decisions.

Websockets not ZMQ. meshcat-python uses a separate ZMQ server to relay between python and the browser:

python Visualizer <=zmq=> zmqserver <=websockets=> browser

I intend to go directly from c++ meshcat to the browser:

c++ Visualizer <= websockets => browser

Having discussed with @rdeits, the zmq server design was put in place partially to support multiple geometry suppliers (the visualizers) and consumers (the browsers), but also just to parcel out the asyncio complexities away from the supplier in python. His Julia meshcat Visualizer just goes straight to the browser via websockets, and he's been recommending that to me when I upgrade. This is especially relevant because I am trying to add new support for gui elements in the meshcat browser sending information back to c++, and the zmqserver in the middle complicates that workflow significantly.

C++ websocket libraries. I've now explored a handful of websocket libraries that we could potentially use in drake c++. This list was helpful. Taking into account a number of factors, such as licensing and light dependencies, I ended up looking most closely at:

https://libwebsockets.org/ . This one is available in homebrew and apt, and was trivial to bring into the build system. But even the example code is extremely hard to read and dramatically inconsistent with the style choices we've made in our style guide.
https://github.com/zaphoyd/websocketpp. This looked promising, but the tutorials tapered off mid sentence, and ...
https://github.com/uNetworking/uWebSockets looks like the winner for now. I've got it in the build system and am going to bring up a first version of c++ meshcat using this to start.

RussTedrake commented 3 years ago

Basic C++ design is currently:

Meshcat is a class that plays the role of meshcat.Visualizer in python. It will launch the websocket listener thread and accept set_object, set_transform, etc, calls in the main thread. This will in many ways parallel DrakeLcm. I've put this in drake::geometry, because it will support geometry shapes and depend on geometry methods to load meshes, etc.
MeshcatVisualizer will be a LeafSystem that is analogous to DrakeVisualizer (and will replace the current drake MeshcatVisualizer implemented in python). It will accept a Meshcat object in the constructor, or offer to own one itself. I will put this in drake::geometry next to DrakeVisualizer; that seems like the right place (it does depend on geometry).
I will also have to port the meshcat.geometry objects.
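
The split between main-thread calls and a separate listener thread described above can be illustrated with a small sketch. It is written in Python only to keep it short and runnable, and it is not the Drake implementation: `MeshcatSketch`, the `send` callable, and the message dictionaries are all assumptions made for illustration.

```python
import queue
import threading


class MeshcatSketch:
    """Hypothetical stand-in: main-thread calls are queued for a worker thread."""

    def __init__(self, send):
        # `send` is any callable taking one message; a real implementation
        # would write to the websocket here instead.
        self._send = send
        self._outbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def set_transform(self, path, matrix):
        """Called from the main thread; only enqueues the command."""
        self._outbox.put(
            {"type": "set_transform", "path": path, "matrix": list(matrix)})

    def delete(self, path):
        """Called from the main thread; only enqueues the command."""
        self._outbox.put({"type": "delete", "path": path})

    def close(self):
        """Asks the worker to stop and waits until the queue has drained."""
        self._outbox.put(None)
        self._thread.join()

    def _run(self):
        # Worker loop: forwards queued commands in order until the sentinel.
        while True:
            message = self._outbox.get()
            if message is None:
                break
            self._send(message)


sent = []
meshcat = MeshcatSketch(sent.append)
identity = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
meshcat.set_transform("/drake/box", identity)
meshcat.delete("/drake/box")
meshcat.close()
print([m["type"] for m in sent])
```

A queue keeps the public methods free of any networking details, so callers never block on the socket.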

My current PR strategy is:
1) Meshcat proof of life. Starts the server, demonstrates that clients can connect, and just sends one type of message to show that data can flow. Reviewers can focus on the build system and websocket server details.
2) Bring in testing framework. Requires new build dependencies, which I want to separate from the original PR.
3) Meshcat full api (set_transform, delete, etc). Still with only a modicum of geometry supported.
4) Basic MeshcatVisualizer (c++ only, no bindings) implementation. Importantly, this version will have optional output ports for ui feedback.
5) Add python bindings.

Then we can add more geometry / bells and whistles incrementally.

RussTedrake commented 3 years ago

For a more mature testing strategy, I'm currently trying:

Loading meshcat.Viewer() directly in node.js, so that I can provide a test utility that connects to my C++ websocket server and checks for certain conditions to be met (e.g. that set_property('/Background', 'visible', false) has the desired result). This seems the ideal in terms of verifying correctness. It has some immediate challenges in terms of getting meshcat to run headless, which I'm slowly bashing through, and then adding nodejs + npm libraries into the drake test installation framework.

Alternatives could include:

connecting to the websocket (probably from python) and simply verifying that the message is getting through as expected.
testing via headless chrome / chromium (e.g. using puppeteer, which I've used before).

RussTedrake commented 3 years ago

Test strategy update: I've got the following working well with a minimal nodejs setup:

// Test utility for Meshcat that
// 1) Connects a (headless) meshcat Viewer object via websockets to `ws_url`,
// 2) Waits until the Viewer receives `num_messages_to_wait_for` messages 
//    (default: 0),
// 3) Evaluates the string `eval_string`.
// 4) Exits with return code 0 if the `eval_string` evaluates to `true`,
//    otherwise with return code 1.
//
// Run with `node meshcat_test.js ws_url eval_string [num_messages_to_wait_for]`
// e.g. 
//  node meshcat_test.js 'ws://localhost:7001' \
//    "viewer.scene_tree.find(['Background']).object.visible == true" 3
//
// Requires meshcat, and `npm install jsdom webgl-mock-threejs canvas`.

The full script is here: meshcat_test.js
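
Since the utility reports its result only through its return code, a test harness could drive it roughly as follows. This is a sketch under assumptions: `build_command` and `viewer_check_passes` are hypothetical helper names, `node` is assumed to be on the PATH, and the script is assumed to be in the working directory. The argument order follows the usage line in the comment above.

```python
import subprocess


def build_command(ws_url, eval_string, num_messages_to_wait_for=0,
                  script="meshcat_test.js"):
    """Builds the argv list for the node test utility."""
    return ["node", script, ws_url, eval_string,
            str(num_messages_to_wait_for)]


def viewer_check_passes(ws_url, eval_string, num_messages_to_wait_for=0,
                        timeout_seconds=30.0):
    """Runs the utility; returns True only if it exits with return code 0."""
    completed = subprocess.run(
        build_command(ws_url, eval_string, num_messages_to_wait_for),
        timeout=timeout_seconds,
        check=False,  # A non-zero exit code is a test result, not an error.
    )
    return completed.returncode == 0


# Only prints the command here; calling viewer_check_passes() would need a
# running websocket server plus node and the npm packages listed above.
print(build_command(
    "ws://localhost:7001",
    "viewer.scene_tree.find(['Background']).object.visible == true",
    3,
))
```

Passing the arguments as a list avoids shell quoting problems with the JavaScript expression in `eval_string`.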

jwnimmer-tri commented 3 years ago

> ... adding nodejs + npm libraries into the drake test installation framework.

FYI My rough impression from very quick glances in the past was that node and npm were extremely difficult to make sufficiently hermetic for use in Drake. You might want to de-risk that before walking too far down this path. Maybe https://github.com/bazelbuild/rules_nodejs has already resolved this by now, but I don't think we know for sure yet.

> connecting to the websocket (probably from python) and simply verifying that the message is getting through as expected.

Why is this option not the best answer? We don't acceptance test drake-visualizer round trip, we assume that it has its own testing in place, and so within Drake we just check that the messages we are sending it are as desired. That same story seems like it should be plenty sufficient for meshcat as well? If we find that too many bugs are slipping through, we can always upgrade to a headless regression test in the future.
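
A message-level check of the kind argued for here might look like the sketch below. It makes several assumptions: `RecordingSocket` and `publish_set_property` are hypothetical names, the message fields are illustrative rather than a confirmed wire format, and JSON stands in for the real encoding so the example needs only the standard library.

```python
import json


class RecordingSocket:
    """Fake websocket that records every frame instead of sending it."""

    def __init__(self):
        self.frames = []

    def send(self, frame):
        self.frames.append(frame)


def publish_set_property(socket, path, prop, value):
    """Hypothetical sender under test: encodes one command and sends it."""
    message = {"type": "set_property", "path": path,
               "property": prop, "value": value}
    # JSON is a stand-in encoding chosen for this sketch only.
    socket.send(json.dumps(message).encode("utf-8"))


def check_background_visibility():
    """Verifies the outgoing message without involving any viewer."""
    socket = RecordingSocket()
    publish_set_property(socket, "/Background", "visible", False)
    assert len(socket.frames) == 1
    decoded = json.loads(socket.frames[0].decode("utf-8"))
    assert decoded == {"type": "set_property", "path": "/Background",
                       "property": "visible", "value": False}
    return decoded


decoded_message = check_background_visibility()
print(decoded_message["type"])
```

The check stops at the message boundary, which matches how the drake-visualizer messages are described as being tested.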

RussTedrake commented 3 years ago

Tentative plan for transitioning from python to c++:

bind drake::geometry::Meshcat to pydrake.geometry.Meshcat. There is no conflict here.
bind drake::geometry::MeshcatVisualizer to pydrake.geometry.MeshcatVisualizerCpp to avoid any naming conflict, even with the different package paths; it also forces the user to realize that they are getting a different object, which has a slightly different workflow (no zmqserver, etc).
leave pydrake.systems.meshcat_visualizer.MeshcatVisualizer mostly alone for now, until the cpp version achieves feature parity. Probably after I get through porting my notes in this fall semester would be a natural time for the next phase.

Once we feel that feature parity has been reached, we can deprecate the python MeshcatVisualizer; there should be a reasonable way to do this since the constructors will take different arguments: the c++ version will want a Meshcat object passed in, the python version wants a zmq_url, etc. Also, we have the fact that the c++ version will offer AddToBuilder, which DrakeVisualizer switched to, and the python version still uses the original Connect*Visualizer spelling.

RobotLocomotion / drake

Meshcat in C++ #13038
