Display camera images in Meldis

jwnimmer-tri commented 1 year ago

Meldis should subscribe to lcmt_image_array and display the contained camera images.

The goal would be to match what we did previously with show_image.py in drake-visualizer.

I'm not sure whether it should open its own (Python-native) window, or if it should try to serve a video stream to the browser.

Ideally the architecture would support using the same code along with Meshcat as well in an online notebook, but the victory condition here is at least getting Meldis working well.

jwnimmer-tri commented 1 year ago

To stream images to the browser, I see two options:

(1) multipart/x-mixed-replace

See https://towardsdatascience.com/video-streaming-in-web-browsers-with-opencv-flask-93a38846fe00 for a write-up.

It might be difficult to know whether the browser is keeping up with the content (so that Meldis could respond by skipping frames). I'm not 100% sure how the buffering works here.

On the plus side, it would be plausible to use usockets to do this natively in Meshcat. Probably we'd still prototype it first with Flask just to get the UX locked down, then backport to C++ later on.
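For reference, here is a rough sketch of what the wire format for option (1) looks like, using only the Python standard library. The boundary token and the function names (`multipart_header`, `encode_part`, `stream_parts`) are made up for illustration; this is not code from Drake, Meldis, or Flask.

```python
from typing import Iterable, Iterator

# Arbitrary boundary token chosen for this sketch; any token that does not
# collide with the part headers would work.
BOUNDARY = b"frame"


def multipart_header() -> str:
    """Returns the Content-Type value for the HTTP response as a whole."""
    return "multipart/x-mixed-replace; boundary=" + BOUNDARY.decode("ascii")


def encode_part(image_bytes: bytes, content_type: str = "image/png") -> bytes:
    """Wraps one already-encoded image as a single part of the response body.

    Each new part replaces the previous one in the browser, which is what
    makes a plain <img> tag behave like a video stream.
    """
    return (
        b"--" + BOUNDARY + b"\r\n"
        + b"Content-Type: " + content_type.encode("ascii") + b"\r\n"
        + b"Content-Length: " + str(len(image_bytes)).encode("ascii") + b"\r\n"
        + b"\r\n"
        + image_bytes + b"\r\n"
    )


def stream_parts(images: Iterable[bytes]) -> Iterator[bytes]:
    """Lazily yields one multipart chunk per encoded image."""
    for image in images:
        yield encode_part(image)
```

In a Flask prototype, a generator like `stream_parts` would presumably be passed as the response body, with `multipart_header()` as the response's content type. Note that nothing here tells the server whether the browser has consumed earlier parts, which is the buffering concern above.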

(2) WebRTC

In Python via aiortc (example here). Ubuntu 22.04 has packages for it already. (It uses native code to run fast.)

This actually has a concept of timestamps so we should have the ability to drop frames.

It would be Python-only. I'm not sure a C++ RTC stack would be worthwhile.
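To illustrate why timestamps matter for dropping frames, here is a minimal sketch of the idea in plain Python. It does not use aiortc or any WebRTC API; `TimedFrame`, `pick_frame`, and the `max_age` threshold are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TimedFrame:
    """An encoded image plus the time (in seconds) at which it was captured."""
    timestamp: float
    data: bytes


def pick_frame(
    pending: Sequence[TimedFrame], now: float, max_age: float
) -> Optional[TimedFrame]:
    """Returns the newest pending frame, dropping all older ones.

    Returns None when there is nothing pending, or when even the newest
    frame is more than `max_age` seconds old and so not worth sending.
    """
    if not pending:
        return None
    newest = max(pending, key=lambda frame: frame.timestamp)
    if now - newest.timestamp > max_age:
        return None
    return newest
```

Without timestamps (as in option 1), the sender has no basis for a staleness check like this one and can only throttle blindly.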

jwnimmer-tri commented 1 year ago

To summarize some recent f2f discussions:

Drake does not aim to be a full-fledged robotics GUI. There are other and better tools in the ecosystem to meet that need (e.g., https://foxglove.dev/studio, RViz, etc.).
What we do aim to build into Drake is a minimal set of visualization features that allows basic debugging even in the absence of those tools.
Under that premise, the goal here will be a simple Python class, importable from pydrake.visualization, that serves LCM images up as an HTTP stream (or streams). We'll use option (1), HTTP multipart. We might need an option to cap the refresh rate in case bufferbloat becomes a problem.
Meldis will instantiate and manage this class, so that image visualization is ready to go out of the box.
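As a sketch of how the refresh-rate cap might work, the class below keeps only the latest image per LCM channel and releases it no faster than a configured rate. The class name `LatestImageBuffer`, its methods, and the `max_fps` parameter are all assumptions for illustration; none of this is an existing pydrake API, and the LCM subscription and HTTP serving are deliberately left out.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LatestImageBuffer:
    """Holds the newest encoded image per channel, with a cap on send rate."""

    def __init__(self, max_fps: float,
                 clock: Callable[[], float] = time.monotonic):
        if max_fps <= 0:
            raise ValueError("max_fps must be positive")
        self._min_interval = 1.0 / max_fps
        self._clock = clock
        # channel -> (sequence number, encoded image bytes)
        self._latest: Dict[str, Tuple[int, bytes]] = {}
        # channel -> (sequence number last sent, time it was sent)
        self._sent: Dict[str, Tuple[int, float]] = {}

    def on_image(self, channel: str, data: bytes) -> None:
        """Records a new image, overwriting any unsent one (frame skipping)."""
        seq = self._latest.get(channel, (0, b""))[0] + 1
        self._latest[channel] = (seq, data)

    def poll(self, channel: str) -> Optional[bytes]:
        """Returns the newest image if it is unsent and the cap allows it."""
        if channel not in self._latest:
            return None
        seq, data = self._latest[channel]
        now = self._clock()
        sent = self._sent.get(channel)
        if sent is not None:
            last_seq, last_time = sent
            if seq == last_seq or now - last_time < self._min_interval:
                return None
        self._sent[channel] = (seq, now)
        return data
```

The HTTP streaming loop would call `poll()` and emit a multipart chunk whenever it returns data. Because `on_image()` overwrites rather than queues, a slow consumer sees skipped frames instead of a growing backlog.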

To get us started, @zachfang will try to create a simple, standalone program (using Flask) that serves up the HTTP multipart stream from an LCM subscriber. We'll play with that prototype and then decide next steps.

To generate images, we can just call model_visualizer and turn on the rgbd sensor. It already broadcasts the LCM images.

jwnimmer-tri commented 1 year ago

From f2f: the goal for the first PR will be to land the lcm_image_viewer program (from #19963) on its own, without any Meldis integration. This will still be super useful for users, and is a good stepping stone for full Meldis integration.

jwnimmer-tri commented 10 months ago

I've decided to take my own swing at a prototype, starting with #20482 and doing all of the LCM->PNG handling in C++.

RobotLocomotion / drake

Display camera images in Meldis #18862