RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu

Display camera images in Meldis #18862

Open jwnimmer-tri opened 1 year ago

jwnimmer-tri commented 1 year ago

Meldis should subscribe to lcmt_image_array and display the contained camera images.

The goal would be to match what we did previously with show_image.py in drake-visualizer.

I'm not sure whether it should open its own (Python-native) window, or if it should try to serve a video stream to the browser.

Ideally the architecture would support using the same code along with Meshcat as well in an online notebook, but the victory condition here is at least getting Meldis working well.

jwnimmer-tri commented 1 year ago

To stream images to the browser, I see two options:


(1) multipart/x-mixed-replace

See https://towardsdatascience.com/video-streaming-in-web-browsers-with-opencv-flask-93a38846fe00 for a write-up.

It might be difficult to tell whether the browser is keeping up with the content (so that Meldis could respond by skipping frames). I'm not 100% sure how the buffering works here.

On the plus side, it would be plausible to use uSockets to do this natively in Meshcat. Probably we'd still prototype it first with Flask just to get the UX locked down, then backport to C++ later on.

See also: https://blog.miguelgrinberg.com/post/video-streaming-with-flask https://blog.miguelgrinberg.com/post/flask-video-streaming-revisited
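
As a rough illustration of option (1), here is a minimal sketch that uses only the Python standard library rather than Flask. The names `LatestFrame`, `multipart_chunk`, `make_handler`, and the boundary string `frame` are all hypothetical. It keeps only the newest image, so a slow client skips frames on the server side. It does not settle how the browser itself buffers.

```python
import threading
from http.server import BaseHTTPRequestHandler

BOUNDARY = b"frame"  # hypothetical boundary string, chosen for illustration


class LatestFrame:
    """Holds only the newest encoded image; older ones are overwritten."""

    def __init__(self):
        self._cond = threading.Condition()
        self._data = None
        self._seq = 0

    def put(self, data: bytes) -> None:
        with self._cond:
            self._data = data
            self._seq += 1
            self._cond.notify_all()

    def wait_newer(self, last_seq: int, timeout: float = None):
        """Return (seq, data) once a frame newer than last_seq exists.

        Returns None if the timeout expires first.
        """
        with self._cond:
            if not self._cond.wait_for(lambda: self._seq > last_seq, timeout):
                return None
            return self._seq, self._data


def multipart_chunk(image: bytes, content_type: bytes = b"image/png") -> bytes:
    """Wrap one encoded image as a single multipart/x-mixed-replace part."""
    return b"".join([
        b"--", BOUNDARY, b"\r\n",
        b"Content-Type: ", content_type, b"\r\n",
        b"Content-Length: ", str(len(image)).encode(), b"\r\n\r\n",
        image, b"\r\n",
    ])


def make_handler(frames: LatestFrame):
    """Build a request handler class that streams from the given frame slot."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header(
                "Content-Type",
                "multipart/x-mixed-replace; boundary=" + BOUNDARY.decode())
            self.end_headers()
            seq = 0
            try:
                while True:
                    result = frames.wait_newer(seq, timeout=1.0)
                    if result is None:
                        continue  # no new image yet; keep the connection open
                    # Any frames published while the previous write was
                    # blocked were overwritten, so they are skipped here.
                    seq, data = result
                    self.wfile.write(multipart_chunk(data))
            except (BrokenPipeError, ConnectionResetError):
                pass  # the client went away

    return Handler
```

To try it, something like `ThreadingHTTPServer(("localhost", 8000), make_handler(frames)).serve_forever()` (from `http.server`) would serve the stream, with another thread calling `frames.put(png_bytes)` whenever a new image arrives.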


(2) WebRTC

In Python via aiortc (example here). Ubuntu 22.04 has packages for it already. (It uses native code to run fast.)

This actually has a concept of timestamps so we should have the ability to drop frames.

It would be Python-only. I'm not sure a C++ RTC stack would be worthwhile.
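
To make the frame-dropping idea concrete, here is a small sketch of timestamp-based dropping in plain Python. It is not aiortc's API: the `Frame` type, its `timestamp` field, and `next_frame_to_send` are hypothetical names used only to show the policy of sending the newest frame and discarding anything too old.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    timestamp: float  # capture time in seconds (hypothetical field)
    data: bytes


def next_frame_to_send(pending: List[Frame], now: float,
                       max_age: float) -> Optional[Frame]:
    """Empty the pending list and return the newest frame if it is fresh.

    Returns None when there is nothing pending or when even the newest
    frame is older than max_age seconds.
    """
    if not pending:
        return None
    newest = max(pending, key=lambda frame: frame.timestamp)
    pending.clear()  # everything older than the newest frame is dropped
    if now - newest.timestamp > max_age:
        return None
    return newest
```

With the multipart approach the sender has no such timestamp to compare against, which is the difference noted above.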

jwnimmer-tri commented 1 year ago

To summarize some recent f2f discussions:

To get us started, @zachfang will try to create a simple, standalone program (using Flask) that serves up the HTTP multipart stream from an LCM subscriber. We'll play with that prototype and then decide next steps.

To generate images, we can just call model_visualizer and turn on the rgbd sensor. It already broadcasts the LCM images.

jwnimmer-tri commented 1 year ago

From f2f: the goal for the first PR will be to land the lcm_image_viewer program (from #19963) on its own, without any Meldis integration. This will still be super useful for users, and is a good stepping stone for full Meldis integration.

jwnimmer-tri commented 10 months ago

I've decided to take my own swing at a prototype, starting with #20482 and doing all of the LCM->PNG handling in C++.