Open jwnimmer-tri opened 1 year ago
To stream images to the browser, I see two options:
(1) multipart/x-mixed-replace
See https://towardsdatascience.com/video-streaming-in-web-browsers-with-opencv-flask-93a38846fe00 for a write-up.
It might be difficult to know whether the browser is keeping up with the content (and have meldis respond by skipping frames). I'm not 100% sure how the buffering works here.
On the plus side, it would be plausible to use usockets to do this natively in Meshcat. Probably we'd still prototype it first with Flask just to get the UX locked down, then backport to C++ later on.
See also: https://blog.miguelgrinberg.com/post/video-streaming-with-flask https://blog.miguelgrinberg.com/post/flask-video-streaming-revisited
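For concreteness, here is a minimal sketch of the framing that option (1) relies on: each encoded image is sent as one part of a `multipart/x-mixed-replace` body, and the browser replaces the displayed image as each part arrives. The function names and the `frame` boundary token are assumptions for illustration only, not anything from Drake or Meldis.

```python
from typing import Iterable, Iterator

# Hypothetical boundary token; any string that cannot appear in the
# part headers works, as long as the Content-Type header advertises it.
BOUNDARY = "frame"


def content_type(boundary: str = BOUNDARY) -> str:
    """Return the Content-Type value for the overall HTTP response."""
    return f"multipart/x-mixed-replace; boundary={boundary}"


def multipart_chunks(png_frames: Iterable[bytes],
                     boundary: str = BOUNDARY) -> Iterator[bytes]:
    """Wrap each already-encoded PNG as one part of the multipart body."""
    for png in png_frames:
        header = (
            f"--{boundary}\r\n"
            "Content-Type: image/png\r\n"
            f"Content-Length: {len(png)}\r\n"
            "\r\n"
        ).encode("ascii")
        # Each part ends with CRLF before the next boundary line.
        yield header + png + b"\r\n"
```

In a Flask prototype, a generator like this could be handed to `flask.Response(multipart_chunks(frames), mimetype=content_type())`, which is the pattern the linked write-ups use. Note that nothing here tells the server whether the browser is keeping up.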
(2) WebRTC
In Python via aiortc (example here). Ubuntu 22.04 has packages for it already. (It uses native code to run fast.)
This actually has a concept of timestamps so we should have the ability to drop frames.
It would be Python-only. I'm not sure a C++ RTC stack would be worthwhile.
To summarize some recent f2f discussions:
pydrake.visualization that serves LCM images up as an http stream (or streams). We'll use option (1) to do http multipart. We might need an option to cap the refresh rate in case bufferbloat becomes a problem.
To get us started, @zachfang will try to create a simple, standalone program (using Flask) that serves up the http multipart from an LCM subscriber. We'll play with that prototype and then decide next steps.
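One way the refresh-rate cap could work is sketched below, assuming the LCM subscriber thread and the HTTP streaming loop share a single "latest frame" slot. The class name, the `max_fps` parameter, and the injectable `clock` are all hypothetical, and this only limits the send rate; it does not detect whether the browser is actually keeping up.

```python
import threading
import time
from typing import Callable, Optional


class LatestFrameSlot:
    """Holds only the newest frame; unsent older frames are overwritten."""

    def __init__(self, max_fps: float,
                 clock: Callable[[], float] = time.monotonic):
        self._min_interval = 1.0 / max_fps
        self._clock = clock
        self._lock = threading.Lock()
        self._frame: Optional[bytes] = None
        self._last_sent: Optional[float] = None

    def put(self, frame: bytes) -> None:
        """Called by the producer (e.g. an LCM handler) for every new image."""
        with self._lock:
            self._frame = frame  # Any frame still pending is dropped.

    def take(self) -> Optional[bytes]:
        """Return the newest frame, or None if nothing is pending or the
        rate cap has not yet elapsed since the last frame was taken."""
        now = self._clock()
        with self._lock:
            if self._frame is None:
                return None
            if (self._last_sent is not None
                    and now - self._last_sent < self._min_interval):
                return None
            frame, self._frame = self._frame, None
            self._last_sent = now
            return frame
```

The streaming loop would poll `take()` and emit a multipart part only when it returns a frame, so a fast publisher cannot queue up a backlog behind a slow connection.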
To generate images, we can just call model_visualizer and turn on the rgbd sensor. It already broadcasts the LCM images.
From f2f: the goal for the first PR will be to land the lcm_image_viewer program (from #19963) on its own, without any Meldis integration. This will still be super useful for users, and is a good stepping stone for full Meldis integration.
I've decided to take my own swing at a prototype, starting with #20482 and doing all of the LCM->PNG handling in C++.
Meldis should subscribe to lcmt_image_array and display the contained camera images.
The goal would be to match what we did previously with show_image.py in drake-visualizer.
I'm not sure whether it should open its own (Python-native) window, or if it should try to serve a video stream to the browser.
Ideally the architecture would support using the same code along with Meshcat as well in an online notebook, but the victory condition here is at least getting Meldis working well.