rerun-io / rerun

Visualize streams of multimodal data. Free, fast, easy to use, and simple to integrate. Built in Rust.
https://rerun.io/
Apache License 2.0

Add support for HEIF: H264/H265/HEVC compressed images #5815


karanchahal commented 8 months ago

My problem is that I need to visualize multiple image data streams, each of size 1920 x 1200, in the browser over a network connection. The current Rerun support is quite spotty in this regard: it's laggy, and the program kills itself after consuming some amount of RAM.

Describe the solution you'd like
Rerun should support ingesting H.264-encoded image streams, with the viewer automatically decoding and displaying them on the frontend.

Describe alternatives you've considered
Other approaches, like reducing the image size, are not valid solutions.

Additional context
This is a blocker for using this product anywhere we need live robot visualization: teleoperating seamlessly, performing calibration interactively, and so on.

I am doing this right now:

rr.log("image", rr.Image(cv2.resize(cv_image, (640, 480))))

I can see the image, but only at around 10 Hz, which is not acceptable for our applications.

Wumpf commented 8 months ago

The browser version of Rerun is indeed particularly limited today: the viewer tries to keep everything uncompressed in RAM and drops data once a pre-configured memory budget is reached. In the browser this is even less viable than elsewhere, since the memory budget is <4 GB by design. Due to the continuous purging of old data upon hitting the memory limit we should in theory never run out of memory, but I believe this could be the same as an issue we're already tracking (likely to be fixed in 0.16).

Generally, we're hoping to tackle the memory issue soon as part of a larger planned effort, which is how we currently expect to handle video streams and other larger-than-RAM data in the future.

That leaves the really bad performance of ingesting even smaller images on the web, for which I don't have a good answer right now; this is something we'll have to investigate to see what can be done. It's definitely not expected to be that bad, but I also don't think we've done a lot of live-streaming experiments with the web viewer.

I reckon for your use case the native viewer is not an option? Could you (if possible) go into a bit more detail on why the web version of the viewer is what you need here?

karanchahal commented 8 months ago

I think it was bad because I was streaming uncompressed 1920 x 1200 images.

It's much better if I add Image(<>).compress(): around 15 FPS. And if I bump the resolution down to 640 x 480, I see around 30 FPS, which is much better.
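For reference, a minimal sketch of that compressed path (assuming a Rerun Python SDK version where `Image.compress(jpeg_quality=...)` is available; the camera source is a stand-in):

```python
import cv2
import rerun as rr

rr.init("camera_stream", spawn=True)
cap = cv2.VideoCapture(0)  # any camera / stream source

while True:
    ok, bgr = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
    # JPEG-compress before logging: the viewer then keeps the compressed
    # bytes instead of raw RGB, cutting memory use and network traffic.
    rr.log("camera/image", rr.Image(rgb).compress(jpeg_quality=75))
```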

But again, I think adding H.264 support to Rerun's open-source offering would really help convince teams to onboard. For example: take ROS image data, run it through an image transport to encode it as H.264, stream it over the network, and have the Rerun web viewer hook into that H.264 stream (maybe via HLS or WebRTC; I'm not sure what the best method is). There are a lot of applications where you need to visualize what's running live on the robot, not from a ROS bag. Thanks very much!
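For the encoding half of that pipeline, something like PyAV (my pick purely for illustration, not anything Rerun ships) can already turn raw frames into an H.264 stream; the missing pieces are the transport and the viewer-side decoding:

```python
import av          # PyAV (pip install av), Python bindings for ffmpeg
import numpy as np

# Stand-in for frames arriving from e.g. a ROS image topic.
frames = (np.random.randint(0, 255, (1200, 1920, 3), np.uint8) for _ in range(90))

container = av.open("stream.mp4", mode="w")
stream = container.add_stream("h264", rate=30)
stream.width, stream.height = 1920, 1200
stream.pix_fmt = "yuv420p"

for rgb in frames:
    frame = av.VideoFrame.from_ndarray(rgb, format="rgb24")
    for packet in stream.encode(frame):   # encode() yields 0..n packets
        container.mux(packet)

for packet in stream.encode():            # flush the encoder
    container.mux(packet)
container.close()
```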

Although, I do think this is not a trivial amount of work.

I reckon for your use case the native viewer is not an option?

Yes, it is not an option, because viewers want to be able to view robot feeds on completely different devices and not hog system memory on the robot. Plus, it's easier to load it in the browser.

Wumpf commented 7 months ago

able to view robot feeds on completely different devices

Just in case you missed it: the SDK connects to the native viewer over TCP, so the viewer can run on any other device on the network. There's no requirement to have it run on the same device as the SDK.
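In SDK terms that's just the following (a sketch assuming an SDK version where `rr.connect()` accepts a host:port address; the IP is a placeholder):

```python
import rerun as rr

rr.init("robot_cams")
# The viewer runs on a different machine on the network;
# 9876 is the viewer's default TCP port.
rr.connect("<viewer-ip>:9876")
```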

karanchahal commented 7 months ago

Oh! I missed that, thanks for pointing it out. I'll dig through the documentation and try to hook that up instead. Maybe this will relieve the browser limitations for my testing, but I still think the browser is probably the way to go for widespread adoption on mobile devices.

Thanks!

simgt commented 7 months ago

I'm using Rerun to live-stream data from a GStreamer pipeline, and being able to send H.264-encoded frames would be extremely useful in that case. With the latest version, which displays the size of a run, I realised that a 60 MB video can take up to 2.5 GB even though I'm downsizing the frames before sending them as Image; it's a bit of a waste.
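That blow-up is roughly what storing decoded RGB predicts; a back-of-the-envelope check (the bitrate here is an assumed figure, not measured):

```python
# Rough estimate of why a small H.264 file explodes when kept as raw frames.
duration_s = 60e6 * 8 / 5e6          # 60 MB at an assumed ~5 Mbit/s ≈ 96 s
fps = 30
w, h, bytes_per_px = 640, 480, 3     # downsized RGB frames
raw_bytes = duration_s * fps * w * h * bytes_per_px
print(f"{raw_bytes / 1e9:.1f} GB")   # ≈ 2.7 GB, same ballpark as the 2.5 GB observed
```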

Are there any notes on how this could be implemented?

I guess the main downside is that moving through the timeline would not be as instantaneous as now (which is amazing, by the way!).

karanchahal commented 7 months ago

Yeah, that's a good point. I have a script that sends 640 x 480 images to the frontend, and it racks up around the same amount in a minute or so.

houqp commented 3 months ago

We have the same use case: supporting encoded video frames is the main blocker for adoption due to high resource usage; otherwise Rerun checks all the boxes for us.

nikolausWest commented 3 months ago

@simgt, @houqp could you expand a bit on your use cases? We are currently working on supporting video-encoded data but won't support every possible permutation of SDK API, codec, etc. in the first release. Do you already have H.264 (or other codec) encoded frames (i.e. explicit key-frames and B-frames) in memory in your code, or do you have e.g. chunks of encoded video (perhaps even in MP4 containers)? Or do you have image tensors in memory and also want help encoding them to video before logging / sending?

simgt commented 3 months ago

Hi @nikolausWest – my GStreamer pipelines do inference on live cameras and record data for training purposes, so I have both encoded and unencoded frames available, and either would work. They run in the wild on NVIDIA Jetsons or similar boards. From highest to lowest priority: giving already-encoded frames to the SDK would be by far the most efficient in my case.

emilk commented 3 weeks ago

I wonder if we can use ffmpeg for this 🤔

Wumpf commented 3 weeks ago

It's supported starting in ffmpeg 7.0: https://github.com/FFmpeg/FFmpeg/blob/master/Changelog#L79
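For reference, a sketch of what shelling out to the ffmpeg CLI for decoding could look like (an illustration under assumed inputs, not Rerun's actual implementation; the stream path and dimensions are placeholders):

```python
import subprocess

# Decode an H.264 elementary stream to raw RGB frames by piping it
# through the ffmpeg CLI, avoiding any linking against ffmpeg libraries.
W, H = 1920, 1200
with open("stream.h264", "rb") as f:     # hypothetical Annex B stream
    h264_bytes = f.read()

proc = subprocess.run(
    ["ffmpeg", "-f", "h264", "-i", "pipe:0",
     "-f", "rawvideo", "-pix_fmt", "rgb24", "pipe:1"],
    input=h264_bytes, capture_output=True,
)
first_frame = proc.stdout[: W * H * 3]   # raw RGB24 bytes of frame 0
```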