rerun-io / rerun

Visualize streams of multimodal data. Free, fast, easy to use, and simple to integrate. Built in Rust.
https://rerun.io/
Apache License 2.0

Support extraction of rrd file content into folders/images/json #6011

Open rgolovanov opened 7 months ago

rgolovanov commented 7 months ago

Currently the only way to see the data stored in an rrd file is to open it in the Viewer. It would be useful to be able to extract the content into raw individual files.

nbkn865 commented 1 month ago

Hello, I also am trying to extract data from an .rrd file, specifically RGB and other images. Is it true that the SDK API doesn't have this functionality yet? If so, what is the status of this user request?

Wumpf commented 1 month ago

@nbkn865 There's ongoing work to expose APIs to access the datastore/loaded-rrd files which will land in the next release!

Specifically, for anything logged as an image or a blob, the viewer already has an option to save it when selected (look for "save blob").

Wumpf commented 3 weeks ago

0.19 now supports reading out the data store, which makes it possible to extract the data in code and store it in different files as needed. What's still missing is offering this out of the box.

That said, it's a bit open-ended at this point, since there's of course no clear definition of "generate files from this rrd". @rgolovanov, can you describe the concrete use cases you had in mind?

I'd like to split this up into concrete files that should be extractable from either the UI or code (:
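As a rough illustration of the "store it into different files" step, here is a minimal sketch using only the Python standard library. It assumes the blobs have already been read out of the recording as byte sequences, each paired with a media type string. The helper name `write_blobs`, the `EXTENSIONS` mapping, and the file naming scheme are hypothetical and not part of the Rerun SDK.

```python
from pathlib import Path

# Hypothetical mapping from media type to file extension; extend as needed.
EXTENSIONS = {"image/jpeg": ".jpg", "image/png": ".png"}


def write_blobs(blobs, media_types, out_dir):
    """Write each blob to out_dir as its own file and return the paths.

    `blobs` is an iterable of bytes-like objects or lists of ints (0..255),
    `media_types` is an iterable of media type strings of the same length.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, (blob, media_type) in enumerate(zip(blobs, media_types)):
        # Unknown media types fall back to a generic extension.
        ext = EXTENSIONS.get(media_type, ".bin")
        path = out / f"{i:06d}{ext}"
        path.write_bytes(bytes(blob))
        paths.append(path)
    return paths
```

The files are numbered by position here; naming them after the index value (for example a timestamp) may be more useful, depending on the use case.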

nbkn865 commented 1 week ago

@Wumpf, thanks for replying. I indeed am using version 0.19 now with the Data API to extract data from Rerun files, and it works great.

I do have a follow-up question about that, though, where I'm trying to extract images, for example:

import rerun as rr

# load the recording (placeholder path, replace with your own .rrd file)
my_rrd_file = "path/to/recording.rrd"
recording = rr.dataframe.load_recording(my_rrd_file)

image_view = recording.view(
    index="capture_time", contents="/rgb_data"
)
images = image_view.select().read_all()

I see that the images are stored like so in the images table:

/rgb_data:Blob: [[[[255,216,255,224,0,...,164,191,35,255,217]]],[[[255,216,255,224,0,...,212,157,143,255,217]]],...,[[[255,216,255,224,0,...,162,226,177,255,217]]],[[[255,216,255,224,0,...,81,112,71,255,217]]]]
/rgb_data:MediaType: [[["image/jpeg"]],[["image/jpeg"]],...,[["image/jpeg"]],[["image/jpeg"]]]

They're all 1-D lists, and there don't seem to be any image dimensions in the data, so how does the Rerun viewer process those 1-D lists back into an RGB array for display?

I tried searching the documentation and couldn't find anything related to the above question.

Wumpf commented 1 week ago

@nbkn865 Since those are EncodedImage, i.e. JPEG blobs in your case, the size information is part of the JPEG data itself. It's a different story with Image, which is actual "raw" pixel data and always has to be accompanied by an ImageFormat, which contains the size among other things.
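To make the point about JPEG carrying its own size concrete, here is a minimal standard-library sketch that reads width and height from a JPEG byte string by walking its segments until it finds a start-of-frame (SOF) marker. The blobs shown earlier in the thread begin with 255,216 (0xFF 0xD8, the JPEG start-of-image marker) and end with 255,217 (0xFF 0xD9, end-of-image), which matches this layout. The function name `jpeg_size` is hypothetical, and this is not how the Rerun viewer is implemented; in practice an image library would decode the whole blob into a pixel array.

```python
import struct


def jpeg_size(data):
    """Return (width, height) read from the first start-of-frame segment."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker == 0xD9:  # end of image reached without a frame header
            break
        if marker == 0x01 or 0xD0 <= marker <= 0xD8:
            pos += 2  # standalone markers carry no length field
            continue
        (length,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        # SOF markers are 0xC0..0xCF, except DHT (0xC4), JPG (0xC8), DAC (0xCC).
        if 0xC0 <= marker <= 0xCF and marker not in (0xC4, 0xC8, 0xCC):
            if pos + 9 > len(data):
                raise ValueError("truncated start-of-frame segment")
            _precision, height, width = struct.unpack(">BHH", data[pos + 4 : pos + 9])
            return width, height
        pos += 2 + length  # skip marker (2 bytes) plus segment payload
    raise ValueError("no start-of-frame segment found")
```

To use it on the table from the question, each Blob cell would first have to be flattened into a `bytes` object. How best to do that depends on the exact nesting of the column, so that step is left out here.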