IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Saving the raw depth data into avi or any video format #3665

Closed madanmch closed 5 years ago

madanmch commented 5 years ago

Required Info
Camera Model D400
Firmware Version (Open RealSense Viewer --> Click info)
Operating System & Version Linux (Ubuntu 16)
Kernel Version (Linux Only) (e.g. 4.14.13)
Platform PC/Raspberry Pi/ NVIDIA Jetson / etc..
SDK Version { legacy / 2.<?>.<?> }
Language {C/C#/labview/nodejs/opencv/pcl/python/unity }
Segment {Robot/Smartphone/VR/AR/others }

Issue Description

We are trying to store D435 depth data for a few hours of recording. We are able to store both the color frames and the depth data (as OpenCV Mat) in AVI format. The problem is that the depth data occupies a few gigabytes for 30 minutes of recording, while the color data is only on the order of megabytes.

To save storage space, we tried applying OpenCV's applyColorMap to the depth data and storing the result in an AVI video file (we chose AVI, but we are not particular about the format; anything is fine for us). However, when we applied a reverse-colormap algorithm to the stored AVI file, our program crashed, possibly because the BGR data read back from the video does not match what was written.

We are trying to use this link for applying the reverse colormap.

Has anyone tried to store depth data with less storage (without using the bag or CSV file formats)? Are there any examples we can look at?

dorodnic commented 5 years ago

Hi @madanmch This is obviously your decision, but I would strongly advise against this approach. Even if you reverse the color map, any conventional video codec assumes it is safe to interpolate pixel values, and interpolating between the depth of an object and the depth of its background will introduce significant artifacts... For bag recording we do apply LZ4 lossless compression, hence the relatively large file size. You could try decimating depth to 8 bits (instead of the default 16, which are not fully used) and then manually applying some sort of lossless compression.
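For what it's worth, a minimal sketch of this suggestion in Python, assuming the 16-bit depth frame is already available as a NumPy array (simulated here with random data; `MAX_RANGE_MM` is a hypothetical working range you would tune for your scene):

```python
import numpy as np

MAX_RANGE_MM = 4000  # hypothetical working range in device units (mm)

def decimate_to_8bit(depth16, max_range_mm=MAX_RANGE_MM):
    """Map 16-bit depth values to uint8 over [0, max_range_mm]."""
    clipped = np.clip(depth16, 0, max_range_mm)
    return (clipped.astype(np.float32) * 255.0 / max_range_mm).astype(np.uint8)

# Simulated 16-bit depth frame; in real code this would be
# np.asanyarray(depth_frame.get_data()) from pyrealsense2.
depth16 = np.random.randint(0, MAX_RANGE_MM, size=(480, 640), dtype=np.uint16)
depth8 = decimate_to_8bit(depth16)

# np.savez_compressed applies lossless (zlib) compression, so no codec
# interpolation artifacts are introduced on the way to disk.
np.savez_compressed("depth_frames.npz", depth8=depth8)
```

Quantizing this way trades precision for size; the lossless compression step is what avoids the interpolation artifacts a video codec would introduce.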

jkenney9a commented 5 years ago

Hi @madanmch

I'm actually working on something similar and have had some success, though I'm not entirely happy with it just yet. Nonetheless, I'll post what has worked for me. I too have been taking a recorded bag file and breaking it into its depth/color images. I also want the actual distance data at each pixel. I've used code from here to put things together.

The best I've found so far is to get the aligned depth data, convert it to an 8-bit integer, and append it to an updating npy file:

```python
depth = np.asanyarray(depth_frame_aligned.get_data()).astype('uint8')
np.save(f_handle, depth)
```

I do this at the same time as saving the color and depth images into separate compressed video files. I didn't want to lose depth info (per what @dorodnic said above). However, this is not, strictly speaking, the depth info; you need to multiply it by a scaling factor:

```python
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()
```

The problem is, to do this everything needs to be converted to a float, which makes the file sizes a lot bigger. My thinking is to save the data as an 8-bit integer, and then when I go to process it later, I'll multiply it by the scaling factor as needed.

This still results in a fairly big file size. I'm currently exploring ways to store this as an HDF5 file or compress it in some way, so this certainly isn't the last word! If/when I sort this out, I'm happy to share.

RealSenseCustomerSupport commented 5 years ago

Hi madanmch, wondering if you have any update on this one, given what dorodnic and jkenney9a have commented/shared?

Thanks!

madanmch commented 5 years ago

Sorry, I haven't found a solution in this thread. I am not currently storing depth information; I am still waiting for a solution.

jkenney9a commented 5 years ago

So I've had some success on this front by saving the data into an HDF5 file instead of a numpy file; it seems to be a lot smaller. The key is to convert the data to an 8-bit integer and then save the depth conversion factor as metadata in the HDF5 file, so that when I re-access the data I can just multiply the depth information at each pixel by the conversion factor to get the actual depth if I need it. The file sizes of the HDF5 file are much more manageable this way.
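A rough sketch of what this could look like with `h5py`, assuming the 8-bit depth frames are available as NumPy arrays (simulated here) and using a placeholder `depth_scale` in place of the real `get_depth_scale()` call from pyrealsense2:

```python
import numpy as np
import h5py

depth_scale = 0.001  # placeholder; query the sensor in real code
# Simulated 8-bit depth frames; in real code these come from the camera.
frames = [np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)
          for _ in range(10)]

# Write the stacked frames with lossless gzip compression and attach the
# depth conversion factor as an HDF5 attribute (the metadata mentioned above).
with h5py.File("depth_data.h5", "w") as f:
    dset = f.create_dataset("depth", data=np.stack(frames),
                            compression="gzip", compression_opts=4)
    dset.attrs["depth_scale"] = depth_scale

# Later, at analysis time: read back and recover metric depth by
# multiplying the stored integers by the conversion factor.
with h5py.File("depth_data.h5", "r") as f:
    stored = f["depth"][:]
    scale = float(f["depth"].attrs["depth_scale"])
    depth_m = stored.astype(np.float32) * scale
```

Storing the scale once as an attribute, rather than pre-multiplied floats per pixel, is what keeps the file size down.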

I hope that helps!

RealSenseCustomerSupport commented 5 years ago

Thanks for sharing the update, @jkenney9a. Based on what you mentioned, I did some searching on the internet and saw various discussions about the performance of HDF5, npy, etc.

madanmch, wondering if there is any update from you? And here is another approach to think about.

Thanks!

madanmch commented 5 years ago

Thanks @jkenney9a for posting the method to save depth data in 8-bit integer format. I am good with this approach. Closing this issue now.

filipematosinov commented 5 years ago

I am trying to follow this approach, but after converting to int8, how can I get the original precision back (or as close to it as possible)?

jkenney9a commented 5 years ago

I make sure to save the depth conversion factor as metadata in the HDF5 file. Then, to get the actual depth data you're interested in, you multiply the int8 numbers by the depth conversion factor. You may need to convert the data to a float before multiplication, but you can do this as you go. I've actually decided not to go this route and just keep the bag files around instead, since it turns out I need other information from them, but this approach should still work, I think.

filipematosinov commented 5 years ago

I am now also keeping the bag files and extracting the information offline. However, if I understand correctly, by saving the depth as int8 we lose quite a lot of precision, right? I mean, int8 has only 256 distinct values, which is not even enough for centimeter precision.
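The back-of-envelope arithmetic behind this concern, assuming a hypothetical 4 m working range spread across the available integer steps:

```python
# With 8 bits over a 4 m range, each step is ~15.7 mm; the native 16-bit
# stream at the default 1 mm depth scale resolves ~1 mm over the same range.
working_range_mm = 4000.0
step_uint8 = working_range_mm / 255      # mm per uint8 step
step_uint16 = working_range_mm / 65535   # mm per uint16 step over the same range

print(f"uint8 step:  {step_uint8:.1f} mm")
print(f"uint16 step: {step_uint16:.3f} mm")
```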

jkenney9a commented 5 years ago

That's a good point; you could always try int16 too. Since I didn't end up going this direction, I never fully vetted this strategy. But if you are keeping bag files around (like we are) then I suppose this all becomes moot. Our goal is to extract only one particular depth coordinate, based on tracking done on the RGB videos.
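A sketch of the uint16 alternative mentioned above, assuming the frame is available as a NumPy array (simulated here): keep the native 16-bit depth units, so no precision is lost, and rely on lossless compression for the size savings.

```python
import numpy as np

# Simulated native 16-bit depth frame; real code would use
# np.asanyarray(depth_frame.get_data()) from pyrealsense2.
depth16 = np.random.randint(0, 65536, size=(480, 640), dtype=np.uint16)

# Lossless zlib compression via numpy's compressed archive format.
np.savez_compressed("depth16.npz", depth=depth16)

loaded = np.load("depth16.npz")["depth"]
assert np.array_equal(loaded, depth16)  # round-trip is bit-exact
```

How much this actually shrinks the file depends on the scene: real depth frames have large smooth regions and compress far better than the random data used here.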