IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

rs-record frame drops? #10042

Closed Cpeck71 closed 2 years ago

Cpeck71 commented 2 years ago

Required Info
Camera Model L515
Firmware Version 01.05.08.01
Operating System & Version Linux 18.04
Kernel Version (Linux Only) 4.9.253-tegra
Platform Jetson Nano
SDK Version 2.49.0
Language C++
Segment Perception

Issue Description


Howdy Realsense Group!

My main question is: is it considered a "frame drop" if the number of frames counted in the callback function differs from what is recorded into the rosbag when using the rs-record tool? For one 10-second test, I counted 184 depth/infrared/color frames, 253 accel frames and 251 gyro frames in the callback. However, when opening the rosbag (using MATLAB) I get 130 depth, 153 color, 254 accel and 252 gyro frames (I'm ignoring infrared for now since I don't need it).

Follow-up question: is the discrepancy caused by the latency of writing the data, and is there something I can do (e.g. increase the frame_queue size, which usually seems to be discouraged) to record every frame that comes through the pipeline?

Follow-up question for fun: are the depth and color frames automatically aligned? Can they be aligned before being placed into the rosbag, or do I need to match the closest frames by capture time and then use the align tool in some post-processing code?

Some extra context: I've been working with my L515 for some time, and while I've learned a lot from the examples/tools and other online sources, I can't seem to find the right answer to a question about frame drops. Specifically, I've been learning some of the different methods to gather data using callbacks from examples such as rs-callback, rs-pose-and-image, rs-record-playback, etc.

From the rs-callback example, my understanding is that it counts EVERY frame/frameset that the device captures. The rs-post-processing example, meanwhile, uses frame queues: it processes the depth data in the callback, enqueues it to the proper queue, then displays the frame in the main while loop. I've tried to combine these two approaches, using the callback to align the depth and color frames and place them in their own queues. In the main while loop I then poll for frames and display both images using OpenCV and the frame_to_mat method. I also count how many frames are displayed in the main while loop vs. the callback, and there is a discrepancy there as well, which I suspect is due to the extra processing needed to display the frames.
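For reference, the setup described above could look roughly like the sketch below: a frame callback routing depth and color frames into user-defined queues, with the main loop polling and counting. All names, queue sizes, and the loop bound are illustrative placeholders, not taken from the original code.

```cpp
#include <librealsense2/rs.hpp>
#include <atomic>
#include <iostream>

int main() {
    rs2::frame_queue depth_queue(1);   // user-defined queues, capacity 1 each
    rs2::frame_queue color_queue(1);
    std::atomic<int> callback_count{0};

    rs2::pipeline pipe;
    pipe.start([&](const rs2::frame& f) {
        if (auto fs = f.as<rs2::frameset>()) {
            ++callback_count;                        // counts every frameset delivered
            depth_queue.enqueue(fs.get_depth_frame());
            color_queue.enqueue(fs.get_color_frame());
        }
    });

    int displayed = 0;
    while (displayed < 100) {                        // arbitrary stop condition
        rs2::frame depth, color;
        // poll_for_frame returns immediately; if the display work is slower
        // than the camera's frame rate, frames can be overwritten in the
        // capacity-1 queues before they are polled
        if (depth_queue.poll_for_frame(&depth) && color_queue.poll_for_frame(&color)) {
            ++displayed;                             // display via OpenCV in the real code
        }
    }
    std::cout << "callbacks: " << callback_count << ", displayed: " << displayed << "\n";
    pipe.stop();
}
```

With capacity-1 queues, any frameset that arrives before the previous one is polled pushes the older frame out, which would account for the callback/display count discrepancy.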

Thank y'all so much for your help! C. Peck

Edit: Fixed some formatting

MartyG-RealSense commented 2 years ago

Hi @Cpeck71 At https://github.com/IntelRealSense/librealsense/issues/7488#issuecomment-704124850 a RealSense team member provides an explanation about how frame drops are calculated in the RealSense SDK.

If a rosbag does not contain all of the frames that were generated during recording then they could be considered to be 'dropped frames'. Loss of frames from a recording may be caused by a range of factors, such as a slow hard drive not being able to keep up with the rate that frames are being delivered and causing a bottleneck in the recording process, or a CPU's usage percentage nearing or reaching 100%.

If you are using more than one stream type simultaneously then you could consider increasing the frame queue size from its default of '1' to a value of '2'. This introduces more latency into the stream, which can help to reduce the likelihood of dropped frames. There is the risk though that using a custom frame queue size may unintentionally break the streams, so '2' would be a safe value to use. This is described in the Frame Queue Management documentation at the link below.

https://github.com/IntelRealSense/librealsense/wiki/Frame-Buffering-Management-in-RealSense-SDK-2.0#latency-vs-performance
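A minimal sketch of that suggestion, assuming a connected device: iterate over the sensors and raise RS2_OPTION_FRAMES_QUEUE_SIZE from its default of 1 to 2 on each sensor that supports it.

```cpp
#include <librealsense2/rs.hpp>

int main() {
    rs2::context ctx;
    for (auto&& dev : ctx.query_devices()) {
        for (auto&& sensor : dev.query_sensors()) {
            // Trade a little extra latency for a lower risk of dropped frames
            if (sensor.supports(RS2_OPTION_FRAMES_QUEUE_SIZE))
                sensor.set_option(RS2_OPTION_FRAMES_QUEUE_SIZE, 2);
        }
    }
}
```

This would be done before starting the pipeline, so the larger queue is in effect for the whole streaming session.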

Even then, it may not guarantee that every frame will be recorded. An approach that you could explore is using the save_single_frameset instruction to save every frameset into its own tiny individual bag file, as described for C++ in https://github.com/IntelRealSense/librealsense/issues/3671#issuecomment-479767187
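A hedged sketch of the save_single_frameset approach: each frameset is written to its own small bag file named after the given prefix ("frameset" here is a placeholder, as is the frame count).

```cpp
#include <librealsense2/rs.hpp>

int main() {
    rs2::pipeline pipe;
    pipe.start();

    rs2::save_single_frameset saver("frameset");  // filename prefix is illustrative
    for (int i = 0; i < 10; ++i) {
        rs2::frameset fs = pipe.wait_for_frames();
        saver.process(fs);                        // writes one tiny bag per frameset
    }
    pipe.stop();
}
```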

Depth and color frames are not automatically aligned. Although they can be aligned, the aligned frames cannot be saved into a rosbag. A rosbag acts like a video recording that contains the data of individual stream types. Once a bag file is loaded into memory then the retrieved streams can have alignment applied to them in real-time.
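The playback-then-align workflow described above could be sketched as follows; "recording.bag" is a placeholder filename, and the processing step is left as a comment.

```cpp
#include <librealsense2/rs.hpp>

int main() {
    rs2::config cfg;
    cfg.enable_device_from_file("recording.bag"); // stream from the recorded bag
    rs2::pipeline pipe;
    pipe.start(cfg);

    rs2::align align_to_color(RS2_STREAM_COLOR);  // map depth pixels onto the color frame
    while (true) {
        rs2::frameset fs = pipe.wait_for_frames();
        rs2::frameset aligned = align_to_color.process(fs);
        rs2::depth_frame depth = aligned.get_depth_frame();
        rs2::video_frame color = aligned.get_color_frame();
        // ... process the aligned depth/color pair here ...
    }
}
```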

Regarding your final question about the workings of rs-callback, I do not personally have sufficient programming knowledge about callbacks to answer that particular question, unfortunately.

Cpeck71 commented 2 years ago

Thank you @MartyG-RealSense for the quick reply!

I suspect the dropped frames are due to a bottleneck somewhere, considering I'm running on a Nano. Thanks for the info on the alignment; I suspect I'll probably do a post-record process to align them.

I've looked at the queue management documentation a few times. I think I have some confusion about whether the pipeline needs a larger frame queue or my user-defined queues need to be larger, since the documentation says the pipeline has its own queue to sync the frames. (I'm kind of thinking out loud at this point.) The callback is counting each frame properly, and it's only when other processes try to extract the frames from the queue that there is a drop. So would increasing the size of the user-defined queues act as a buffer in the midst of the assumed hardware bottleneck? Right now my understanding is that with a queue size of '1', if there is currently a frame in the queue and a new frame is enqueued, the prior frame is dropped if it wasn't polled before the new frame arrived. So increasing the queue size would allow more than one frame to sit in the queue ready to be polled?

Currently, my main purpose for all of this is to post-process some real-time data from the sensor. Since I see that X frames come through the sensor but only Y frames are able to be saved (probably due to hardware), I'm trying to see if there's a way to hold those frames until the 'save' process can be performed, such that all X frames eventually get saved. Cheers

MartyG-RealSense commented 2 years ago

A RealSense team member in https://github.com/IntelRealSense/librealsense/issues/5041#issuecomment-542241323 explains that RS2_OPTION_FRAMES_QUEUE_SIZE limits the total number of frames in circulation. A frame queue size of '1' provides minimum latency, favouring performance but with an increased risk of frame drops.

You can use an SDK instruction called Keep() to store frames in memory instead of writing them to storage during streaming. Once the pipeline is closed then you can perform batch processing operations on all of the frames at the same time (such as post-processing and alignment) and then save all the frames to file in a single action.

Because Keep() stores frames in memory, it is only suited to short recording durations, as memory capacity is progressively consumed as time passes. For a device with modest memory size, 10 seconds would be an appropriate maximum, whilst computers with higher memory capacity could record for around 30 seconds.

A couple of C++ examples of scripts that use Keep() are at https://github.com/IntelRealSense/librealsense/issues/1942#issuecomment-400031901 and https://github.com/IntelRealSense/librealsense/issues/2223
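The approach above could be sketched as follows: hold roughly 10 seconds of framesets in memory with Keep(), then align and post-process them after the pipeline stops. The frame count assumes 30 fps, and the post-processing step is a placeholder.

```cpp
#include <librealsense2/rs.hpp>
#include <vector>

int main() {
    rs2::pipeline pipe;
    pipe.start();

    std::vector<rs2::frameset> frames;
    for (int i = 0; i < 300; ++i) {              // ~10 s at 30 fps (illustrative)
        rs2::frameset fs = pipe.wait_for_frames();
        fs.keep();                               // prevent the SDK from recycling this frame's memory
        frames.push_back(fs);
    }
    pipe.stop();

    // Batch processing after streaming has ended
    rs2::align align_to_color(RS2_STREAM_COLOR);
    for (auto& fs : frames) {
        rs2::frameset aligned = align_to_color.process(fs);
        // ... save aligned depth/color frames to file here ...
    }
}
```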

Cpeck71 commented 2 years ago

Looking at #1942, further comments mention that either a frame or a frameset could be 'kept' using Keep(), right? So I could 'keep' a whole frameset and use rs2::align to align the depth and color frames, rather than keeping depth and color separately and then running some script to align the two?

After reviewing the #5041 comment and some others, I think I still have some confusion about the frame queue size. When you say 'in circulation', is that in circulation within the pipeline? As in, if I do a pipe.wait_for_frames() with an RS2_OPTION_FRAMES_QUEUE_SIZE of '1', I'm only going to get one frame or frameset? (Typically I see framesets, which AFAIK collect all of the frames from each sensor.) If the queue size were, say, '2', would that mean that, if the first frameset had not been polled yet, a second frameset would be buffered into the queue rather than dropped? My goal was to use the callback to extract frames/framesets from the pipeline (without touching its own frame queue) and put them into separate queues with a large capacity, and then save frames from those queues as fast as possible. I'm not sure that would really work if the record_to_file happens in the pipeline, though.

thanks so much for your time/patience!

MartyG-RealSense commented 2 years ago

I will refer you to a complete C++ Keep() script at https://github.com/IntelRealSense/librealsense/issues/6865#issuecomment-661024268, shared by a RealSense user, which incorporates alignment and multi-threading. This should hopefully be a useful reference for you.

Yes, rs2::align would be used to perform the depth-color alignment on the frames that have been collected using Keep(). The above script link demonstrates that principle.

In regard to the frame queue, by default the queue can have 16 frames of each type in it at any one time, and old frames get pushed out of the queue by newly arriving frames, like a continuously moving conveyor belt where items fall off the end. This is referred to at https://github.com/IntelRealSense/librealsense/issues/2711#issuecomment-437994827 which advises increasing the frame queue size and calling Keep() for every frame you enqueue if you want to keep more frames in the queue.
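Combining those two pieces of advice, a sketch might look like the following: a user-defined queue larger than the default, with Keep() called on every frame before it is enqueued so the frames are not recycled while they wait. The capacity of 64 is an arbitrary example.

```cpp
#include <librealsense2/rs.hpp>
#include <utility>

int main() {
    rs2::frame_queue queue(64);                  // larger than the default capacity
    rs2::pipeline pipe;
    pipe.start([&](rs2::frame f) {
        f.keep();                                // keep the frame alive in memory
        queue.enqueue(std::move(f));             // buffered instead of dropped
    });

    // In a real application this drain loop would run concurrently with streaming
    rs2::frame f;
    while (queue.poll_for_frame(&f)) {
        // ... save the frame to disk here ...
    }
    pipe.stop();
}
```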

Cpeck71 commented 2 years ago

Is there a function to view the current occupancy of a queue? My understanding from the documentation is that queue.size() returns the capacity of the queue, but not necessarily how many frames are currently in the queue.

MartyG-RealSense commented 2 years ago

The SDK has a C++ example tool called rs-data-collect for profiling the streams using callbacks and logging the result to file.

https://github.com/IntelRealSense/librealsense/tree/master/tools/data-collect

The tool uses the SDK's Low-Level API which directly accesses the camera hardware in order to minimize software-imposed latencies. RealSense applications typically run in the High-Level API.

https://dev.intelrealsense.com/docs/api-architecture#section-low-level-device-api

Cpeck71 commented 2 years ago

Thanks so much @MartyG-RealSense for your guidance! You've helped explain a lot and point me to some additional resources. I'll close the issue for now as I continue to work through and learn!

MartyG-RealSense commented 2 years ago

You are very welcome, @Cpeck71 - thanks very much for the update, and good luck with your work!