IntelRealSense / librealsense

Intel® RealSense™ SDK
https://www.intelrealsense.com/
Apache License 2.0

Frame Queue drops frames when more than 1 stream is enabled. #9022

Open sam598 opened 3 years ago

sam598 commented 3 years ago

Tested with this modified version of frame_queue_example.py:

import pyrealsense2 as rs
import time

prev_frame_num = 0

def slow_processing(frame):
    global prev_frame_num

    n = frame.get_frame_number()
    # Stall for 250 ms on every 20th frame to simulate occasional slow processing
    if n % 20 == 0:
        time.sleep(1/4)

    # Report a gap in frame numbers as dropped frames
    if n - prev_frame_num > 2:
        print('dropped ' + str(n - prev_frame_num - 1) + ' frames')

    prev_frame_num = n

    print(n)

try:
    pipeline = rs.pipeline()

    config = rs.config()
    config.enable_stream(rs.stream.color, 848, 480, rs.format.bgr8, 30)
    config.enable_stream(rs.stream.depth, 848, 480, rs.format.z16, 30)

    print("Slow callback + queue")
    queue = rs.frame_queue(50)
    pipeline.start(config, queue)
    start = time.time()
    while time.time() - start < 600:
        frames = queue.wait_for_frame()
        slow_processing(frames)
    pipeline.stop()

except Exception as e:
    print(e)
except:
    print("A different Error")
else:
    print("Done")

This script will drop frames whenever time.sleep() is called. It does not drop frames when only 1 stream is enabled.

The number of dropped frames varies depending on the camera and platform, but I have seen this behavior on every combination of D435, D455, Raspberry Pi OS, and Windows 10.

SDK 2.45.0 and Firmware 5.12.13.50

sam598 commented 3 years ago

Creating a separate frame queue for each sensor seems to work.

import pyrealsense2 as rs
import time

ctx = rs.context()
device = ctx.devices[0]

depth_sensor = device.query_sensors()[0]
depth_queue = rs.frame_queue(50)

color_sensor = device.query_sensors()[1]
color_queue = rs.frame_queue(50)

def GetProfile(sensor, stream_type, width, height, stream_format, fps):
    # Return the first stream profile matching the requested settings,
    # falling back to the sensor's first profile if nothing matches.
    profiles = sensor.get_stream_profiles()

    for p in profiles:
        profile = p.as_video_stream_profile()
        if (profile.stream_type() == stream_type
                and profile.width() == width
                and profile.height() == height
                and profile.format() == stream_format
                and profile.fps() == fps):
            return profile

    return profiles[0]

depth_profile = GetProfile(depth_sensor, rs.stream.depth, 848, 480, rs.format.z16, 30)
color_profile = GetProfile(color_sensor, rs.stream.color, 848, 480, rs.format.bgr8, 30)

prev_depth_frame_num = 0
prev_color_frame_num = 0

start = time.time()

depth_sensor.open(depth_profile)
color_sensor.open(color_profile)

depth_sensor.start(depth_queue)
color_sensor.start(color_queue)

while time.time() - start < 3600:
    depth_frames = depth_queue.wait_for_frame()
    depth_frame_num = depth_frames.get_frame_number()
    #print("D: " + str(depth_frame_num))

    color_frames = color_queue.wait_for_frame()
    color_frame_num = color_frames.get_frame_number()
    #print("C: " + str(color_frame_num))

    if depth_frame_num % 20 == 0:
        time.sleep(1/4)

    if depth_frame_num - prev_depth_frame_num > 2:
        print('dropped ' + str(depth_frame_num - prev_depth_frame_num - 1) + ' depth frames')
    prev_depth_frame_num = depth_frame_num

    if color_frame_num - prev_color_frame_num > 2:
        print('dropped ' + str(color_frame_num - prev_color_frame_num - 1) + ' color frames')
    prev_color_frame_num = color_frame_num

depth_sensor.stop()
color_sensor.stop()

depth_sensor.close()
color_sensor.close()

It drops a few depth frames right at the start, but then it runs fine. I was able to run this script for an hour with a D455 on a Raspberry Pi 4 with no dropped frames.

There really aren't any examples or documentation for creating separate frame queues in Python, so this is a bit of a clunky workaround. One would expect that a pipeline with a queue would be able to handle multiple sensor streams without dropping frames, so I am leaving this issue open for now.

MartyG-RealSense commented 3 years ago

Hi @sam598 Thanks very much for sharing the Python code of your workaround!

The Latency vs Performance section of the frame buffering documentation suggests setting the frame queue size to '2' instead of the default of '1' if two streams are being used.

https://github.com/IntelRealSense/librealsense/wiki/Frame-Buffering-Management-in-RealSense-SDK-2.0#latency-vs.-performance

Changing the frame queue size to a custom value may, however, carry the risk of breaking the streams if the wrong value is chosen (i.e. a value other than the 1 or 2 suggested in the buffering documentation).

https://github.com/IntelRealSense/librealsense/issues/5041#issuecomment-542241323

As an alternative to changing the frame queue size, dividing streams into separate pipelines may improve performance. In the Python script in the link below with a 2-pipeline setup, the IMU is placed on its own on one pipeline and depth & color on the other pipeline.

https://github.com/IntelRealSense/librealsense/issues/5628#issuecomment-575943238
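
For illustration only, here is a rough Python sketch of that kind of split (not taken from the linked script): depth and color each get their own pipeline, consumed on their own thread. The resolutions, frame rate, and run time below are placeholders.

import threading
import time
import pyrealsense2 as rs

def run_stream(stream_type, fmt, label):
    # One pipeline per stream, each consumed on its own thread
    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(stream_type, 848, 480, fmt, 30)
    pipeline.start(config)
    start = time.time()
    while time.time() - start < 60:
        frames = pipeline.wait_for_frames()
        print(label, frames.get_frame_number())
    pipeline.stop()

depth_thread = threading.Thread(target=run_stream, args=(rs.stream.depth, rs.format.z16, "D:"))
color_thread = threading.Thread(target=run_stream, args=(rs.stream.color, rs.format.bgr8, "C:"))
depth_thread.start()
color_thread.start()
depth_thread.join()
color_thread.join()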

sam598 commented 3 years ago

Thanks @MartyG-RealSense !

I think some of my confusion comes from the intended purpose of the frame_queue class.

This example makes it seem like the purpose of the queue is to create an asynchronous buffer so that frames don't get dropped if some other processing takes longer than usual. The example does explicitly say:

This stream will occasionally hiccup, but the frame_queue will prevent frame loss.

But with multiple streams it doesn't sound like it can be used for that use case at all. Having a buffer of 2 frames doesn't sound any different from waiting for frames from the synchronous pipeline. If the queue can only hold 1 frame per stream, does the queue have any purpose? If not, is there a way to warn the developer or return an error if they try to pass multiple streams into a single queue?

If the queue is really only intended for one stream at a time, it would be great to make that explicit in the documentation and examples. This caused a bug that I ended up spending over a week trying to track down.

As an alternative to changing the frame queue size, dividing streams into separate pipelines may improve performance. In the Python script in the link below with a 2-pipeline setup, the IMU is placed on its own on one pipeline and depth & color on the other pipeline.

I did not know this was possible, so that may be a more elegant solution than my workaround.

MartyG-RealSense commented 3 years ago

For the vast majority of RealSense users, changing the frame queue size is something they never need to think about or use, even with more than one stream active; the default of '1' is sufficient. It is simply a feature that is available if the user decides it is useful for their project, as a tool for changing the balance between performance and latency (with a possible risk of dropped frames when weighted towards performance).
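
For anyone who does decide to experiment with it, a hedged sketch of how that option would be changed from Python follows; the value of 2 mirrors the buffering documentation quoted earlier and is an experiment, not a recommendation.

import pyrealsense2 as rs

ctx = rs.context()
device = ctx.query_devices()[0]

for sensor in device.query_sensors():
    # RS2_OPTION_FRAMES_QUEUE_SIZE is exposed in Python as rs.option.frames_queue_size
    if sensor.supports(rs.option.frames_queue_size):
        print(sensor.get_info(rs.camera_info.name), "current value:",
              sensor.get_option(rs.option.frames_queue_size))
        sensor.set_option(rs.option.frames_queue_size, 2)  # experimental value, per the docs above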

sam598 commented 3 years ago

Hmm, the official frame_queue_example.py example I've referred to several times in this issue uses a frame_queue with a size of 50 to try to prevent dropped frames from happening.

Are you saying that this example will actually cause dropped frames? Something here does not make sense.

MartyG-RealSense commented 3 years ago

Whilst the advice was that using a custom frame queue size may introduce stream-breaking, it does not state which values do so. Therefore some values may benefit a project whilst other values may harm it. There is a risk in using custom values for the frame queue size without documentation about which values are useful or harmful.

If you are aware of the possibility though then it may be safe to experiment. Otherwise, someone who does not know of the risk could unknowingly break their program and then spend a long time trying to debug it without being aware of the true cause of the problem.

sam598 commented 3 years ago

I have to ask and clarify, is there a difference between the frame_queue class and the RS2_OPTION_FRAMES_QUEUE_SIZE setting? In the post you linked to from @dorodnic they seem to make this distinction.

  1. frame_queue is a class that creates an object useful for asynchronously processing frames. @dorodnic says to keep the value for this low if you want low latency, but does not actually say anything about it dropping frames. If setting large values for this class did drop frames (like you have said) then it would contradict the purpose of the frame_queue_example.py script from Intel.

  2. RS2_OPTION_FRAMES_QUEUE_SIZE sets the size of an internal frame queue in the SDK. As you have said, most users should not have to change this value, and changing it could cause issues and break the stream. But this is a configurable sensor option, and appears to be different from passing a frame_queue object to a pipeline.

Is my understanding of this correct?

sam598 commented 3 years ago

So to recap.

  1. Do not modify RS2_OPTION_FRAMES_QUEUE_SIZE.

  2. Use frame_queue to asynchronously buffer frames.

  3. Currently there is an issue where frame_queue cannot retain frames if it is being fed more than one stream.

  4. As a workaround, you can use one frame_queue per stream.

  5. It is not clear whether frame_queue is broken or working as intended. It either needs to be fixed or documented better.

MartyG-RealSense commented 3 years ago
  1. I picked up the information about dropped frames when weighted towards performance years ago, and the precise source of it is not certain now. I believe the logic was based on the queue potentially dropping frames if it could not keep up with the rate at which new frames were arriving in the queue. The closest information source I know of that relates to this is in the link below.

https://github.com/IntelRealSense/librealsense/issues/5041#issuecomment-542241323

  2. Frame queue size sets the size of the frame queue. frame_queue provides a means of passing frames between threads in a thread-safe manner, as mentioned in the frame buffering documentation.

https://dev.intelrealsense.com/docs/frame-management#section-frames-and-threads
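
A minimal Python sketch of that thread-passing pattern, assuming a single depth stream and an illustrative queue capacity of 50: the SDK fills the frame_queue, and a background thread drains it.

import threading
import time
import pyrealsense2 as rs

queue = rs.frame_queue(50)  # thread-safe hand-off between the SDK and the worker

def worker():
    # Frames are drained and processed on a background thread
    while True:
        frame = queue.wait_for_frame()
        print("processing frame", frame.get_frame_number())

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 848, 480, rs.format.z16, 30)
pipeline.start(config, queue)  # the SDK enqueues frames; no user work happens on its thread

threading.Thread(target=worker, daemon=True).start()
time.sleep(10)  # the main thread is free for other work
pipeline.stop()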

sam598 commented 3 years ago

Frame queue size sets the size of the frame queue.

To be absolutely clear: RS2_OPTION_FRAMES_QUEUE_SIZE is different from frame_queue.

From https://github.com/IntelRealSense/librealsense/issues/5041#issuecomment-542241323 :

This option does not strictly configures any particular queue but rather limits the total number of frames in circulation

sam598 commented 3 years ago

The frame_queue object has a size which can be set, and that object can be passed to a sensor.

RS2_OPTION_FRAMES_QUEUE_SIZE is an internal SDK option that can be configured on a sensor with the set_option function. But it should be left as is.

These two things have confusingly similar names, but it seems that they are different.

I just want to make sure we are on the same page, because understanding what frame_queue is and how it works is essential to determining whether or not there is an actual issue here.

MartyG-RealSense commented 3 years ago

The documentation defines RS2_OPTION_FRAMES_QUEUE_SIZE with this description: "Number of frames the user is allowed to keep per stream. Trying to hold-on to more frames will cause frame-drops".

https://intelrealsense.github.io/librealsense/doxygen/rs__option_8h.html#a8b9c011f705cfab20c7eaaa7a26040e2a3576239e61d44f58446383c37401d76b

My understanding is that frame queue size is the same thing as frame queue capacity, which can be used to set a maximum latency of a defined number of frames, as described in the SDK example script 'Do processing on a background thread' (see the section of this link headed 'librealsense2').

https://github.com/IntelRealSense/librealsense/wiki/API-How-To#do-processing-on-a-background-thread

However, I do not think there is much else that I can add about the subject that has not already been covered in this discussion.

sam598 commented 3 years ago

I've tried my best to explain how I understand @dorodnic's previous advice, but I agree there is only so much the two of us can do to parse that comment without an official statement.

MartyG-RealSense commented 3 years ago

Hi @sam598 Do you agree that this case can be closed now, please? Thanks very much.

sam598 commented 3 years ago

Hi @MartyG-RealSense

I need to know whether there is a way to use the frame_queue class to prevent dropped frames, which is what Intel's example claims it does. This is a critical feature for me.

I appreciate your assistance, but the comments in this issue have thrown that entire use case into doubt.

From the documentation and comment it seems clear to me that the frame_queue class and the internal RS2_OPTION_FRAMES_QUEUE_SIZE setting are different things with different purposes. But you suggest otherwise.

If it is as you suggest, that contradicts the example and means frame_queue will actually cause dropped frames.

You have also expressed that you do not know this 100% for sure, and I do not claim to either. At this point I need assistance from someone who can speak with authority on the matter.

Thank you for your understanding.

MartyG-RealSense commented 3 years ago

I will seek advice from Intel on this subject. Thanks very much for your patience.

ev-mp commented 3 years ago

@sam598 , in this case the frame queue is only symptomatic of the actual difference. In the two code samples you shared you are using two different SDK APIs to generate frames: the Pipeline API and the Sensor API. While both are intended to generate frames, they differ fundamentally in the way the frames are processed en route. With the Sensor API each sensor's feed is handled separately and delivered to the user application with minimal latency, whereas the frames that flow via the Pipeline API pass through two additional blocks - the **syncer** and the aggregator (less important) - whose purpose is to find the best match and generate frame bundles. So by definition, the Pipeline API provides temporally-aligned frame sets at the expense of (a) imposed latency for some of the streams and (b) some frame drops. Again, this is done to enforce temporal alignment, which is critical in cases such as pointcloud generation and Depth-RGB alignment.

So I believe now the real question is to establish how many frame drops are registered to ensure that the syncer throughput is within the spec.

sam598 commented 3 years ago

Thank you @ev-mp. That information clears up a lot and is extremely helpful.

(1) To answer your question first:

So I believe now the real question is to establish how many frame drops are registered to ensure that the syncer throughput is within the spec.

When using the Pipeline API and only 1 stream (like in the frame queue example), I do not receive dropped frames when calling time.sleep(). When using 2 streams in this example I receive 1-12 dropped frames (depending on camera and OS) whenever sleep is called.

Even when not artificially stalling the thread when using the Pipeline API, I would have 1 dropped depth or color frame every 1-10 minutes. This is with nothing blocking the thread or CPU; simply waiting for frames, checking the frame number, and waiting for the next one. As far as I can tell there was no recognizable pattern, rhyme or reason to the dropped frames. This would also happen regardless of whether the inter_cam_sync_mode setting was 0 or 1, so even streams running at a consistent internal frame rate would drop frames. I had spent several weeks trying to "fix" this problem, and it would show up in any instance or configuration.

(2) As for preventing dropped frames:

The most critical thing for my use case is that I capture every frame from both streams without dropping any. Based on what you have said, it looks like I have no need for the Pipeline API or the **syncer** object. Since my focus is on capture, latency and aligned frame bundles are not important; everything gets processed later and I can pair frames after the capture. Also, recording a ROSBAG is not an option.

Should I be able to prevent dropped frames by providing a large frame_queue to the Sensor API? Should the example I provided above work in theory?

Finally, is providing a new frame_queue object to the sensor the same thing as changing that sensor's RS2_OPTION_FRAMES_QUEUE_SIZE setting?

ev-mp commented 3 years ago

@sam598 hello, thank you for the input, as it also clarifies both the application and the constraints. Regarding the first part -

When using 2 streams in this example I receive 1-12 dropped frames

I'll clarify the 'Pipeline' internal mode - the Pipeline is an API that wraps the Sensor API, which means that internally it runs asynchronously, but from the user's perspective it provides synchronous (blocking) calls that allow it to be used in a single-threaded application flow. (Note that the Pipeline API can also be invoked with asynchronous callbacks, which may reduce frame drops but complicates the user's code in the same way the Sensor API does.) So when you stall the main thread, the Pipeline continues to run, acquiring frames from the HW and injecting them into the syncer.
So it turns into a producer/consumer issue - while the main thread is blocked, the internal frame queues of the syncer object fill up (the internal size is 16 frames IIRC), and when that happens internal drops occur. For a 250 msec stall at 30 FPS the frame drops can be in the ~8-9 frame range, but if you add Depth/RGB bundling then it may extend to over 10 drops.
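
As a rough illustration of the asynchronous-callback variant mentioned above (stream settings are placeholders, not a recommended configuration): the callback only records the frame number so that it returns quickly; heavier work would still need to be handed off to another thread, otherwise drops can recur.

import queue
import time
import pyrealsense2 as rs

frame_numbers = queue.Queue()

def on_frames(frameset):
    # Called from an SDK thread; keep it as cheap as possible
    frame_numbers.put(frameset.get_frame_number())

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 848, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 848, 480, rs.format.bgr8, 30)
pipeline.start(config, on_frames)

time.sleep(5)  # even if this thread stalls, the callback keeps draining the syncer
pipeline.stop()
while not frame_numbers.empty():
    print(frame_numbers.get())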

As for preventing frame drops - imo the Sensor API is the most appropriate method, so yes, the second example you linked should work. Make sure to pre-allocate sufficiently large frame queues (recording at 30 FPS => a queue size of 1800 per minute, per sensor).

As for the last question - the Sensor has a zero-sized queue and no tolerance for latency, so if its callbacks are not processed and released immediately in the user app, it will start dropping frames instantaneously.

MartyG-RealSense commented 3 years ago

Hi @sam598 Do you require further assistance with this case, please? Thanks!

sam598 commented 3 years ago

@MartyG-RealSense I think that clears everything up.

So to recap:

I would propose that, for this issue to be properly closed, the frame_queue_example.py script should include a caveat that the given Pipeline example only works for a single stream. I would hate for someone to spend days on the same issue, and lengthy GitHub issues like this one are not a substitute for proper documentation.

MartyG-RealSense commented 3 years ago

@sam598 I have added a Documentation tag to this case to highlight your documentation request described in the comment above. This case should be kept open whilst that documentation request is active, but there is nothing else that you need to do. Thanks very much!

cwm9cwm9 commented 1 year ago

I wanted to comment on this after doing a bunch of research into the code. I'm not sure everything I think I know is right, but this is what I understand so far. Take everything written here with a grain of salt.

First, on the issue of RS2_OPTION_FRAMES_QUEUE_SIZE.

This option is, in my humble opinion, poorly named. Inside the library, a pool of empty frames is maintained. From this pool, unused frames are claimed and filled with color, infrared, and depth information as it arrives from the physical sensor. (I don't think framesets themselves draw from this pool, only the frames inside the frameset, but I'm not sure about that.) No matter who has the frame --- library, queue, or you --- or what it is being used for, so long as the frame is being used by somebody and must not be erased, it has been "published." When the frame is .released() (C++) or .closed() (Java), the frame is returned to this pool.

If the pool ever becomes empty, libRealSense will start dropping frames --- not because your queue is full, but because it has effectively "run out of memory" to allocate new frames. The difference between this situation and a full queue is that when a queue becomes full the queue will drop the oldest frame and hold onto the newest frames; but, when the frame pool is exhausted, the library cannot acquire a new frame and thus it must drop the incoming sensor image data.

If you fail to unpublish your frames for any reason, the library will stop storing new frames as they arrive from the sensor. This is the "I got 15 frames and then didn't get any more" complaint. When the library is closed, a message will (sometimes) be printed telling you how many frames you were "holding" when the library was closed. This is really how many frames were taken out of the frame pool, but not returned to the frame pool, for each stream. This is usually 1 for each stream (if 0, nothing prints), because the library itself may have been holding frames for you internally. But if the number is high, like 12 or 14, then you have failed to release your frames and framesets when you were done with them (and you probably had a frozen stream).
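
For what it's worth, the same principle seems to apply in Python, where a frame stays published for as long as a pyrealsense2 frame object referencing it is kept alive. A sketch of the difference between hoarding frame objects and copying the pixel data out (stream settings are illustrative):

import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 848, 480, rs.format.z16, 30)
pipeline.start(config)

captured = []
for _ in range(100):
    frames = pipeline.wait_for_frames()
    depth = frames.get_depth_frame()
    # Appending `depth` (or `frames`) to the list would keep those frames published
    # and eventually exhaust the pool; copying the pixel data lets them be released.
    captured.append(np.asanyarray(depth.get_data()).copy())

pipeline.stop()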

(Note: This paragraph applies to Java and may or may not apply to other languages.) One key thing to note is that only the Frame object owns the underlying frame-pool frame. If you cast the Frame using .as(...) to a DepthFrame or a VideoFrame, this "recasted" frame does NOT own the underlying frame-pool object and "releasing"/"closing" it does nothing. You MUST close/release the Frame from which the VideoFrame/DepthFrame/Whatever was created, or the frame will not be returned to the pool, but you must NOT close/release it until you are done using any casted DepthFrames/VideoFrames/etc. (Example: if you cache a VideoFrame, you must also cache the underlying Frame so it can be closed later.) Closing a FrameSet closes all UNREMOVED frames in that FrameSet. If you get a new FrameSet after processing a FrameSet (with, say, align), I BELIEVE you must close them both and can close the unprocessed set before you use the frames in the processed set, but I'm NOT SURE ABOUT THAT. I have, so far, been unable to get align to work properly in my own code.

In order to not drop frames, the size of the pool must be sufficient for all users of the pool -- the library for incoming image data, the syncher, the queue, frames created by filters, and you -- to hold on to whatever frames they require. Setting the pool to 1 will certainly cause problems! Setting it too large could cause you to run out of memory. What this setting does NOT do is control how many frames are queued, or in what way they are queued, or how many streams you can use.

RS2_OPTION_FRAMES_QUEUE_SIZE sets the size of this pool, in frames, and because it represents nothing more than a frame-based memory pool, this is why you are told not to change the value. By default, it starts at 16 and will grow as high as 32, and that should be enough for most programs.

However, there are two situations where you might want to alter this number.

One is if you wish to grab frames and hold onto them for a while because you are processing multiple frames at a time, because you need a big queue, or because you are holding multiple frames generated by multiple filters. (Remember, even if the Queue has your frames, they're still published.) In this case, you might need to raise the number.

The other is if you are on a low-memory platform, where you may wish to decrease this number so that, if your software hits a brief pause, the library does not drain all system memory resources trying to buffer past frames excessively.

My current working theory is that the minimum value for the frame pool setting is 4 times the number of streams you wish to use, minus one. You need 1 frame per stream for what you are currently processing, 1 frame per stream minus one for the syncher to hold while it waits for a coherent set of frames, 1 frame per stream sitting in the queue as a FrameSet waiting for you to remove, and 1 more frame per stream so that arriving data from the camera can be stored by the library. Thus, for 2 streams, a minimum pool size of 7 is required. This assumes you are not using any filters, as filters generate additional frames. Of course, the default is 16 with a maximum of 32, and if you wish to have a queue of size greater than 1 or make use of the available filters (such as align), you must increase the size of the frame pool above 7, and potentially above 32.
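
As a worked example of that theory (cwm9cwm9's estimate, not official guidance), a small Python sketch that computes the minimum for two streams and applies it per sensor via rs.option.frames_queue_size:

import pyrealsense2 as rs

NUM_STREAMS = 2
# Working theory above: 4 frames per stream, minus one -> 7 for two streams,
# assuming no filters and a queue holding at most one frameset.
min_pool = 4 * NUM_STREAMS - 1

device = rs.context().query_devices()[0]
for sensor in device.query_sensors():
    if sensor.supports(rs.option.frames_queue_size):
        # The default of 16 already covers this estimate; a larger user queue or
        # extra frames held by filters would add to the requirement.
        sensor.set_option(rs.option.frames_queue_size, max(16, min_pool))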

As for FrameQueue, it appears to be a simple FIFO "frame" queue. If I understand the code correctly, the internal queue optionally has the ability to block on enqueue until room is available, but as far as I know this option is always turned off in the library, and the queue instead always responds to overcapacity situations by dropping the oldest unclaimed data and accepting new data. FrameQueue, despite its name, can hold either Frames or FrameSets, and if you are supplying it with FrameSets (from, say, pipeline or syncer) then a queue depth of 1 will hold 1 FrameSet containing both streams. If, however, you are supplying it with frames from sensors directly (not as framesets), you can run into trouble: you need a minimum depth of 2 (for 2 streams) so that the two streams do not overwrite each other's frames --- but, if one sensor for any reason gives two frames before the second sensor gives one frame (say, because they are at different FPS or are physically different sensors with slight timing variations), then the queue will sometimes contain two frames from one sensor and none from the other. Having separate queues for each sensor eliminates this potential problem.

Pipeline uses a queue internally if you are not using callbacks, but that queue is set to a size of 1. That is, it always drops any unclaimed old frameset and replaces it with the new incoming frameset. If you want a queue that is larger than 1, you would pass the FrameQueue::enqueue method to the pipeline (or modify the source code, but that seems unnecessary). I don't believe there is any difference at all between using the pipeline directly and redirecting the output to a FrameQueue with a length of 1: if you want to poll the pipeline, just do a Pipeline.waitForFrames(0).

If on the other hand, you want to queue up frames and not drop them, you must create a FrameQueue with a size larger than 1. Obviously, you must process the frames before the queue fills up, or you will start dropping the oldest frames --- or the newest if RS2_OPTION_FRAMES_QUEUE_SIZE is set improperly.

If you are getting the same Frame multiple times from one or more streams in sequential FrameSets, or dropping every other frame, this is likely because the syncer stopped getting frames from one or more of your streams. The most likely cause of this is that the library could not obtain a frame from the frame pool, and as a result the syncer reused an old frame that it had stored previously. This can happen when you have failed to release one stream's frames but are correctly releasing another stream's frames.

If you get intermittent drops but are certain that you are closing your frames properly, you may have a real out-of-memory condition where frames from the frame pool are getting memory-swapped to disk causing the library thread to hang and lose data: try reducing the value of RS2_OPTION_FRAMES_QUEUE_SIZE (and the Queue, if appropriate) to reduce your memory footprint.

If you get "couldn't allocate a composite frame", you are not freeing your Frames and the pool is completely exhausted.

The bottom line is, if you want to not drop any frames, do these four things:

1) Make sure you .close() / .release() any frames you get from the library as soon as you are able, to prevent draining the available frame pool. Make sure you are closing the original Frame, and not the recasted frame.as(...) VideoFrame/DepthFrame/etc. Check your log to be sure that when you close the pipeline you are not holding onto frames. If your stream freezes, you are probably holding onto frames.

2) Set RS2_OPTION_FRAMES_QUEUE_SIZE large enough that there are enough frames in the frame pool for the library to get data from the camera, for the syncher to store internally while making framesets, for the framesets to sit in whatever queue you might have, and for you to be actively processing once you get them out of the queue and before you release/close them, but not so large that using the entire framepool will exhaust all available physical RAM memory.

3) Set your Queue large enough that any small processing delays on your part are smoothed over. (If you're processing fast enough, a queue depth of 3 should be plenty. Because of the way Syncher works, I could see situations where streams with different FPS could lead to getting 2 frames in rapid succession, so a 3 frame buffer helps guarantee you have at least 2/FPS time to process 3 frames.)

4) If for any reason you need to change the stream configuration (resolution, which streams you are looking at, etc.), it is not enough to close the Pipeline and reopen it: you must create a NEW Pipeline object. Reusing the old Pipeline may initially appear to work, but results in unexpected behavior. (You do not need to reinitialize the context.)

MartyG-RealSense commented 1 year ago

Thanks so much @cwm9cwm9 for sharing your experience in such detail!