Optimal depth resolution for D435 and D455

JeffR1992 commented 1 year ago

Required Info
Camera Model	D400
Firmware Version	NA
Operating System & Version	Win 10 / Linux (Ubuntu 18)
Kernel Version (Linux Only)	NA
Platform	PC
SDK Version	NA
Language	C, Python
Segment	Robot

Issue Description

I've seen the statement "848x480 is the optimal depth resolution" a few times, but each time this is mentioned, there seems to be no justification for this statement. As such, I wanted to ask why using a lower resolution of 848x480 is better than using a resolution of 1280x720? Please be as technical as possible when answering the question (i.e. information about hardware and/or image processing steps that you are allowed to talk about, and the decisions for using 848x480 instead of 1280x720 as the optimal depth resolution would be really appreciated).

From additional searches around the internet, the only explanation I could find was given in a 2018 post here: https://community.intel.com/t5/Items-with-no-label/D435-Resolution-VS-Depth-Image-Quality/td-p/548105

Where a community manager called idata said the following:

The reason the Plane Fit RMS error increases with resolution on the D435 camera is because the active pixels (see page 9 of the BKMs for tuning whitepaper and page 33 of the D400 datasheet) on the D435 are 1280x800. When the camera returns depth, it uses rectified images. During the rectification process, the image gets downsized. In order to get the 1280x720 resolution after rectification, the extra pixels are extrapolated. This is why the whitepapers say that the optimal depth resolution for the D435 is 848x480. Using a resolution higher than 848x480 should give you similar or slightly worse depth quality.

My questions to the above quote are the following: 1) Why are images downsized to 848x480 to perform image rectification? Is this a hardware processing limitation (i.e. if image rectification was performed at 1280x720 would it not be able to run the camera at 30 fps) or was a different constraint responsible for using a resolution of 848x480? 2) What type of extrapolation (did idata mean interpolation?) is performed on the images to increase their size from 848x480 to 1280x720? If the algorithm to do this is not proprietary it would be useful to know, as it would give me an idea of the limitations that I can expect from the upscaled image.

Any help regarding these questions would be much appreciated. Thanks very much.

MartyG-RealSense commented 1 year ago

Hi @JeffR1992 As you stated, the depth resolution for optimal accuracy on D435 / D435i and D455 is 848x480.

On the D415 model though, the optimal depth accuracy resolution is 1280x720. The D415 also has better image quality and around 2x less error over distance than D435 / D435i.

The D455 has around 2x better accuracy over distance than D435 / D435i. So at 6 meters distance the D455 should have the same accuracy as D435 / D435i does at only 3 meters. The error over distance of a D455 should resemble the lower green curve representing D415 on the chart below when compared to D435 / D435i.
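As a rough illustration of the "2x over distance" comparison, the theoretical depth RMS error relation published in Intel's depth camera tuning guidance can be evaluated directly: error = distance² × subpixel / (focal length in pixels × baseline), with focal length in pixels = 0.5 × X resolution / tan(HFOV / 2). The sketch below is not a measurement. The baselines (50 mm for D435, 95 mm for D455), the 87 degree horizontal FOV and the 0.08 subpixel value are nominal figures assumed here and are not taken from this thread.

```python
import math


def focal_length_px(x_res, hfov_deg):
    """Focal length in pixels for a given X resolution and horizontal FOV."""
    return 0.5 * x_res / math.tan(math.radians(hfov_deg) / 2.0)


def depth_rms_error_mm(distance_mm, baseline_mm, hfov_deg, x_res, subpixel=0.08):
    """Theoretical depth RMS error in mm (idealized stereo model)."""
    f_px = focal_length_px(x_res, hfov_deg)
    return distance_mm ** 2 * subpixel / (f_px * baseline_mm)


# Assumed nominal parameters, for illustration only.
D435 = {"baseline_mm": 50.0, "hfov_deg": 87.0}
D455 = {"baseline_mm": 95.0, "hfov_deg": 87.0}

err_d435_3m = depth_rms_error_mm(3000.0, x_res=848, **D435)
err_d455_3m = depth_rms_error_mm(3000.0, x_res=848, **D455)
err_d455_6m = depth_rms_error_mm(6000.0, x_res=848, **D455)

# Same distance: the error ratio equals the baseline ratio (95 / 50).
print("D435 / D455 error ratio at 3 m:", err_d435_3m / err_d455_3m)

# Error as a fraction of distance: D455 at 6 m versus D435 at 3 m.
print("D435 relative error at 3 m:", err_d435_3m / 3000.0)
print("D455 relative error at 6 m:", err_d455_6m / 6000.0)
```

Under this model the absolute error grows with the square of distance, so the "6 meters versus 3 meters" equivalence comes out approximately true for error as a fraction of distance rather than for absolute error. The model also assumes the same subpixel value at every resolution, which may not hold on real hardware.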

I am not familiar with the explanation provided by idata, so do not have a comment to make on that. However, the explanation for 848x480 versus 1280x720 can be expressed more simply than that.

The D415 excels at scanning small static objects and benefits from a high quality 1280x720 optimal depth resolution. However, it has a small field of view, and the slow 'rolling' type shutters on both its depth and RGB sensors make it unsuited for capturing motion faster than human walking pace. Observation of fast motion can cause images to be blurred.

D435 and D435i have a slow rolling shutter on their RGB sensor but a fast 'global' shutter on their depth sensors that enables depth capture of fast motion, even as fast as a vehicle moving at high speed. They also have a larger field of view than D415, enabling the cameras to see more of a scene without having to turn the camera.

D455 has a fast global shutter on both its RGB and depth sensors, enabling both color and depth to be captured without blurring.

On D435, D435i and D455 though, the trade-off for a fast shutter and a depth sensor with a wider field of view than the D415's, which lets in more light, is a lower optimal depth resolution of 848x480.

JeffR1992 commented 1 year ago

Thanks for the response @MartyG-RealSense. Interesting, so are you saying that if I choose a depth resolution of 1280x720 then all processing that is done on the D400, that is used to produce a depth map, is performed at that resolution (i.e., no downscaling of the image is required at any step in the processing pipeline)?

Unfortunately, I still don't understand why 848x480 is the optimal depth resolution for the D435, D435i and D455. From the D400 datasheet, it states that the infrared imagers on these sensors are the OmniVision OV9282 which have a native resolution of 1280x800.

1) If the sensors are capable of native resolutions of 1280x800, why do they require such a large downscaling to 848x480 to achieve an optimal depth resolution? 2) In what sense is the depth resolution of 848x480 optimal? Does it provide the best systematic error, random error or fill rate (or some combination of these 3 properties), or is it optimal in a different sense?

MartyG-RealSense commented 1 year ago

I am not aware of any downscaling taking place at 1280x720 resolution. You are certainly welcome to use 1280x720 and see whether it has any noticeable reduction in accuracy compared to 848x480 in your particular project.
My understanding is that the 848x480 optimal resolution on D435 / D435i and D455 camera models is a consequence of how the OmniVision sensor is used in a hardware design with a longer baseline (the horizontal distance between the left and right sensors), a wider sensor FOV and a fast global shutter that prioritizes high-speed depth capture over image quality. By contrast, the D415 model's slow rolling shutter and smaller sensor FOV, which lets in less light and so produces less image noise, result in around 2x less image noise (2x better image quality) and a 1280x720 optimal depth resolution.

OmniVision sensors are not designed specifically for RealSense camera products only. So any differences between the general OmniVision data sheet document and actual performance in a RealSense camera will likely be due to the sensor's integration into the overall hardware design of a particular camera model and its performance priorities.
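The comparison suggested above (1280x720 versus 848x480 in a particular project) could be run along the following lines. This is a minimal sketch, not an official test procedure: it assumes the camera is pointed squarely at a flat, textured wall and uses a simplified plane-fit RMS (perpendicular distance to a least-squares plane) over a central region of interest. The helper names `plane_fit_rms`, `capture_center_points` and `compare_resolutions` are hypothetical, and the ROI size and warm-up frame count are arbitrary choices. The camera access uses the `pyrealsense2` wrapper and is imported lazily so that the fitting code can be tried without hardware.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def plane_fit_rms(points):
    """RMS perpendicular distance of (x, y, z) points from the
    least-squares plane z = a*x + b*y + c fitted to them."""
    n = float(len(points))
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    # Normal equations A * [a, b, c] = rhs, solved with Cramer's rule.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    det = _det3(A)
    if abs(det) < 1e-12:
        raise ValueError("points are degenerate; cannot fit a plane")
    coeffs = []
    for col in range(3):
        m = [row[:] for row in A]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(_det3(m) / det)
    a, b, c = coeffs
    norm = math.sqrt(a * a + b * b + 1.0)
    sq = sum(((a * x + b * y + c - z) / norm) ** 2 for x, y, z in points)
    return math.sqrt(sq / n)


def capture_center_points(width, height, fps=30, roi_fraction=0.2, warmup=30):
    """Grab one depth frame and return 3D points (meters) from the central ROI."""
    import pyrealsense2 as rs  # lazy import: only needed when a camera is attached

    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, width, height, rs.format.z16, fps)
    pipeline.start(config)
    try:
        for _ in range(warmup):  # let auto-exposure settle
            pipeline.wait_for_frames()
        depth = pipeline.wait_for_frames().get_depth_frame()
        intrin = depth.profile.as_video_stream_profile().intrinsics
        half_w = int(width * roi_fraction / 2)
        half_h = int(height * roi_fraction / 2)
        points = []
        for v in range(height // 2 - half_h, height // 2 + half_h):
            for u in range(width // 2 - half_w, width // 2 + half_w):
                d = depth.get_distance(u, v)
                if d > 0.0:  # zero depth marks an invalid pixel
                    points.append(rs.rs2_deproject_pixel_to_point(intrin, [u, v], d))
        return points
    finally:
        pipeline.stop()


def compare_resolutions(modes=((848, 480), (1280, 720))):
    """Print the plane-fit RMS in mm for each depth mode."""
    for width, height in modes:
        pts = capture_center_points(width, height)
        rms_mm = plane_fit_rms(pts) * 1000.0
        print(f"{width}x{height}: {rms_mm:.2f} mm RMS over {len(pts)} points")
```

With a camera attached, calling `compare_resolutions()` prints one RMS figure per mode. Averaging over several frames and repeating at the working distance of the project would make the comparison more meaningful than a single frame.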

JeffR1992 commented 1 year ago

Hi @MartyG-RealSense,

Could you confirm with your RealSense engineering team if any downscaling is taking place during depth estimation at 1280x720?
If the baseline dictates what the optimal resolution should be, then I am still confused, since the D435 and D455 have very different baselines (and the D415 has a longer baseline than the D435 anyway), yet I've seen the statement that 848x480 is the optimal resolution for both sensors. Again, in what sense is the resolution of 848x480 optimal? Is it optimal in the sense of minimizing systematic depth error, random depth error, fill rate or some combination of these? From your answer above it sounds like maybe 848x480 is optimal in the sense of giving a higher signal-to-noise ratio when compared to 1280x720, but not necessarily giving better values for other quantities (e.g. using 848x480 compared to 1280x720 might worsen systematic depth error due to pixel quantization effects, especially at corners and edges of objects)?
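To make the three quantities in this question concrete, here is a small sketch of how they are commonly computed from depth readings of a flat target at a known distance. The definitions are simplified assumptions, not Intel's official metrics: fill rate as the fraction of non-zero readings, systematic error as the offset of the mean from the ground truth, and random error as the standard deviation about the mean. The function name `depth_metrics` and the sample values are hypothetical.

```python
import math


def depth_metrics(samples_m, ground_truth_m):
    """Summarize depth readings (meters) of a target at a known distance.

    A reading of 0.0 is treated as an invalid (unfilled) pixel.
    """
    if not samples_m:
        raise ValueError("no samples given")
    valid = [s for s in samples_m if s > 0.0]
    fill_rate = len(valid) / len(samples_m)
    if not valid:
        return {"fill_rate": 0.0, "systematic_error_m": None, "random_error_m": None}
    mean = sum(valid) / len(valid)
    variance = sum((s - mean) ** 2 for s in valid) / len(valid)
    return {
        "fill_rate": fill_rate,                      # fraction of valid pixels
        "systematic_error_m": mean - ground_truth_m,  # bias of the mean reading
        "random_error_m": math.sqrt(variance),        # spread about the mean
    }


# Made-up readings for a wall at 1.0 m, with one invalid pixel.
example = depth_metrics([1.0, 1.02, 0.0, 0.98], ground_truth_m=1.0)
print(example)
```

Evaluating these three numbers separately at 848x480 and at 1280x720 would show in which sense, if any, one resolution is better for a given scene.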

MartyG-RealSense commented 1 year ago

I had another look at the original 'idata' comment from 2018 and saw at the bottom of it that the real name of the Intel Communities support team member was Eliza. She was highly knowledgeable and experienced with RealSense, had access to internal reference resources and her advice could be strongly relied upon. So whilst I have not previously encountered the information about 1280x720 rectification that she provided (as RealSense camera rectification algorithms are a topic that is not publicly documented or discussed), if she is the source of the advice then I would accept it as accurate.
The baseline will be a factor in the camera's image quality but I would not say that it would solely dictate the optimal resolution. The baseline will primarily have an influence on the image's quality when depth-sensing at a particular distance from the camera.

An example of this would be very close range depth sensing. If a D435 was moved closer to a surface than its minimum depth sensing distance of 10 cm (such as 7 cm) then even if you could adjust the settings to obtain a depth image at that very close range, the length of the D435's baseline would cause the image to be blurred. With the D405 camera model however, which has a short baseline designed for close range sensing (ideal range 7 cm to 50 cm), high quality, high accuracy depth images can be obtained at 7 cm distance from a surface.
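One way to see how baseline relates to minimum sensing distance is the stereo relation MinZ = focal length in pixels × baseline / maximum disparity. The sketch below assumes a maximum disparity search range of 126 pixels, an 87 degree horizontal FOV for both cameras, and baselines of 50 mm (D435) and 18 mm (D405). These are nominal values assumed for illustration rather than figures from this thread, and the results will not match datasheet minimum distances exactly.

```python
import math


def focal_length_px(x_res, hfov_deg):
    """Focal length in pixels for a given X resolution and horizontal FOV."""
    return 0.5 * x_res / math.tan(math.radians(hfov_deg) / 2.0)


def min_z_mm(baseline_mm, x_res, hfov_deg=87.0, max_disparity_px=126):
    """Closest distance whose disparity still fits in the search range."""
    return focal_length_px(x_res, hfov_deg) * baseline_mm / max_disparity_px


# Assumed nominal baselines, for illustration only.
D435_BASELINE_MM = 50.0
D405_BASELINE_MM = 18.0

for name, baseline in (("D435", D435_BASELINE_MM), ("D405", D405_BASELINE_MM)):
    for x_res in (424, 848, 1280):
        print(f"{name} at X resolution {x_res}: MinZ ~ {min_z_mm(baseline, x_res):.0f} mm")
```

In this model the minimum distance scales linearly with both baseline and X resolution, which is consistent with a short-baseline camera reaching closer ranges and with lower depth resolutions allowing closer sensing on the same camera.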

There is no officially documented technical explanation, though, that expands upon the camera tuning paper's short statement of the optimal resolution. It is intended to be accepted as factual by the reader.

MartyG-RealSense commented 1 year ago

Hi @JeffR1992 Do you require further assistance with this case, please? Thanks!

MartyG-RealSense commented 1 year ago

Case closed due to no further comments received.

bkadlec commented 9 months ago

@JeffR1992 did you ever figure out if D435 rectification (and presumably the SGM disparity engine) runs at 1280x720 or 848x480 resolution?

MartyG-RealSense commented 9 months ago

Hi @bkadlec The discussion at the link below may be relevant to your question.

https://community.intel.com/t5/Items-with-no-label/D435-Resolution-VS-Depth-Image-Quality/td-p/548105

In that discussion, a RealSense support team member provided the following information.

The reason the Plane Fit RMS error increases with resolution on the D435 camera is because the active pixels on the D435 are 1280x800. When the camera returns depth, it uses rectified images. During the rectification process, the image gets downsized.

In order to get the 1280x720 resolution after rectification, the extra pixels are extrapolated. This is why the whitepapers say that the optimal depth resolution for the D435 is 848x480. Using a resolution higher than 848x480 should give you similar or slightly worse depth quality.

The optimal depth resolution of the D435 and D455 camera models is 848x480, but that image has been downsampled from 1280x480. The only RealSense 400 Series camera model that has 1280x720 as its optimal depth resolution instead of 848x480 is D415.

IntelRealSense / librealsense

Optimal depth resolution for D435 and D455 #11180
