Hi,
The depth is computed from the stereoscopic images of the ZED, using the ZED SDK. It's independent of the 2D detector YOLO, which is typically trained on 2D image datasets like COCO and ImageNet. Each object's depth is then extracted by combining its 2D position with the computed dense depth map.
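For illustration, a minimal Python sketch of that pipeline with the ZED Python API (the bbox values are made up, and a real application should handle the NaN holes discussed further down):

```python
import pyzed.sl as sl

zed = sl.Camera()
init = sl.InitParameters(depth_mode=sl.DEPTH_MODE.ULTRA,
                         coordinate_units=sl.UNIT.METER)
if zed.open(init) != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError("Could not open the ZED")

image, depth = sl.Mat(), sl.Mat()
if zed.grab() == sl.ERROR_CODE.SUCCESS:
    zed.retrieve_image(image, sl.VIEW.LEFT)        # the frame fed to the 2D detector
    zed.retrieve_measure(depth, sl.MEASURE.DEPTH)  # dense depth, registered to the left image

    # Hypothetical (x, y, w, h) bbox returned by YOLO on the left image
    x, y, w, h = 300, 200, 80, 120
    err, z = depth.get_value(x + w // 2, y + h // 2)  # depth at the bbox center, in meters
    print("object depth: %.2f m" % z)

zed.close()
```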
I don't have a specific dataset in mind. If you want, you could significantly modify the network to take inputs such as image and depth; in that case you could use RGB-D datasets, simulation, or depth from monocular images. Or you could directly estimate the 3D objects from the stereo (or mono) images, similar to this: https://arxiv.org/pdf/1909.07566.pdf
CenterNet includes 3D bbox detection on KITTI with a dedicated model (plus a lot of 2D model variants, for keypoints or object detection).
Thank you for your answer. Now I know how it works.
Hello again,
I am facing some problems and I want to see if there is any solution for them:
I hope you could help with these problems.
The depth is not guaranteed to be completely dense. Sometimes there can be holes where the object is and the 3D information is unavailable.
These holes usually appear when there are occlusions, if the object is too far or too close, if the image is saturated (too bright) or too dark, or if there's not enough texture to estimate the correlation between the left and right images.
You can tweak the way the object's 3D position is extracted, either by taking a bigger search radius or by lowering some thresholds. The extraction function is here: https://github.com/AlexeyAB/darknet/blob/42d08fd820335584365d393da3967853676a8c35/src/yolo_console_dll.cpp#L38-L93
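For reference, a rough Python equivalent of that extraction step (this is a re-sketch of the idea, not the linked C++ code; it assumes invalid depth pixels are NaN, as the SDK returns them):

```python
import numpy as np

def median_depth(depth_map, cx, cy, radius=10):
    """Median of the valid depths in a square window around (cx, cy).

    A bigger `radius` makes the estimate more robust to holes in the
    depth map, at the cost of mixing in background pixels.
    """
    h, w = depth_map.shape[:2]
    x0, x1 = max(cx - radius, 0), min(cx + radius + 1, w)
    y0, y1 = max(cy - radius, 0), min(cy + radius + 1, h)
    window = depth_map[y0:y1, x0:x1]
    valid = window[np.isfinite(window)]  # drop NaN/inf holes
    return float(np.median(valid)) if valid.size else float("nan")
```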
Hello, the depth displayed numerically (e.g. 10.2 m) corresponds to which pixel inside the YOLO bounding box? The center?
@christiantheriault Yes the depth corresponds to the median depth around the center (the radius is currently 10 pixels)
Thank you. One more question! I have a research project on stereo vision and distance computation, and I will most likely order the ZED camera. Using depth, we could probably get the code to output the "real life" width and height of the YOLO bounding box, i.e. the actual width and height of the real object. Right?
Yes, that's how the object detection module in the SDK 3.0 works.
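As a back-of-the-envelope check, the pinhole model gives the conversion: a span of w_px pixels at depth Z covers roughly w_px * Z / fx meters. A minimal sketch (fx/fy are the calibrated focal lengths; the numbers in the example are made up):

```python
def bbox_metric_size(w_px, h_px, depth_m, fx, fy):
    """Approximate real-world size of a bounding box seen at depth_m.

    Assumes the object is roughly fronto-parallel to the camera;
    fx and fy come from the camera calibration (in pixels).
    """
    return w_px * depth_m / fx, h_px * depth_m / fy

# e.g. a 200x400 px box at 3 m with fx = fy = 700 px:
print(bbox_metric_size(200, 400, 3.0, 700.0, 700.0))  # ~(0.86, 1.71) meters
```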
That's great! Using the YOLO detector?
Hi @adujardin, I have a question regarding depth calculation. I have SVO files from the ZED2 camera, which I exported to corresponding png images (left, right & depth map) using a python export script.
However, I'm not sure how to calculate the depth, i.e. the distance of every pixel from the camera. Currently I'm using this formula: depth = baseline * focal / disparity, with baseline = 12 cm, focal = 1000 pixels, and disparity = the pixel values from the depth map. I'm getting very weird depth values; some go up to 60 meters. Attached are the corresponding images. I'm trying to calculate the depth of fish in the image.
@akshayklr057 If you exported the depth as png then it's already the depth value in metric units (millimeters, to fit the png value range).
If it's the disparity, it won't work, as float values are needed and png can only store 16-bit integers (0-65k). I suggest you use the .exr format, natively supported by OpenCV, to save the disparity values (or a numpy array).
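For example, a small sketch of the .exr round trip (the env var is only needed on OpenCV builds that gate EXR I/O; the disparity array is a stand-in):

```python
import os
os.environ["OPENCV_IO_ENABLE_OPENEXR"] = "1"  # some OpenCV builds require this for EXR I/O
import cv2
import numpy as np

disparity = np.random.rand(720, 1280).astype(np.float32)  # stand-in for the real disparity map

cv2.imwrite("disparity.exr", disparity)                   # lossless float storage
back = cv2.imread("disparity.exr", cv2.IMREAD_UNCHANGED)
assert back.dtype == np.float32

np.save("disparity.npy", disparity)                       # the numpy-array alternative
```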
Your formula is correct, but the focal length is not a fixed value; it's a calibrated parameter that depends on both the resolution and the camera used. You need to get the exact value either from the calibration file (typically in /usr/local/zed/resources/) or from the API (sl.Camera.get_camera_information().calibration_parameters.left_cam.fx, see https://www.stereolabs.com/docs/api/python/classpyzed_1_1sl_1_1CameraParameters.html).
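Putting it together, a minimal sketch that reads fx from the API and applies the formula (the 20 px disparity is a made-up value; with fx = 1000 px and the 12 cm baseline it would give 0.12 * 1000 / 20 = 6 m):

```python
import pyzed.sl as sl

zed = sl.Camera()
if zed.open(sl.InitParameters()) != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError("Could not open the ZED")

calib = zed.get_camera_information().calibration_parameters
fx = calib.left_cam.fx       # calibrated focal length, in pixels
baseline_m = 0.12            # ZED stereo baseline, 120 mm

disparity_px = 20.0          # made-up value for illustration
depth_m = baseline_m * fx / disparity_px
print("fx = %.1f px -> depth = %.2f m" % (fx, depth_m))
zed.close()
```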
@adujardin Did you see the images I provided above? I exported the depth to a png image using zed-examples/python/export.py with mode=3.
I need to get the metric depth in python from those exported png files. Could you please guide me on how to do that?
Right now I'm reading those depth images with OpenCV's imread(), which returns the png pixel values. Treating that pixel range as disparity, I used the values in the above-mentioned formula.
Or maybe you could tell me how exactly I can get the depth in python if I only have .svo files.
I see, you saved the depth as a normalized image, but you need the actual values. You can use the same sample with mode 4; it will output the depth as a 16-bit png in millimeters (the value is directly the depth in mm, from 0 to 65 m).
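Reading that back in python then comes down to one imread flag (the filename here is hypothetical):

```python
import cv2

# IMREAD_UNCHANGED keeps the 16-bit values instead of squashing them to 8 bits
depth_mm = cv2.imread("depth_000000.png", cv2.IMREAD_UNCHANGED)  # uint16, millimeters
depth_m = depth_mm.astype("float32") / 1000.0                    # convert to meters
h, w = depth_m.shape
print("center pixel depth: %.2f m" % depth_m[h // 2, w // 2])
```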
Thank you for the solution, it does work now.
Hi,
I am using the ZED camera with YOLO and everything is working perfectly, but I have a question about the depth. Is the depth calculated in post-processing with the ZED camera, or does the model used by YOLO have depth integrated into it? Is it a basic 2D YOLO model, or a special model for the integration with the ZED camera?
And are there any 2.5D models/datasets to train on, or any repository that could help me with that kind of training?