Question about general use in any case

LearnTechWithUs / Stereo-Vision

This program has been developed as part of a project at the University of Karlsruhe in Germany. The final purpose of the algorithm is to measure the distance to an object by combining two webcams and use them as a Stereo Camera.

MIT License


Question about general use in any case #15

Open Petros626 opened 1 year ago

Petros626 commented 1 year ago

Hello,

I would like to know whether this script can measure the distance to any object shown in a stream, no matter where it is located. If not, what adjustments to the code are necessary?

I plan to use a stereo camera and a pretrained CNN that detects objects. In addition, I want to measure the distance to each detected object.

Thanks in advance

shanearthur commented 1 year ago

Hey Petros626,

The script produces a two-dimensional array of values that represent the depth of each pixel in the scene. However, due to the nature of this traditional method, not every pixel will be assigned a value: some ill-posed regions (areas where the algorithm has a hard time determining depth) will be left empty.

[Image: basicblockmatching1]

As long as the object is within the frame of your stream and the depth map contains values for the pixels that belong to your object, you will be able to estimate its depth. I'd recommend averaging over a small kernel of pixels, sized according to the kind of object.
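The kernel averaging suggested above could look roughly like this with NumPy. This is a minimal sketch, not code from the repository: the function name `mean_disparity`, the `half_size` parameter, and the convention that non-positive or non-finite values mark empty pixels are all assumptions.

```python
import numpy as np

def mean_disparity(disparity, x, y, half_size=2):
    """Average the valid disparities in a square window centred on (x, y).

    Assumption: pixels without a match are stored as values <= 0, NaN or inf.
    Returns None if the window contains no valid pixel.
    """
    h, w = disparity.shape
    # Clip the window to the image borders (x is the column, y is the row).
    y0, y1 = max(0, y - half_size), min(h, y + half_size + 1)
    x0, x1 = max(0, x - half_size), min(w, x + half_size + 1)
    window = disparity[y0:y1, x0:x1]
    # Keep only pixels that actually carry a disparity value.
    valid = window[np.isfinite(window) & (window > 0)]
    return float(valid.mean()) if valid.size else None
```

For a detected object, `(x, y)` could be the centre of the bounding box returned by the detector, with `half_size` chosen so that the window stays inside the object.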

Note: the resulting values will be disparity values, which you will use to calculate metric depth with depth = baseline * focal_length / disparity. For more see here.
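As a rough illustration of the formula above, the conversion can be applied to a whole disparity map at once. This is only a sketch under assumptions: the function name `disparity_to_depth` is made up, the baseline is assumed to be in metres and the focal length in pixels, and pixels with a disparity of zero or less are treated as empty.

```python
import numpy as np

def disparity_to_depth(disparity, baseline_m, focal_length_px):
    """Convert disparities (pixels) to depth using baseline * focal / disparity.

    Assumption: disparities <= 0 mean "no match"; they become NaN in the output.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.nan)
    valid = disparity > 0
    # Depth comes out in the same unit as the baseline (metres here).
    depth[valid] = baseline_m * focal_length_px / disparity[valid]
    return depth
```

The baseline and focal length have to come from your own stereo calibration. The disparity must also be in real pixel units, so any scaling or normalisation applied to the matcher output has to be undone first.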

Petros626 commented 1 year ago

Thank you for your detailed answer. Most of the algorithms I saw relied on a known object size, which wasn't what I was looking for. So if the distance can be calculated with the formula you mentioned, why does the code use specific hard-coded values here:

https://github.com/LearnTechWithUs/Stereo-Vision/blob/597d9e5d4fdeb96583339365f41bf9488ea5c9fd/Main_Stereo_Vision_Prog.py#L37

shanearthur commented 1 year ago

why in [the] code specific values are made here

Those are likely the specific parameters provided by datasets which the author was using to test this code. They may also be customized to compensate for the averaging being done two lines above your referenced line, where the author is getting the average disparity value of a small kernel of pixels around the pixel in question: https://github.com/LearnTechWithUs/Stereo-Vision/blob/597d9e5d4fdeb96583339365f41bf9488ea5c9fd/Main_Stereo_Vision_Prog.py#L35

Vujas-Eteph commented 9 months ago

Yes, as @shanearthur said, those values (polynomial coefficients) were estimated from a custom dataset (similar to issue https://github.com/LearnTechWithUs/Stereo-Vision/issues/5). When and how we estimated those parameters is shown in this section of the YouTube video. We plotted a curve with the distance on the x-axis and the disparity values on the y-axis (if I remember correctly). Afterward, we did a polynomial regression of degree 3 (known from the literature) in Excel, but you can do the same with Python libraries to optimize the values.
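The degree-3 regression described above can be done in Python with `numpy.polyfit`. This is a sketch only: the calibration samples are synthetic (generated from a made-up cubic, not measured), and fitting distance as a function of disparity is an assumption about which variable is the input. With real data, you would replace `disparities` and `distances` with your own measurements taken at known distances.

```python
import numpy as np

# Hypothetical calibration data: made-up cubic, NOT the repository's coefficients.
true_coeffs = np.array([-2.0, 5.0, -6.0, 4.0])
disparities = np.linspace(0.2, 1.0, 9)             # averaged disparity per sample
distances = np.polyval(true_coeffs, disparities)   # "measured" distance per sample

# Least-squares fit of a degree-3 polynomial, highest power first.
coeffs = np.polyfit(disparities, distances, deg=3)

def distance_from_disparity(d):
    """Evaluate the fitted polynomial at disparity d."""
    return float(np.polyval(coeffs, d))
```

The fitted polynomial is only trustworthy inside the range of distances covered by the calibration samples. It also has to be re-estimated whenever the cameras, their spacing, or the matcher settings change.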