ZJU-FAST-Lab / ego-planner-swarm

An efficient single/multi-agent trajectory planner for multicopters.

How is your depth image implemented? #53

Open · hildebrandt-carl opened this issue 1 year ago

hildebrandt-carl commented 1 year ago

Hi,

Thank you for sharing your artifact. This is excellent work. I have a question about how your depth image is implemented. I noticed that when I am using it in the original scenario, obstacles that are farther away are white, while obstacles close to the drone become black. I also noticed that both the floor and the sky are rendered as black (and therefore must have the same values). You can see this below:

[Image: depth image from the original scenario]

What confuses me is that the floor, the sky, and obstacles very close to the drone are all rendered as black (so they must have similar values in the depth image). If that is true, how is the drone able to avoid nearby obstacles and avoid flying into the ground?
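My working guess is that the floor and sky are black because they encode a raw value of 0, meaning "no valid return" (the sky is beyond the camera's range, and perhaps no floor geometry is rendered), while very close obstacles are merely dark because their nonzero depth is small; pixels with value 0 would then be masked out rather than treated as obstacles at 0 m. Below is a minimal decoding sketch under that assumption; the `16UC1` encoding, the millimetre scaling factor, and the range limit are guesses on my part:

```python
import numpy as np
from cv_bridge import CvBridge

bridge = CvBridge()

def decode_depth(msg, scaling_factor=1000.0, max_range=20.0):
    """Decode a 16UC1 sensor_msgs/Image into metres plus a validity mask.

    Assumes the camera writes 0 for pixels with no valid return, so a
    raw value of 0 must NOT be interpreted as an obstacle at 0 m.
    """
    raw = bridge.imgmsg_to_cv2(msg, desired_encoding="16UC1")
    depth = raw.astype(np.float32) / scaling_factor  # assumed mm -> m
    valid = (raw > 0) & (depth <= max_range)         # 0 == no measurement
    return depth, valid
```

Is this roughly what happens internally?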

The reason I am asking is that I am trying to use my own depth image. For example, here you can see a Unity simulation where the drone is highlighted by the red circle. Above that you can see both the image from the drone's camera and the generated depth image. The depth image is generated using the MiDaS monocular depth estimation algorithm.

[Image: Unity simulation with the drone circled in red, the onboard camera view, and the MiDaS depth image]

You will notice that both the obstacle and the floor in my example are registered as white in the depth image (as they are closer to the drone than the sky). This is the opposite of your approach, which renders closer obstacles as black. In my example, the drone flies into the obstacles (which can be explained by my depth image being the inverse of yours). However, before I go and invert my depth image, I was hoping you could provide a few details on how your depth image is constructed so that I can replicate it, i.e. why the floor and the sky have the same depth as obstacles that are close to the drone, etc.
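For completeness, the depth image above is produced roughly like this (a sketch following the public MiDaS `torch.hub` example; the model variant and file name are placeholders):

```python
import cv2
import torch

# Load the small MiDaS model and its matching input transform.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform

img = cv2.cvtColor(cv2.imread("camera_frame.png"), cv2.COLOR_BGR2RGB)
input_batch = transform(img)

with torch.no_grad():
    prediction = midas(input_batch)
    # Resize the prediction back to the original image resolution.
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze()

depth_map = prediction.cpu().numpy()  # MiDaS prediction as a numpy array
```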

Thanks again for this awesome work!

hildebrandt-carl commented 1 year ago

I have made some progress on this. Below I have attached a video where I inspect different depths in the image; the depth value is shown in the bottom right. Objects that are closer have smaller depth values (as we expect), and the closer an object is to the drone, the darker it appears (as it has a smaller depth value).

https://user-images.githubusercontent.com/26097148/226366954-d55a5759-d0ef-4a36-a93a-9a96d542412c.mp4
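In case it is useful to anyone else, the inspection in the video is just a small OpenCV mouse callback that prints the raw value under the cursor (a minimal sketch; the file name is a placeholder and a single-channel depth image is assumed):

```python
import cv2
import numpy as np

# Load the depth image unchanged so the raw values are preserved.
depth = cv2.imread("depth.png", cv2.IMREAD_UNCHANGED).astype(np.float32)

def on_mouse(event, x, y, flags, param):
    # Print the raw depth value under the cursor.
    if event == cv2.EVENT_MOUSEMOVE:
        print(f"pixel ({x}, {y}) -> value {depth[y, x]:.3f}")

cv2.namedWindow("depth")
cv2.setMouseCallback("depth", on_mouse)

# Normalise only for display; the printout uses the raw array above.
vis = cv2.normalize(depth, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imshow("depth", vis)
cv2.waitKey(0)
```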

Mine was the opposite (which is incorrect). It turns out MiDaS produces inverse depth maps, which explains why, in the video below, the part of the floor farthest away has the smallest depth value.

https://user-images.githubusercontent.com/26097148/226367834-cd273758-1daa-4d6a-8b19-097251a734bb.mp4

So I realize that my depth image is inverted and needs to be fixed. However, I am still not sure how your approach handles the floor, since in your simulation demo the floor is rendered as 0.
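The fix I have in mind is to invert the MiDaS output into a metric-style depth image and write 0 for pixels with no usable estimate, matching the floor/sky-as-0 convention above (a sketch; `scale` and `shift` are placeholders that would need calibrating against known distances in the scene):

```python
import numpy as np

def inverse_to_metric(inv_depth, scale=1.0, shift=0.0, max_range=20.0, eps=1e-6):
    """Convert a MiDaS relative inverse-depth map to metric-style depth.

    MiDaS output is only defined up to an affine transform, so `scale`
    and `shift` must be calibrated against known distances in the scene.
    """
    depth = 1.0 / np.maximum(scale * inv_depth + shift, eps)
    # Write 0 beyond the usable range, mirroring the convention where
    # a value of 0 means "no valid measurement".
    depth[depth > max_range] = 0.0
    return depth
```

Does that match how your planner expects the depth image to be formatted?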