LiheYoung / Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
https://depth-anything.github.io
Apache License 2.0

Can anyone provide some assistance with converting depth to a point cloud? #93

Open amartincirera opened 4 months ago

amartincirera commented 4 months ago

Hello,

Thank you very much for your work; it truly produces spectacular depth images.

I am trying to generate the point cloud and wondering if I have done it correctly. I have obtained the following images using the depth anything model:

image image

When generating the point cloud with depth_to_pointcloud.py using the above images, I get the following results:

image image

The point cloud of the building seems correct, but there are some black points at the top that I believe should be farther away but appear as if they are close. In the other point cloud, with the horse, I don't think the horse looks correct. Am I doing something wrong?

Thank you so much!

slyfooox666 commented 4 months ago

Hi, are you using relative depth estimation or metric depth estimation? If it's the former, the generated points may be incorrect because the depth values are not real (metric) depths.

hgolestaniii commented 4 months ago

If you use "relative depth estimation", your results are not metric/absolute, so you may see wrong depth values. If you use "metric depth estimation", you need to know which trained model you are using.

You should not expect to get RIGHT depth values if you do not train your own model.
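To make the relative-vs-metric distinction concrete, here is a minimal sketch (an illustration, not code from this repo): the relative prediction relates to true depth only through an unknown per-image scale and shift, so no fixed formula turns it into metres.

```python
import numpy as np

# Illustration only: 'pred' stands for the relative (affine-invariant) output.
# Assume true inverse depth relates to it as pred ~ a * (1 / depth_m) + b,
# where the scale a and shift b are unknown and differ from image to image.
pred = np.array([0.95, 0.50, 0.10])

# Treating the relative values as if they were already metric distorts geometry:
naive_depth = 1.0 / np.clip(pred, 1e-6, None)

# Only if a and b were known (here assumed purely for illustration) could
# metric depth be recovered:
a, b = 2.0, 0.05
metric_depth = 1.0 / np.clip((pred - b) / a, 1e-6, None)

print(naive_depth)   # arbitrary, distorted units
print(metric_depth)  # metres, but only because a and b were assumed
```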

amartincirera commented 4 months ago

Hi, I'm using relative depth estimation and haven't applied any transformation to convert it to metric depth. I saved the relative depth estimation image from line 91 of run.py:

cv2.imwrite(os.path.join(args.outdir, filename[:filename.rfind('.')] + '_depth.png'), depth)

After saving the image, I simply executed depth_to_pointcloud.py.

The thing is that the relative depth estimation image seems correct; I can see the walls and correctly identify the horse. However, the point cloud doesn't seem to align with the relative depth estimation image. The red and yellow points should be closer, but they're not. Should I first convert it to a metric depth estimate?

Am I doing something else wrong?

Thanks!!!!

hgolestaniii commented 4 months ago

If relative depth is enough, then you may upload your input image, your estimated depth image, and the point cloud here for investigation. You may only need to rotate the point cloud and view it from the back. BTW, depth_to_pointcloud.py does METRIC depth estimation AND point cloud generation. Did you do the depth estimation yourself and only use a small part of depth_to_pointcloud.py for the 3D projection?
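For reference, the point-cloud part of that pipeline is a standard pinhole back-projection. Below is a minimal sketch, not the repo's own code; the focal length and principal point are placeholder values and should come from your camera calibration (or from the constants used in depth_to_pointcloud.py):

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a metric depth map (H x W, in metres) into an (H*W) x 3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Placeholder intrinsics for illustration only.
pts = depth_to_points(np.ones((480, 640), dtype=np.float32),
                      fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(pts.shape)  # (307200, 3)
```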

amartincirera commented 4 months ago

Hi @hgolestaniii,

Thanks for your help!

I used Depth Anything to get the relative depth estimation, without any training, and then I ran depth_to_pointcloud.py (the whole script).

Input image: image

Relative depth estimation using depth anything model (run.py): image

Point cloud using depth_to_pointcloud.py: image

image

hgolestaniii commented 4 months ago

1- What you get from run.py is a "colorized depth map". It is just for visualization; the pixel values are not depth values. If you want to get depth values, you should use "--grayscale --pred-only" when calling run.py.

2- The script "depth_to_pointcloud.py" reads input RGB images from a local folder (INPUT_DIR = './my_test/input') and predicts METRIC depth (not relative) using a pre-trained network (like indoor NYU). You may be using the script wrongly. My suggestion is to read the "depth_to_pointcloud.py" code again and make sure you know what you are running.
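As a hedged example of the first point (the file name below is hypothetical, and any flags beyond "--grayscale --pred-only" are assumptions based on the snippet quoted earlier):

```python
# Assumed invocation; adjust paths and flags to match your local run.py:
#   python run.py --outdir ./my_test/depth --grayscale --pred-only ...
import cv2

# run.py saves '<name>_depth.png' (see the cv2.imwrite call quoted above).
depth_png = cv2.imread('./my_test/depth/horse_depth.png', cv2.IMREAD_UNCHANGED)

# Note: this is an 8-bit image, so the values are quantized (and, as far as I
# can tell, normalized per image) -- fine as relative depth, but not metric
# distances. For metric depth, run depth_to_pointcloud.py on the RGB input.
print(depth_png.dtype, depth_png.min(), depth_png.max())
```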

amartincirera commented 4 months ago

I followed your instructions to obtain the relative depth estimation correctly.

What I was doing wrong was that I was placing the relative depth estimation image in the folder "INPUT_DIR = './my_test/input'". When I use the original image instead, I correctly obtain the point cloud.

image

Thank you very much for your help.

vidit98 commented 2 months ago

Hi @amartincirera, I am also trying to get a point cloud from relative depth but am facing issues. I guess the output from the network is disparity; did you convert it to depth simply by 1/output_from_net? Did you normalize the values before converting them to depth? It would be great if you could share the script to get the point cloud from relative depth. Thanks!
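For what it's worth, one common approach (a sketch of a possible answer, not something confirmed by the authors) is to treat the relative output as disparity, normalize it, invert it, and accept that the result is only correct up to an unknown scale:

```python
import numpy as np

def pred_to_pseudo_depth(pred, eps=1e-6):
    """Hypothetical conversion: treat the relative prediction as disparity
    (larger value = closer), normalize to [0, 1], and invert. The result is
    pseudo-depth in arbitrary units, not metres."""
    disp = (pred - pred.min()) / (pred.max() - pred.min() + eps)
    return 1.0 / np.clip(disp, eps, None)
```

The same pinhole back-projection sketched earlier in this thread can then be applied to the pseudo-depth, but the overall scale (and any shift) stays unknown, so the resulting cloud will not be metric.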