DepthAnything / Depth-Anything-V2

Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
https://depth-anything-v2.github.io
Apache License 2.0
2.99k stars, 223 forks

Metric Depth Focal Length #52

Open connorrich opened 1 month ago

connorrich commented 1 month ago

I am trying to use my own parameters for the depth_to_pointcloud method. Specifically, my image is 4000x6000 and my focal length is 18 millimeters. When I input these parameters, I get an extremely distorted point cloud. The focal length values in the provided depth_to_pointcloud method are much larger. The provided code has:

# Global settings
FL = 715.0873
FY = 784 * 0.6
FX = 784 * 0.6
NYU_DATA = False
FINAL_HEIGHT = 518
FINAL_WIDTH = 518

Do I have to convert my focal length to something else for my specific case?

Edric-star commented 1 month ago

@connorrich Hi, if you keep the default NYU_DATA as False, the code uses FX and FY, so you should replace 784 * 0.6 with your own value, 18. Also modify the following code, which computes the camera coordinates, by replacing the original FINAL_WIDTH / 2 and FINAL_HEIGHT / 2 with your CX and CY intrinsic parameters. The result should be much better.

        x = (x - CX) / focal_length_x
        y = (y - CY) / focal_length_y
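For context, the two lines above are the normalization step of a standard pinhole back-projection. The following is a minimal sketch of that model in plain Python, not the repository's actual implementation. The function names `backproject_pixel` and `depth_to_points` are hypothetical, and the intrinsics in the example are made-up values chosen only to keep the arithmetic easy to follow. It assumes fx, fy, cx and cy are all expressed in pixels and depth is in meters.

```python
def backproject_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with metric depth to a 3D point in the camera frame.

    fx, fy, cx, cy are assumed to be in pixels; depth_m is in meters.
    """
    x = (u - cx) / fx * depth_m
    y = (v - cy) / fy * depth_m
    return (x, y, depth_m)


def depth_to_points(depth_rows, fx, fy, cx, cy):
    """Back-project a row-major depth map (a list of rows) into 3D points."""
    return [
        backproject_pixel(u, v, z, fx, fy, cx, cy)
        for v, row in enumerate(depth_rows)
        for u, z in enumerate(row)
    ]


# Tiny 2x2 depth map with every pixel 2 m away.
# The intrinsics below are illustrative values, not real camera parameters.
points = depth_to_points([[2.0, 2.0], [2.0, 2.0]], fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Because the pixel offset (u - cx) is divided by the focal length, the two must share the same unit. Mixing a pixel offset with a focal length in millimeters stretches x and y relative to z, which is one plausible cause of a distorted point cloud.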

Btw, I'm struggling with my own outputs: the metric depth predictions for my 1920x1536 images deviate significantly from expectations, which I guess is due to the resolution of the images. Has anyone else input higher-resolution images, for example 1920x1536, and gotten relatively good metric depth predictions? Please let me know how you handled it.

connorrich commented 1 month ago

I'm thinking that we have to convert our focal length into pixel units. That seems like the most probable answer.
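A minimal sketch of that conversion, assuming the usual relation f_px = f_mm * image_size_px / sensor_size_mm. The thread does not state the camera's sensor size, so the 23.5 mm x 15.6 mm values below are purely an assumption for illustration, as is reading 4000x6000 as height x width. The function name `focal_length_mm_to_px` is hypothetical.

```python
def focal_length_mm_to_px(focal_mm, sensor_size_mm, image_size_px):
    """Convert a focal length in millimeters to pixels along one image axis."""
    if sensor_size_mm <= 0 or image_size_px <= 0:
        raise ValueError("sensor size and image size must be positive")
    return focal_mm * image_size_px / sensor_size_mm


# ASSUMPTION: sensor dimensions are not given in the thread; these are
# placeholder values. Look up the real ones for your camera.
SENSOR_W_MM, SENSOR_H_MM = 23.5, 15.6

# ASSUMPTION: the 4000x6000 image is 4000 px high and 6000 px wide.
IMAGE_W_PX, IMAGE_H_PX = 6000, 4000

fx = focal_length_mm_to_px(18.0, SENSOR_W_MM, IMAGE_W_PX)
fy = focal_length_mm_to_px(18.0, SENSOR_H_MM, IMAGE_H_PX)
print(f"fx = {fx:.1f} px, fy = {fy:.1f} px")
```

With these assumed sensor dimensions, both values come out in the thousands of pixels rather than 18. If the point cloud is built at a different resolution than the original image, the pixel focal lengths and the principal point scale by the same ratio as the image.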

1ssb commented 1 month ago

This has now been updated. The testing is a bit limited on my machine which is headless, kindly check if you can get the results, I will otherwise make the requisite updates.

connorrich commented 1 month ago

I'll be testing this once I get a chance. Quick question, what are the units of the focal_length_x(y)? Is this in pixels or mm?

Edric-star commented 1 month ago

Hello, I have just seen your updated script; thanks for all the contributions. I want to ask whether some of the code might be redundant. After the model outputs the prediction with pred = depth_anything.infer_image(image, height), I printed its shape, which matches my original input shape. So I suppose it's not really necessary to resize the prediction with resized_pred = Image.fromarray(pred).resize((width, height), Image.NEAREST). Sorry to bother you. I know it won't make a difference to my final results, but since you didn't delete this line, I'm not sure whether I'm missing something important.

1ssb commented 1 month ago

Please configure it as needed. The release cannot be tailored to every specific requirement.

Distances are in meters and focal length is the same units as your image space, generally mm.

Best Subhransu

