DepthAnything / Depth-Anything-V2

Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
https://depth-anything-v2.github.io
Apache License 2.0

The precision of inference results #114

Closed SoulProficiency closed 2 months ago

SoulProficiency commented 2 months ago

Using the same image, vitb produces a completely black image, but vitl gives a normal depth map.

heyoeyo commented 2 months ago

If you get a fully black image, there's a good chance it's because one (or many) of the predicted values came out as infinity. You can potentially 'save' the rest of the data by just removing infinite values before any further processing.

You can do the following to remove infinite values:

import numpy as np

# Set infinite values to 0 to avoid further numerical errors
inf_mask = np.isinf(data)  # or use data.isinf() if it's a PyTorch tensor
data[inf_mask] = 0
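
Putting that together, here's a minimal end-to-end sketch (with made-up example data standing in for the model's prediction) of zeroing out infinities before normalizing the depth map to an 8-bit image for display:

```python
import numpy as np

# Hypothetical stand-in for a predicted depth map containing an inf value
data = np.array([[0.5, 2.0],
                 [np.inf, 1.0]], dtype=np.float32)

# Zero out infinite values so min/max and normalization stay finite
inf_mask = np.isinf(data)
data[inf_mask] = 0

# Normalize the remaining values to 0-255 for visualization
d_min, d_max = data.min(), data.max()
depth_u8 = ((data - d_min) / (d_max - d_min) * 255.0).astype(np.uint8)

print(depth_u8)
```

Without the masking step, the single inf makes `d_max` infinite and the whole normalized image collapses to zeros, which is exactly the all-black output described above.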

SoulProficiency commented 2 months ago

data[inf_mask] = 0

Thanks a lot, it's useful for me. (vitl is actually better than vits/vitb)

heyoeyo commented 1 month ago

(vitl is actually better than vits/vitb)

I find the same is true! If you haven't already tried it, increasing the 'input size' script parameter can also significantly improve the quality of the results:

python run.py --input-size 798

Sometimes vitb at a higher input size can produce nicer results than vitl (at the default size), while running just as fast.
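
One caveat when picking a custom input size: the ViT backbones here work on 14-pixel patches, so the resized input should be a multiple of 14 (798 = 57 × 14). A small hypothetical helper (the function name is my own, not part of the repo) that rounds a requested size to the nearest valid value:

```python
# Hypothetical helper: round a requested input size to the nearest multiple
# of the ViT patch size (assumption: 14-pixel patches, as in DINOv2 backbones)
def round_to_patch_multiple(size: int, patch: int = 14) -> int:
    return max(patch, round(size / patch) * patch)

print(round_to_patch_multiple(798))  # -> 798 (already 57 * 14)
print(round_to_patch_multiple(800))  # -> 798 (nearest multiple of 14)
```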