DepthAnything / Depth-Anything-V2

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
https://depth-anything-v2.github.io
Apache License 2.0

Harsh white lines appear around center object #111

Open · fewjative opened this issue 3 months ago

fewjative commented 3 months ago

Problem: harsh white lines surround the object.
Expected: no harsh white lines; they don't make sense from a depth point of view either.

I'm using the code found here: https://huggingface.co/docs/transformers/main/en/model_doc/depth_anything_v2

from PIL import Image
from transformers import pipeline

image_path = "cars-cache/1.jpg"
baseimage = Image.open(image_path)

# Depth Anything V2 (Small) via the Hugging Face depth-estimation pipeline, on GPU 0
depthpipe = pipeline(task="depth-estimation", model="depth-anything/Depth-Anything-V2-Small-hf", device=0)
depth = depthpipe(baseimage)["depth"]
depth.save("pipedepth.png")
python_version: "3.10.6"

python_packages:
  - torch==2.2.0
  - torchvision==0.17.0
  - torchaudio==2.2.0
  - torchsde
  - einops
  - transformers>=4.25.1
  - safetensors>=0.3.0
  - aiohttp
  - accelerate
  - pyyaml
  - Pillow
  - scipy
  - tqdm
  - psutil
  - websocket-client==1.6.3
  - imutils
  - requests_toolbelt
  - kornia

Running on an Nvidia 4090. This happens both locally and when I run the code in the cloud via Replicate.

Output:

[pipedepth.png: the pipeline's depth map, showing the white outline around the object]

Input image (real photo, non-AI):

[1.jpg]

heyoeyo commented 3 months ago

I would guess this is a rounding / out-of-bounds value issue: values that are too low (i.e. < 0) are being wrapped around to the max value (255), which appears as that white outline.
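You can see the wraparound with a tiny numpy example (the values here are made up just to illustrate the cast):

import numpy as np

# Hypothetical raw prediction with a slight negative overshoot
# (e.g. ringing introduced by bicubic upsampling)
output = np.array([-2.0, 0.0, 128.0, 250.0])

# Scaling by 255 / max leaves the negative value negative...
scaled = output * 255 / np.max(output)

# ...and casting to uint8 wraps it around to a near-255 (white) value.
# The exact result of casting a negative float is platform-dependent,
# but in practice it lands near 255.
print(scaled.astype("uint8"))  # -> [254   0 130 255] on a typical setup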

I'm not familiar with the huggingface stuff, but judging from that code sample, I don't think there's anything you can do to fix it (the error is happening inside the 'depthpipe' code). However, that same page has a larger 'Using the model yourself' section that you can copy and modify to fix the problem. The main change you'd want to make is on the second-to-last line:

# Original code
formatted = (output * 255 / np.max(output)).astype("uint8")

# Change to:
formatted = (255 * ((output - np.min(output)) / (np.max(output) - np.min(output)))).astype("uint8")

This accounts for values dropping below zero, which is what causes that white line.

The other potential change is to switch the interpolation mode from "bicubic" to "bilinear" in the torch.nn.functional.interpolate(...) call. That change alone might actually take care of the problem in this case, but the first fix is the more important one; I'd recommend making both.
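For reference, here's roughly what the 'Using the model yourself' snippet looks like with both changes applied (untested sketch; I've just dropped in the model id and image path from your report):

import numpy as np
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForDepthEstimation

image = Image.open("cars-cache/1.jpg")

image_processor = AutoImageProcessor.from_pretrained("depth-anything/Depth-Anything-V2-Small-hf")
model = AutoModelForDepthEstimation.from_pretrained("depth-anything/Depth-Anything-V2-Small-hf")

inputs = image_processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    predicted_depth = outputs.predicted_depth

# Fix 2: "bilinear" instead of "bicubic" avoids overshoot (ringing) at sharp edges
prediction = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],
    mode="bilinear",
    align_corners=False,
)

# Fix 1: min-max normalization, so values below zero can't wrap around in uint8
output = prediction.squeeze().cpu().numpy()
formatted = (255 * (output - np.min(output)) / (np.max(output) - np.min(output))).astype("uint8")
depth = Image.fromarray(formatted)
depth.save("pipedepth.png")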

fewjative commented 3 months ago

Thanks for the ideas @heyoeyo

Both of the ideas you suggested ended up resolving the problem for me.