invoke-ai / InvokeAI

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, and serves as the foundation for multiple commercial products.
https://invoke-ai.github.io/InvokeAI/
Apache License 2.0
23.77k stars 2.44k forks

[bug]: DepthAnything has outliers #7358

Open simonfuhrmann opened 5 days ago

simonfuhrmann commented 5 days ago

Is there an existing issue for this problem?

Operating system

Linux

GPU vendor

Nvidia (CUDA)

GPU model

RTX 4070

GPU VRAM

12GB

Version number

5.4.2

Browser

Chrome

Python dependencies

No response

What happened

When running the Depth Anything processor (model size "base"; other model sizes are also affected), I get a depth map with many outliers. See the attached images. In particular, there are white pixels (not all at 255; some have smaller values) in places where they don't make sense.

[Attached images: depthanything, depthanything-color]

What you expected to happen

Get a depth map without outliers.

How to reproduce the problem

Create a trivial workflow with "Image Primitive" -> "Depth Anything Processor" and check "save to gallery" on the depth processor.

Additional context

No response

Discord username

No response

psychedelicious commented 5 days ago

We run the depth model directly via transformers.

If I run the same Depth Anything v2 small model using the non-transformers implementation via this script, I do not get the artifacts.

This might be a bug with transformers: https://github.com/huggingface/transformers/issues/28292
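To check whether the artifacts come from the transformers implementation rather than from anything Invoke does afterwards, the model can be run standalone. A minimal sketch, assuming the `depth-anything/Depth-Anything-V2-Small-hf` checkpoint on the Hugging Face Hub and the standard `depth-estimation` pipeline. This is not Invoke's actual invocation code; `estimate_depth` and its `pipe` parameter are names made up for illustration.

```python
import numpy as np

# Assumed checkpoint name; swap in whichever Depth Anything variant you want to test.
MODEL_ID = "depth-anything/Depth-Anything-V2-Small-hf"


def estimate_depth(image, pipe=None):
    """Return the predicted depth map as a NumPy array.

    `image` is a PIL image (or anything else the pipeline accepts).
    `pipe` lets the caller pass an already-built pipeline; it is a
    hypothetical convenience, not part of any Invoke API.
    """
    if pipe is None:
        # Imported lazily so this module loads without transformers installed.
        from transformers import pipeline

        pipe = pipeline(task="depth-estimation", model=MODEL_ID)
    result = pipe(image)
    # The depth-estimation pipeline returns a dict whose "depth" entry is a PIL image.
    return np.asarray(result["depth"])
```

Running this on a portrait-style image and comparing the output against the depth map saved from the Invoke workflow should show whether the white outlier pixels already appear at this stage.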

hipsterusername commented 5 days ago

Hey @simonfuhrmann - are these impacting generation when you use them as a control?

I’ve not found them to be super deleterious in my use, as the outliers occur on object edges, but I’ve not done explicit testing.

simonfuhrmann commented 5 days ago

I've not been using Depth Anything for control, but for stereogram generation (https://github.com/simonfuhrmann/invokeai-stereo). For this use case, the outliers are super detrimental and cause unpleasant artifacts when viewing.

It's also worth noting that these artifacts only appear where there is a large depth disparity between foreground and background (such as portraits), so not in all scenes.
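Until there is an upstream fix, one possible workaround for uses like this is to flag pixels that differ sharply from their local neighbourhood and replace them with the local median. The following is only a sketch under assumptions: the depth map is a single-channel 2D array, the outliers are small relative to the filter window, and the function names, `threshold`, and `radius` are illustrative values not taken from this thread. It is not part of Invoke or Depth Anything.

```python
import numpy as np


def local_median(depth: np.ndarray, radius: int = 1) -> np.ndarray:
    """Median over each pixel's (2*radius+1)^2 window, replicating edge pixels."""
    padded = np.pad(depth, radius, mode="edge")
    h, w = depth.shape
    size = 2 * radius + 1
    # One shifted view of the image per window position, stacked on a new axis.
    windows = [padded[dy:dy + h, dx:dx + w] for dy in range(size) for dx in range(size)]
    return np.median(np.stack(windows), axis=0)


def find_outliers(depth: np.ndarray, threshold: float = 40.0, radius: int = 1) -> np.ndarray:
    """Boolean mask of pixels that deviate from the local median by more than threshold."""
    values = depth.astype(np.float32)  # avoid uint8 wrap-around when subtracting
    return np.abs(values - local_median(values, radius)) > threshold


def suppress_outliers(depth: np.ndarray, threshold: float = 40.0, radius: int = 1) -> np.ndarray:
    """Return a copy of depth with flagged pixels replaced by the local median."""
    values = depth.astype(np.float32)
    median = local_median(values, radius)
    mask = np.abs(values - median) > threshold
    cleaned = depth.copy()
    cleaned[mask] = median[mask].astype(depth.dtype)
    return cleaned
```

Because only flagged pixels are replaced, genuine depth edges between foreground and background are left untouched. Clusters of outliers wider than the window would need a larger `radius`.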

hipsterusername commented 4 days ago

Makes sense for the use case! As noted above, we’ll have to look upstream for specific fixes.

An alternative worth looking into is upgrading depth processing to a newer model such as Lotus.

simonfuhrmann commented 4 days ago

I appreciate the effort of the team to look into this.

Transformers also seems to support Depth Anything v2. Is it worth giving this a shot? This is potentially the most straightforward path to updating the depth model. There is also this issue, so it would be a feed-two-birds-with-one-scone situation.

psychedelicious commented 4 days ago

@simonfuhrmann We support both actually - well, only the small v2 variant due to licensing. Both have the same issue in my testing.