Tencent / DepthCrafter

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
https://depthcrafter.github.io

Which DepthAnything models are being compared? #4

Closed · TouqeerAhmad closed this 1 month ago

TouqeerAhmad commented 1 month ago

Hello, great work!

Can you please clarify which DepthAnything and DepthAnything-V2 models are used for comparison in Table 1 of the paper? Also, there is no detail on the inference speed of the model. Can you please specify how long it would take to infer 110 frames on a consumer-level GPU? And beyond that, how much extra cost does the devised inference strategy incur for videos longer than 110 frames?

Looking forward to the code release!

donrikk commented 1 month ago

@TouqeerAhmad From working with the Depth Anything models extensively, it looks to me like either the V2 base or the V2 large model is being used for the comparison; the small model is much more flicker-prone, so clips like those couldn't be produced with it without further post-work.

TouqeerAhmad commented 1 month ago

@donrikk Thank you, I thought the same. Though the base and large models are released under restrictive licenses, so they are not generally usable.

donrikk commented 1 month ago

> @donrikk Thank you, I thought the same. Though the base and large models are released under restrictive licenses, so they are not generally usable.

No problem! And you are correct about the licensing situation, but if the maps are used non-commercially, in other words in an open-source, non-profit project, then it falls within Depth Anything's license restrictions. Also, programs that let you import custom depth maps allow you to work around that license as well: you obtain your maps through the open-source models and then use them in your subscription-based converter. Or, if you're intuitive enough, you can use them directly in DaVinci Resolve. Overall, the license for the base and large models is pretty lenient in this sense.

TouqeerAhmad commented 1 month ago

I like this: "intuitive enough you can use them directly on davinci resolve"

donrikk commented 1 month ago

> I like this: "intuitive enough you can use them directly on davinci resolve"

🤣 I hate to say it, but for a lot of people it's a hefty task lol. But we have many "tinkerers", as we call them, in our Discord, including myself, who try to play around and figure things out. I've got an amazing workflow going, and these maps would reduce my wait time on conversions by DAYS! So I'm beyond excited for these maps. I've currently been blending depth-map models together using some Python scripts I whipped up with the help of ChatGPT.
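(The scripts themselves aren't shared in the thread, but a minimal sketch of what blending two models' depth maps could look like is below, assuming both maps are same-size single-channel images and that the blend is a simple per-pixel weighted average; the file names and weight are illustrative, not from the thread.)

```python
import numpy as np
import cv2

def normalize(depth: np.ndarray) -> np.ndarray:
    """Rescale a depth map to [0, 1] so maps from different models are comparable."""
    d = depth.astype(np.float32)
    return (d - d.min()) / (d.max() - d.min() + 1e-8)

def blend(depth_a: np.ndarray, depth_b: np.ndarray, weight: float = 0.5) -> np.ndarray:
    """Per-pixel weighted average of two normalized depth maps of the same size."""
    a, b = normalize(depth_a), normalize(depth_b)
    return weight * a + (1.0 - weight) * b

# Hypothetical file names; any two same-size single-channel depth maps work.
a = cv2.imread("model_a_depth.png", cv2.IMREAD_GRAYSCALE)
b = cv2.imread("model_b_depth.png", cv2.IMREAD_GRAYSCALE)
out = (blend(a, b, weight=0.5) * 65535.0).astype(np.uint16)
cv2.imwrite("blended_depth.png", out)  # 16-bit PNG preserves more depth precision
```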

donrikk commented 1 month ago

> I like this: "intuitive enough you can use them directly on davinci resolve"

These are some examples from the workflow I've got going:

[attached images: Untitled_2 1 11, Untitled_2 1 7, Untitled_2 1 1, Untitled_2 1 5]

wbhu commented 1 month ago

Thanks for your comments. We used the large model for both Depth-Anything and Depth-Anything-V2.

wbhu commented 1 month ago

> Hello, great work!
>
> Can you please clarify which DepthAnything and DepthAnything-V2 models are used for comparison in Table 1 of the paper? Also, there is no detail on the inference speed of the model. Can you please specify how long it would take to infer 110 frames on a consumer-level GPU? And beyond that, how much extra cost does the devised inference strategy incur for videos longer than 110 frames?
>
> Looking forward to the code release!

Please check our updated README for the inference speed.

Davidyao99 commented 1 month ago

@wbhu I was wondering if you used the metric depth estimation version of Depth-Anything? Additionally, do you know how this model compares to UniDepth v2?

wbhu commented 1 month ago

> @wbhu I was wondering if you used the metric depth estimation version of Depth-Anything? Additionally, do you know how this model compares to UniDepth v2?

No, we used the relative-depth version, as our DepthCrafter also aims at relative depth.
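(For context on what "relative depth" implies for the Table 1 comparison: relative-depth predictions are only defined up to an unknown scale and shift, so benchmarks typically fit a least-squares scale and shift to the ground truth before computing metrics. Below is a minimal sketch of that standard alignment step; the paper's exact evaluation protocol may differ.)

```python
import numpy as np

def align_scale_shift(pred: np.ndarray, gt: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Least-squares fit of scale s and shift t so that s * pred + t ~= gt on valid pixels."""
    p, g = pred[mask], gt[mask]
    A = np.stack([p, np.ones_like(p)], axis=1)      # design matrix [pred, 1]
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)  # solves min ||A @ [s, t] - g||
    return s * pred + t

def abs_rel(pred: np.ndarray, gt: np.ndarray, mask: np.ndarray) -> float:
    """Absolute relative error (AbsRel), a common depth metric, after alignment."""
    aligned = align_scale_shift(pred, gt, mask)
    return float(np.mean(np.abs(aligned[mask] - gt[mask]) / gt[mask]))
```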

TouqeerAhmad commented 1 month ago

@donrikk I am curious about your workflow; are you demonstrating stereo rendering there using depth maps? Other than that, we are in Hogwarts and releasing the snake bred in captivity -- can't say much :)

@wbhu thank you for clarifying this!