BillFSmith / TilingZoeDepth

We need v4, TilingDepth-Anything.. or even better, Upscaled TilingDepth-Anything.. (and TilingMarigold) #8

Open gituser123456789000 opened 5 months ago

gituser123456789000 commented 5 months ago

Depth-Anything looks to be the new standard for accuracy in automated depth maps. On their own, the maps are nice, but I've already tested manual 8x tiling and it was a clearly significant upgrade.. then I tried 8x upscaling followed by 8x tiling and it was an even further improvement.

Doing the tiling manually, though, leaves the result with clearly visible cut lines and non-uniform depth across the whole image (since each tile is treated as a separate image and finds its own depth).
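
One common way to soften cut lines like these is to let neighbouring tiles overlap and cross-fade them with weights that fall off toward each tile's edge. The following is a minimal sketch of that idea with NumPy, assuming the depth for each tile has already been predicted. The names `feather_weights` and `blend_tiles` and the `(y, x, tile)` layout are hypothetical and are not taken from TilingZoeDepth.

```python
import numpy as np

def feather_weights(h, w):
    """Per-pixel weights: largest at the tile centre, smallest at its edges."""
    wy = np.minimum(np.arange(1, h + 1), np.arange(h, 0, -1))
    wx = np.minimum(np.arange(1, w + 1), np.arange(w, 0, -1))
    return np.outer(wy, wx).astype(np.float64)

def blend_tiles(tiles, out_shape):
    """Blend overlapping depth tiles into one map.

    tiles: list of (y, x, depth_tile), where depth_tile is a 2D array whose
    top-left corner sits at row y, column x of the output.
    """
    acc = np.zeros(out_shape, dtype=np.float64)
    weight = np.zeros(out_shape, dtype=np.float64)
    for y, x, tile in tiles:
        h, w = tile.shape
        wts = feather_weights(h, w)
        acc[y:y + h, x:x + w] += tile * wts
        weight[y:y + h, x:x + w] += wts
    # Avoid division by zero where no tile covers a pixel.
    return acc / np.maximum(weight, 1e-12)

# Two 4x6 tiles that overlap by two columns, with deliberately different depth.
left = (0, 0, np.zeros((4, 6)))
right = (0, 4, np.ones((4, 6)))
blended = blend_tiles([left, right], (4, 10))
print(np.round(blended[0], 2))
```

In the overlap the values ramp from the left tile's depth to the right tile's depth instead of jumping. Blending only hides the seam; it does not fix the tiles disagreeing about overall depth.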

The concept needs your program's magic for making the depth map uniform and removing distortions / the fog. These Depth-Anything maps seem very clean to begin with though.. not a lot of noise that needs removing, so I think the results will look fantastic if you create this.

Upscaled Tiled Depth-Anything

Here's the project page: https://github.com/LiheYoung/Depth-Anything

jones873 commented 5 months ago

Would you mind testing your tiling on the depth maps produced by the project linked below? I've tried Depth-Anything and, to be honest, in my opinion it's nowhere near on par with Tiled ZoeDepth (at this moment in time), unless I'm missing something...

https://huggingface.co/spaces/toshas/marigold

gituser123456789000 commented 5 months ago

You want me to test upscaling and splitting an image and running each piece through Marigold to compare the results of Tiled Zoe Depth vs Depth-Anything vs Marigold as far as the level of detail and accuracy they can pull out of an image with further tiling?

I can try that. I found Depth-Anything doesn't seem to do well with tree branches in the picture I've been testing with, but it does very well with other things and doesn't produce the foggy, smeared look of some other depth maps. Clean, smooth, accurate.. but maybe lacking extreme detail. That's where Tiling can help though. I'll likely post examples here in the future.

Check this.. in the Owl3D discord and wherever else the creator of this page goes.. they've been compiling and comparing different depth map model results using a standard image from a game that also let them pull the exact depth from it for reference on what an ideal map should look like for the test image: https://airtable.com/appjWiS91OlaXXtf0/shrlyfglV34siHrqt/tblviBOLphAw5Befd/viw8nzDs2MbSfbnje/reczLuhpToHLi1M26

I haven't tested Marigold, because its result on that site didn't look impressive to me, but I'll test tiling with it and see how it turns out. I'll post the results eventually. Currently trying to get Depth-Anything running with xFormers, which is supposed to make it faster as far as I've read, but it ended up breaking my installation, so I'm trying to figure that out and I'm far from knowing what I'm doing with that.

gituser123456789000 commented 5 months ago

Quick follow-up.. Marigold might actually be extremely impressive with upscaling and tiling. Upscaling alone looks basically the same as the small original test image result. It has the same strange pixelated / dotted look... but Tiling the upscaled image... Wow was my first impression. The tree branches go from that strange dotted look to highly detailed.. each individual branch. I need sleep soon, so I'll do more testing and compiling tomorrow and post the results. Thanks for recommending that I give Marigold tiling a try. The results were completely unexpected.

jones873 commented 5 months ago

Thank you

Here's what I'm getting using their Einstein image. I can see it's a bit noisy, but a simple low smoothing would sort that out.

gituser123456789000 commented 5 months ago

Here's the original test image in Marigold: https://media.discordapp.net/attachments/829513391529656424/1201504191189889164/Marigold_original.png

But there's a huge difference with 8x upscaling and 8x tiling: (in level of detail... but in some cases not in accuracy, like the trailer compared to the wheels or, in the opposite depth direction, the Hummer and its tires.. but it's also not seeing the full image with the individual tiles, so that could likely be fixed by comparing to the original and making it uniform) https://media.discordapp.net/attachments/829513391529656424/1201504191613509673/Marigold_8xUpscaled_8xSplit_scaled_back_down_to_original_resolution_and_inverted_colors.png
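
The idea of "comparing to the original and making it uniform" can be sketched as a least-squares fit. For each tile, find the scale and shift that best map the tile's depth onto the matching crop of a depth map predicted from the whole image. This is only an illustration of that general technique, not a description of how TilingZoeDepth does it. `align_tile` is a hypothetical helper, and it assumes the reference crop has already been resized to the tile's resolution.

```python
import numpy as np

def align_tile(tile_depth, reference_patch):
    """Fit scale and shift so that scale * tile_depth + shift matches reference_patch."""
    x = tile_depth.astype(np.float64).ravel()
    y = reference_patch.astype(np.float64).ravel()
    # Design matrix for the linear model y = scale * x + shift.
    design = np.stack([x, np.ones_like(x)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(design, y, rcond=None)
    return scale * tile_depth + shift

# Synthetic check: the "tile" is the reference stretched to its own 0..1 range,
# which is roughly what happens when a tile is treated as a separate image.
reference = np.linspace(0.2, 0.7, 64).reshape(8, 8)
tile = (reference - reference.min()) / (reference.max() - reference.min())
aligned = align_tile(tile, reference)
print(np.abs(aligned - reference).max())
```

With real model output the fit will not be exact, because the tile contains detail that the low-resolution reference lacks. That leftover difference is the detail tiling is meant to add.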

And here's 8x Upscaled 8x Tiling for Depth-Anything vs Marigold vs Tiled ZoeDepth (Again, to be clear.. this shows potential.. what details these models can pull out of an image. The final results still need BillFSmith's magic program here to make the final result uniform.)

Depth-Anything: https://media.discordapp.net/attachments/829513391529656424/1201506495855071292/Depth-Anything_8xUpscaled_8xSplit_scaled_back_down_to_original_resolution_and_greyscaled.png

Marigold: https://media.discordapp.net/attachments/829513391529656424/1201506496203214968/Marigold_8xUpscaled_8xSplit_scaled_back_down_to_original_resolution_and_inverted_colors.png

Tiled ZoeDepth: https://media.discordapp.net/attachments/829513391529656424/1201506496496795648/TiledZeoDepth_8xUpscaled_8xSplit_scaled_back_down_to_original_resolution.png

From what I see.. Accuracy in true depth seems to clearly go to Depth-Anything, with everything seeming to be on the proper planes of depth (for example the background is actually black, the trailer and its wheels are the same shade so same depth). Depth-Anything is also much cleaner, with no 'fog' / 'haze' or 'smearing' effect, whatever you want to call it.

Tree detail is an impressive win for Marigold.

Fine detail (In general, besides the tree), goes to Tiled ZoeDepth.. but it suffers from tons of 'fog', which would mess with the depth perception. The STOP sign is also too detailed for its own good, as it should be a single plane of color as the other two correctly did.

Interesting how it's a give and take and each model has its strengths and weaknesses. For actually using these models' depth maps to make a 3D movie for example.. I'd have to choose the accuracy and smoothness of Depth-Anything for now.

It will be very interesting to see if BillFSmith can make Tiled Depth-Anything and Tiled Marigold programs.. and potentially more options for upscaling and additional tiling.

jones873 commented 5 months ago

Excellent, thanks for taking the time to test these. At least now we have more options where we can use these different models to get the best depth maps. But I have to say Bill's Tiled ZoeDepth is still my go-to and I hope he continues to update it..

cmore86 commented 5 months ago

I got this up and running locally. I have tried pretty much all the models with the tiling method and dpt_beit_large_512 seems to be the best... well, sometimes. The model here from the Colab is also MiDaS (ZoeD_M12_N.pt), but it is the smaller large model run through the ZoeDepth framework. Depth-Anything did not work well for details at all. Marigold was so-so, but not as good as MiDaS.

But damn, the MiDaS 512 is memory hungry. I am running a 4090 with 64 GB of RAM and a 12-core processor, and when I increase the net width and height past 1500 I either get killed or run out of memory... but I'm finding the higher you go, the better the detail. Sometimes MiDaS knocks it out of the park, other times not so good. I think the Zoe framework is best at guessing the depth. I am trying to figure out how to run the MiDaS 512 through the Zoe framework. On the MiDaS framework the tiling script works great on detail but botches the depth sometimes, and other times nails it. I find myself going back and forth between the Zoe framework and MiDaS.

Another thing that works well is a quick composite in Photoshop of a displacement map (from CrazyBump, PixPlant, Substance Designer, etc.) and the depth map.

A project that would be fun to try is fine-tuning Zoe. I have a ton of models I made over the years, so I can generate a great, true depth map along with the image of each model, but creating the dataset would be a HUGE undertaking. I wonder how it would work fine-tuned with, say, 500 depth maps and their corresponding images, all bas relief.

midaszoe

composite

jones873 commented 5 months ago

Excellent work cmore86

Just tried your 2 main images using my own workflow. They turned out OK-ish, but there's still room for improvement. We should link up to see where this could take us. I'm in the UK....

Screen Shot 02 Untitled

cmore86 commented 5 months ago

Hey jones873, let's link up for sure.

Ha, I'm down such a rabbit hole with depth estimation

jones873 commented 5 months ago

Hi cmore86

I don't know if this is you on Reddit, but you both seem to have the same ideas and PC specs. Maybe you can pick up on something he has tried or pick his brain?

https://www.reddit.com/r/StableDiffusion/comments/18kv89r/test_zoe_depth_vs_midas_depth_spoiler_alert_use/?rdt=56864

cmore86 commented 5 months ago

No, that's not me, but from reading it, kind of the same ideas. Thing is, though, ZoeDepth uses the MiDaS model; my take is it's just a framework.

In terms of detail I tried MiDaS 512 and yes, it produces way better detail, but the depth prediction in terms of bas relief is shit. IMO for bas relief, ZoeDepth running MiDaS is the better choice due to a more accurate overall depth. For detail, add a displacement map on top of the ZoeDepth result and it's pretty good considering it's just a single image.
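
A rough idea of what "displacement map on top of the depth map" can mean numerically: keep only the fine, high-frequency part of the detail map and add a scaled copy of it to the base depth. This is a sketch under that assumption. It is not cmore86's actual Photoshop workflow, and `box_blur`, `add_detail`, `radius` and `strength` are hypothetical names and settings.

```python
import numpy as np

def box_blur(img, radius):
    """Simple box blur using edge padding (no SciPy needed)."""
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    k = 2 * radius + 1
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def add_detail(base_depth, detail_map, radius=2, strength=0.2):
    """Add the high-frequency part of detail_map to base_depth (both in 0..1)."""
    high_pass = detail_map - box_blur(detail_map, radius)
    return np.clip(base_depth + strength * high_pass, 0.0, 1.0)

base = np.linspace(0.0, 1.0, 64).reshape(8, 8)   # smooth overall depth
flat_detail = np.full((8, 8), 0.5)               # a detail map with no texture
rng = np.random.default_rng(0)
textured_detail = rng.random((8, 8))             # a detail map with texture

unchanged = add_detail(base, flat_detail)
textured = add_detail(base, textured_detail)
```

A texture-free detail map leaves the base depth untouched, so the overall depth still comes from the base map and only local relief is added.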

At some point I'm going to try to train or fine-tune with a custom dataset of bas relief depth maps and corresponding images. Ha, when I get time to create the dataset and learn fine-tuning and training.

Things change so fast with AI; it's been fun to follow. I personally don't think one image will ever be as good as an actual model, but who knows.

cassidyow commented 4 months ago

Hey cmore86. Firstly, I've never used Linux before and I'm not a coder, so I don't know much about this stuff. I tried to run the "TilingZoeDepth" local script you shared with WSL2 on Ubuntu and failed, but I was able to run it with Anaconda on Windows. I have a question: I want to try MiDaS 512. Can you tell me how to use MiDaS 512 with the tiling method like in "TilingZoeDepth"? Thanks.

cmore86 commented 4 months ago

pretty much the exact same script but calling in the midas 512L instead of the smaller one

gituser123456789000 commented 3 months ago

@jones873 @cmore86

pretty much the exact same script but calling in the midas 512L instead of the smaller one

How exactly do we call in MiDaS dpt_beit_large_512?

Using the GUI fork for example, which just has ZoeD models...

And how would we call in Depth-Anything? and Marigold? and Metric3D v2? (To add: some or all of these are .pth instead of .pt)
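
One way to think about swapping models is to keep the tiling loop independent of the network and pass the model in as a function. The sketch below assumes a hypothetical `predict(tile)` wrapper that takes an H x W x 3 array and returns an H x W depth array. Loading and preprocessing for MiDaS, Depth-Anything, Marigold or Metric3D differ per project and are not shown here. `tiled_depth` and `fake_model` are illustrative names, and this naive version has no overlap and no depth alignment between tiles.

```python
import numpy as np

def tiled_depth(image, predict, grid=4):
    """Run `predict` on each cell of a grid x grid split and paste the results back."""
    h, w = image.shape[:2]
    ys = np.linspace(0, h, grid + 1, dtype=int)
    xs = np.linspace(0, w, grid + 1, dtype=int)
    out = np.zeros((h, w), dtype=np.float32)
    for y0, y1 in zip(ys[:-1], ys[1:]):
        for x0, x1 in zip(xs[:-1], xs[1:]):
            out[y0:y1, x0:x1] = predict(image[y0:y1, x0:x1])
    return out

def fake_model(tile):
    """Stand-in for a real network: per-pixel mean of the colour channels."""
    return tile.mean(axis=2)

rng = np.random.default_rng(0)
image = rng.random((10, 12, 3))
depth = tiled_depth(image, fake_model, grid=3)
print(depth.shape)
```

Replacing `fake_model` with a wrapper around a real model is where the per-model work lives: input size, normalisation, and whether the output is depth or inverse depth all vary between projects.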

Can we get an update for the GUI fork that includes support for more models?

I like what's been done with the 32bit addition, which seems to produce even higher quality details, as well as having 3 ZoeD models (N, K, and NK) to choose from (although I think N usually produces the better results from what I've tested so far).
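
A likely reason higher bit depth helps is quantization. An 8-bit greyscale image can only store 256 distinct depth levels, so smooth gradients become visible steps. The sketch below counts how many distinct levels survive when a smooth synthetic ramp is stored at 8 and 16 bits. It assumes the "32bit" option refers to saving depth at higher precision than 8 bits, which the thread does not spell out.

```python
import numpy as np

# A smooth synthetic depth ramp with many more samples than 8-bit levels.
depth = np.linspace(0.0, 1.0, 100_000, dtype=np.float32)

# Quantize to 8-bit and 16-bit integer ranges, as an image file would.
as_8bit = np.round(depth * 255).astype(np.uint8)
as_16bit = np.round(depth * 65535).astype(np.uint16)

levels_8 = np.unique(as_8bit).size
levels_16 = np.unique(as_16bit).size
levels_float = np.unique(depth).size

print(levels_8, levels_16, levels_float)
```

The fewer levels available, the more a gentle slope collapses into flat bands, which matters more for 3D conversion and relief work than for viewing the map as a picture.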

Lastly, can you add hierarchical resolution options to the GUI fork? If I'm using that terminology correctly..

gituser123456789000 commented 3 months ago

Also to add, some new recent models I see: https://depthfm.github.io/ https://fuxiao0719.github.io/projects/geowizard/

DepthFM and Geowizard

jones873 commented 3 months ago

Not too sure how @cmore86 has done it, but he has managed to create his own depth map script and it's definitely the best I have seen so far. He has gone a bit quiet, though; I think he is in the process of improving it.. He has sent me some depth maps to try, but I don't know if he wants to share them at the moment. So we will all have to wait until he is ready, but I can guarantee it will be worth the wait..

gituser123456789000 commented 3 months ago

Interesting.. I look forward to seeing

gituser123456789000 commented 3 months ago

Follow-up on DepthFM (direct from the project page) and Geowizard. I got no great results from DepthFM. Maps were hit or miss, not consistently good looking.. and the ones that did look good didn't perform well when actually converted to 3D. They were inaccurate or unimpressive compared to Tiling, Geowizard and Depth-Anything.

Geowizard I'm getting good results with. I've only had it for a day and was testing with DepthFM first..

Geowizard is consistently good, but it's a different setup. You choose how many steps you process it for and it gets more and more accurate and detailed in general, until I'm sure there are diminishing returns. You can also adjust the processing resolution (going too high will eventually produce error messages due to memory limitations).

My impressions so far are that Depth-Anything is the most accurate from front to back, but doesn't have detail in the far back of the scene, since it's completely black usually, everything in the far background is flat. Everything before that though is correct and accurate.

Tiling (currently only ZoeDepth) gives the best details. Very impressive details, especially with 32bit mode, but ZoeDepth inaccuracies, the middle bulge effect, etc. can sometimes be seen.

Geowizard first impressions seem to be second place to both of these. It can be accurate, but maybe not as accurate as Depth-Anything (and if it does become more accurate, it's with a lot more time at high settings)... And details can be good, but maybe not as good as TilingZoeDepth.. Depending from detail to detail.. background features may be better and more accurate, depending on the content.. but Tiling seems unmatched with pulling out facial features for example. TilingZoeDepth is not always perfect though, there can still be facial distortions.

Still waiting on more Tiling models. People say they can do it, but nobody is sharing how to or adding it to the GUI yet. That will be exciting to see once new models are added to the Tiling GUI, so we're not just stuck with ZoeDepth forever as the only model being Tiled.

jones873 commented 3 months ago

I just remembered he sent me this before he sent me some depth maps, which I'm sure he won't mind me sharing, and I think he was still fine-tuning his dataset.

Cmores86 Examples

jones873 commented 3 months ago

Well, cmore86 finally got back to me. It seems his wife has got him busy doing house renovations and he has not had time for the project; most of us have been there! Anyway, here are some of the depth maps he has sent me. Sorry for the watermarks, but otherwise there's no reason why these couldn't be exported to STL files and sold on marketplaces without consent (aimed solely at the lurkers). cpg0rxe2jomc1 f05dhye2jomc1 h7hzwxe2jossmc1 vuwvmexbkymc1

cmore86 commented 3 months ago

jones873 - I've got to get my shop set up now, and then I'll be back on the programming.... life always has a way of getting in the way

gituser123456789000 commented 3 months ago

@jones873

I just remembered he sent me this before he sent me some depth maps, which I'm sure he won't mind me sharing, and I think he was still fine-tuning his dataset.

Cmores86 Examples

This AI depth model doesn't line up with the original image though. Every detail about it is noticeably different...

The hair is different, the lines/wrinkles in the face, the eyes, the teeth, the nose, the ears are bigger, the chin, the beard, the clothes... everything.

It's a similar caricature, but not exactly true to the original source

gituser123456789000 commented 3 months ago

@jones873 @cmore86 It looks like Owl3D has implemented an improved version of TiledZoeDepth_N as their current Ultra model.

The details it pulls out are the same, but it's not as overly intense with the brightness as TilingZoeDepth and the GUI are. And it looks like the gradient fades to black better.. similar to Depth-Anything.. so it has accurate depth from front to back and eliminates the 'bulge' effect as I call it in the mid-range of a 2Dto3D converted result.

gituser123456789000 commented 3 months ago

Well, cmore86 finally got back to me. It seems his wife has got him busy doing house renovations and he has not had time for the project; most of us have been there! Anyway, here are some of the depth maps he has sent me. Sorry for the watermarks, but otherwise there's no reason why these couldn't be exported to STL files and sold on marketplaces without consent (aimed solely at the lurkers). cpg0rxe2jomc1 f05dhye2jomc1 h7hzwxe2jossmc1 vuwvmexbkymc1

The details and depth look nice. I don't know how this compares to other models though. I don't do this type of content with the depth maps.. just 2D to 3D photo and video conversion. If this translates well for use with photos and video.. it seems interesting.

cmore86 commented 3 months ago

Well, cmore86 finally got back to me. It seems his wife has got him busy doing house renovations and he has not had time for the project; most of us have been there! Anyway, here are some of the depth maps he has sent me. Sorry for the watermarks, but otherwise there's no reason why these couldn't be exported to STL files and sold on marketplaces without consent (aimed solely at the lurkers). cpg0rxe2jomc1 f05dhye2jomc1 h7hzwxe2jossmc1 vuwvmexbkymc1

The details and depth look nice. I don't know how this compares to other models though. I don't do this type of content with the depth maps.. just 2D to 3D photo and video conversion. If this translates well for use with photos and video.. it seems interesting.

Yeah, mine is generative and not the exact photo. The things I am working on are just for CNC stuff... high detail for an STL model.
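
For readers wondering how a depth map becomes an STL model, the basic step is to treat each pixel as a height and connect neighbouring pixels with triangles. The sketch below writes only the top surface as ASCII STL using the standard library. It is an illustration, not cmore86's pipeline. `heightmap_to_stl`, `xy_scale` and `z_scale` are hypothetical names, and a real CNC workflow would normally also need sides and a base so the mesh is a closed solid.

```python
def heightmap_to_stl(height, xy_scale=1.0, z_scale=1.0, name="relief"):
    """Return an ASCII STL string for the top surface of a 2D height grid.

    height: list of rows, each a list of numbers (for example depth in 0..1).
    Only the top surface is emitted, so the mesh is not watertight.
    """
    rows, cols = len(height), len(height[0])

    def vertex(r, c):
        return (c * xy_scale, r * xy_scale, height[r][c] * z_scale)

    def normal(a, b, c):
        # Unit normal from the cross product of two triangle edges.
        ux, uy, uz = (b[i] - a[i] for i in range(3))
        vx, vy, vz = (c[i] - a[i] for i in range(3))
        n = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
        length = sum(k * k for k in n) ** 0.5 or 1.0
        return tuple(k / length for k in n)

    lines = [f"solid {name}"]
    for r in range(rows - 1):
        for c in range(cols - 1):
            v00, v01 = vertex(r, c), vertex(r, c + 1)
            v10, v11 = vertex(r + 1, c), vertex(r + 1, c + 1)
            # Two triangles per grid cell, wound so the normals point up (+z).
            for tri in ((v00, v01, v11), (v00, v11, v10)):
                n = normal(*tri)
                lines.append(f"  facet normal {n[0]:e} {n[1]:e} {n[2]:e}")
                lines.append("    outer loop")
                for v in tri:
                    lines.append(f"      vertex {v[0]:e} {v[1]:e} {v[2]:e}")
                lines.append("    endloop")
                lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"

flat = [[0.0, 0.0, 0.0] for _ in range(3)]   # a 3x3 grid gives 2x2 cells
stl_text = heightmap_to_stl(flat)
print(stl_text.count("facet normal"))
```

A full-resolution depth map produces two triangles per pixel cell, so downsampling or a binary STL writer is usually worth considering before sending the result to CAM software.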