CompVis / depth-fm

DepthFM: Fast Monocular Depth Estimation with Flow Matching
MIT License

Results converted to 3D look very inaccurate #15

Open gituser123456789000 opened 7 months ago

gituser123456789000 commented 7 months ago

Results converted to 3D look very inaccurate, unfortunately.

Attached is one of our standard test images, converted to 3D using 3DCombine with its default settings (0.20, yes, yes).

Many things appear to be on the wrong plane. The rocks/crystals in the foreground are floating above the ground. The character also appears to be floating above the ground.

It's rough on the eyes.

On other images, some outputs are poor and unusable. Other conversions suffer from the 'bulge effect', where the mid-range bulges forward. It looks like a hill where it should be flat.

Some depth maps look fantastic, but they don't perform when actually put to use.

[Images: pandora anaglyph, pandora crosseye, pandora SBS]

These were converted using 'binary' colormap output, equivalent to --no_color output that's been inverted to white front / black back...
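For anyone trying to reproduce the input, here is a minimal sketch of that inversion step. It assumes the grayscale depth map is already loaded as a NumPy array and that larger values mean farther away, as the comment above implies. The function name `to_near_white` is hypothetical and is not part of DepthFM or 3DCombine.

```python
import numpy as np

def to_near_white(depth):
    """Normalize a depth map to [0, 1] and invert it.

    Assumption: larger input values are farther from the camera.
    Output: 1.0 (white) is nearest, 0.0 (black) is farthest.
    """
    depth = np.asarray(depth, dtype=np.float64)
    lo, hi = depth.min(), depth.max()
    if hi == lo:
        # Constant map: no depth ordering to preserve.
        return np.zeros_like(depth)
    normalized = (depth - lo) / (hi - lo)
    return 1.0 - normalized

# Tiny example: the smallest depth becomes white, the largest black.
example = to_near_white([[1.0, 2.0], [3.0, 5.0]])
```

If the conversion tool expects the opposite convention (black front / white back), skip the final `1.0 - normalized` step.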

Here is the original image if you don't believe the results. If you can get better results, I'd be glad to learn how to use the program better, but so far, I'm seeing many inaccuracies.

[Image: DAK_Pandora_LowAngleNight_DC_v04MerchHD]

gituser123456789000 commented 7 months ago

Here is TiledZoeDepth for comparison, using the 32-bit ZoeD_N model.

In this specific image, your model has more background pop effect, but is much less accurate overall.

[Images: DAK_Pandora_LowAngleNight_DC_v04MerchHD_TiledZoeDepth N anaglyph, DAK_Pandora_LowAngleNight_DC_v04MerchHD_TiledZoeDepth N cross, DAK_Pandora_LowAngleNight_DC_v04MerchHD_TiledZoeDepth N SBS]

Anaglyph, cross-eyed, and SBS versions attached.

gituser123456789000 commented 7 months ago

Here is Depth-Anything: the easiest on the eyes and the most accurate from front to back. The only downside is that the very back is flat and lacks detail.

But see how everything is very smooth and on the correct plane. There's no mid-range bulge (not applicable to this particular DepthFM image, but it does occur in others), the rocks are not floating in the air, and the character is not floating in the air.

[Image: 3dcombineDEPTHANYTHINGanaglyph]

*These are best viewed full-screen, on a decent-sized screen.

Shanzhaguoo commented 7 months ago

Is this predicted depth the true depth? How can I derive 3D coordinates from this depth map?
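One common way to approach the second question is pinhole back-projection, sketched below. This is an assumption-laden sketch and not DepthFM's own code. It assumes the depth values are already metric, or have been aligned to metric scale, and that the camera intrinsics `fx`, `fy`, `cx`, `cy` are known. As far as I understand, monocular estimators of this kind typically predict relative depth up to an unknown scale and shift, so the raw output would generally not give true 3D coordinates without that alignment. The function name `backproject` is hypothetical.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map to camera-space 3D points (pinhole model).

    Assumptions: depth[v, u] is the metric distance along the optical axis,
    and fx, fy, cx, cy are the camera intrinsics in pixels.
    Returns an (H, W, 3) array of (X, Y, Z) coordinates.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

# Tiny example: a flat 2x3 depth map at Z = 2 with made-up intrinsics.
points = backproject(np.full((2, 3), 2.0), fx=1.0, fy=1.0, cx=1.0, cy=0.5)
```

With relative depth, the same formula still produces a point cloud, but its proportions along the viewing direction depend on the unknown scale and shift.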

pagepal666 commented 3 months ago

How can a 3D reconstruction loss be included in the overall loss during training?